serengil / retinaface

RetinaFace: Deep Face Detection Library for Python
https://www.youtube.com/watch?v=Wm1DucuQk70&list=PLsS_1RYmYQQFdWqxQggXHynP1rqaYXv_E&index=3
MIT License

Is there a way to increase the GPU usage/overall performance, possibly with batch input? #105

Closed · grishagr closed this issue 2 months ago

grishagr commented 2 months ago

Description

I am running RetinaFace with TensorFlow 2.15 (I had issues with newer versions) on a Mac M2 (8 CPU cores and 8 GPU cores). I tried to use concurrent.futures to parallelize face extraction, but the program crashes even when processing only 2 images at the same time. concurrent.futures works when using only the CPU, but right now I am processing images on the GPU one by one, and that still outperforms 8 parallel CPU threads. However, Activity Monitor shows the GPU usage peaking at around 60%, so I feel it could be even faster. Thanks!
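For reference, a minimal sketch of the two approaches being compared (sequential calls on whatever device TensorFlow picks vs. a thread pool); the file paths and worker count are placeholders, and `RetinaFace.extract_faces` is assumed to be the call being parallelized:

```python
# Sketch only: compares sequential processing with a thread pool.
# Image paths and max_workers are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor
from retinaface import RetinaFace

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]

# Sequential: each call runs on the device TensorFlow selected (GPU if visible).
results_sequential = [RetinaFace.extract_faces(img_path=p) for p in image_paths]

# Thread pool: the calls share one TensorFlow runtime, so on a single GPU they
# largely serialize and can crash if GPU memory is oversubscribed.
with ThreadPoolExecutor(max_workers=2) as pool:
    results_parallel = list(
        pool.map(lambda p: RetinaFace.extract_faces(img_path=p), image_paths)
    )
```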

Additional Info

No response

serengil commented 2 months ago

please raise tickets for enhancement requests under this label. you should ask your questions on stackoverflow, as mentioned in the issue guidelines.

Raghucharan16 commented 2 months ago

@serengil As of TensorFlow 2.1, development of tensorflow-gpu stopped and GPU support was merged into the main tensorflow package, so how can we leverage the GPU with plain tensorflow now? Do we need to port the code to CUDA or something?
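A short sketch of how one might check that the unified tensorflow package actually sees a GPU; no code changes to RetinaFace itself should be needed, since TensorFlow places ops on a visible GPU automatically (on Apple silicon this assumes the tensorflow-metal plugin is installed, on NVIDIA the matching CUDA/cuDNN libraries):

```python
# Sketch: verify GPU visibility in TensorFlow 2.x.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)

# Optional: log which device each op is placed on, to confirm GPU execution.
tf.debugging.set_log_device_placement(True)
```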