Open drakorg opened 4 years ago
You can try the 'hog' model instead. It might be faster, but I am not entirely sure.
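For reference, the 'hog'/'cnn' choice above maps to the `model` parameter of `face_recognition.face_locations`. A minimal sketch (the `pick_model` helper is hypothetical, not part of the library):

```python
def pick_model(use_gpu):
    """'cnn' runs face detection on the GPU via dlib/CUDA; 'hog' is the CPU-only fallback."""
    return "cnn" if use_gpu else "hog"

# Hypothetical usage with the real face_recognition API:
# import face_recognition
# image = face_recognition.load_image_file("frame.jpg")
# boxes = face_recognition.face_locations(image, model=pick_model(use_gpu=False))
```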
Hi, no, you missed the point.
I'm not looking for alternatives to the cnn detector. I'm trying to figure out why the face_recognition library, which uses dlib under the hood (compiled with CUDA support, which is present, enabled, and even shown as active by the GPU monitoring tool jtop while my app runs), takes 2 full seconds per frame on JetPack 4.4, when on JetPack 4.3 the same input takes 500 ms with, as far as I can tell, an identical setup.
Since I posted the original question I went back to JetPack 4.3, and as I said, throughput is exactly as expected: 500 ms per frame. Same input image, same application, same configuration.
If there were no GPU activity on 4.4, that would explain the slowdown; but in fact the GPU is busy (100%), and processing still takes 4x as long as on 4.3. I'm baffled and trying to find an explanation for it.
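A simple way to pin down numbers like the 500 ms vs 2000 ms above is to average several calls rather than timing a single frame. A minimal sketch; the `time_call` helper is hypothetical, and the commented usage assumes the real face_recognition API:

```python
import time

def time_call(fn, *args, repeats=5, **kwargs):
    """Average wall-clock time of fn over several calls, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args, **kwargs)
    return (time.perf_counter() - start) * 1000.0 / repeats

# Hypothetical usage against face_recognition:
# import face_recognition
# img = face_recognition.load_image_file("frame_1280x720.jpg")
# ms = time_call(face_recognition.face_locations, img, model="cnn")
# print(f"cnn detector: {ms:.0f} ms/frame")
```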
Same issue for me
Hi @mariusmotea, did you have any luck identifying the cause of the increased processing time?
I just followed the tutorial from here. Although the tutorial is not that old, it recommends JetPack 4.2.
Description
Hi. I've been using the face_recognition library for some time on a Jetson Nano with JetPack 4.3, getting around 500 ms per frame on a 1280 x 720 image with the cnn model, with CUDA support in dlib, and everything working great.
Last night I decided to try JetPack 4.4, and everything went fine until I saw the performance of the running process: around 2000 ms per frame, with the very same setup as before.
The first thing I suspected was that dlib, for some reason, might not have been compiled with CUDA support, but no, that was not the problem, as you can see below.
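The usual way to verify a CUDA build of dlib from Python is via `dlib.DLIB_USE_CUDA` and `dlib.cuda.get_num_devices()`, both part of dlib's real Python API. A minimal sketch (the `cuda_status` formatting helper is hypothetical):

```python
def cuda_status(use_cuda, num_devices=0):
    """One-line summary of dlib's CUDA build status."""
    if use_cuda:
        return f"dlib built with CUDA, {num_devices} device(s) visible"
    return "dlib built WITHOUT CUDA (CPU only)"

try:
    import dlib  # attributes below exist in dlib's Python API
    print(cuda_status(dlib.DLIB_USE_CUDA,
                      dlib.cuda.get_num_devices() if dlib.DLIB_USE_CUDA else 0))
except ImportError:
    print(cuda_status(False))
```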
Not only that: using jtop I can verify that GPU usage jumps to almost 100% as soon as the model runs, meaning the GPU is actually being used. Yet each frame still takes around 2 full seconds, a lot compared to the 500 ms I was getting just yesterday on JetPack 4.3.
I've run out of ideas on where to look for the problem. Any ideas?
Thank you.