Open fottofatto opened 6 years ago
I think you need an SSD.
Do you think the job is disk-bound? Could it be related to dlib? Are you saying that with an SSD, a single process would match the throughput I currently get from several? What I want is maximum utilization. Thanks for your help.
Small-file I/O is about ten times faster on an SSD. Check the iowait value in the "top" command and make sure it is not too high.
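For anyone who prefers a number over watching "top", here is a minimal sketch that computes the iowait share from two samples of the aggregate "cpu" line in /proc/stat. It assumes a Linux system; the function names are made up for this example.

```python
import time


def iowait_fraction(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two 'cpu ...' lines of /proc/stat."""
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    # Fields: user nice system idle iowait irq softirq steal (guest fields are
    # already counted inside user/nice, so only the first eight are summed).
    deltas = [y - x for x, y in zip(b, a)][:8]
    total = sum(deltas)
    return deltas[4] / total if total else 0.0


def read_cpu_line() -> str:
    """Return the aggregate 'cpu' line (Linux only)."""
    with open("/proc/stat") as f:
        return f.readline()


def sample_iowait(interval: float = 1.0) -> float:
    """Measure the iowait share over `interval` seconds."""
    first = read_cpu_line()
    time.sleep(interval)
    return iowait_fraction(first, read_cpu_line())


# Usage while the detection job is running:
#   print(f"iowait: {sample_iowait():.1%}")
```

A value that stays high while the job runs would suggest the process is waiting on the disk rather than on the GPU.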
System properties
Memory: 32 GB
CPU: Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz
OS: Ubuntu 16.04.5 LTS 64-Bit
GPU: GTX1080 Ti 11 GB
NVIDIA-SMI 410.48
Driver Version: 410.48
Dlib Version: 19.15.0
Face Recognition Version: 1.2.3
Photo size: 240x320
Description
I have a bunch of photos for which I want to compute encodings and then measure distances against new photos. I am using the batch face detection functions. With the cnn model on the GPU, detecting faces in 4096 photos took 32 seconds, which is 128 photos per second. I am sure dlib is using CUDA and the GPU: the nvidia-smi output is below and dlib.DLIB_USE_CUDA is True. I removed the hog parts from the code to simplify it. The batch size is 32. Raising the batch size to 128 makes almost no difference in speed; only the GPU memory usage increases.
If I run several instances of this code simultaneously, each on a different photo directory (say five separate processes in different tabs), the total throughput scales up to almost 300 photos per second.
So my question is: is 128 photos per second (for 240x320 photos on this setup) the maximum I can get? How can I reach 300 photos per second with a single process? In my opinion I should be able to reach that performance with one instance, but I don't know how. Which parameters should I change?
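One possible explanation for the multi-process speedup is that a single process loads a batch, then detects, then loads again, so the GPU sits idle during loading. The sketch below is only an illustration of overlapping the two steps inside one process, not a confirmed fix: it loads batch N+1 on worker threads while the detector runs on batch N. The names `batched` and `detect_pipelined` are hypothetical, and the loader and detector are passed in as parameters. Whether threads help depends on how much of the loading and detection work releases the Python GIL, which I have not verified for dlib.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def detect_pipelined(paths, load_image, detect_batch, batch_size=32, workers=4):
    """Run detect_batch on one batch while the next batch loads in the background."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = None  # futures for the batch that is currently loading
        for chunk in batched(paths, batch_size):
            # Queue loading of this batch before detecting on the previous one.
            futures = [pool.submit(load_image, p) for p in chunk]
            if pending is not None:
                results.extend(detect_batch([f.result() for f in pending]))
            pending = futures
        if pending is not None:  # last batch
            results.extend(detect_batch([f.result() for f in pending]))
    return results


# Possible wiring with the face_recognition package (assumed, untested here):
#   import face_recognition
#   locations = detect_pipelined(
#       paths,
#       face_recognition.load_image_file,
#       lambda imgs: face_recognition.batch_face_locations(
#           imgs, number_of_times_to_upsample=0, batch_size=len(imgs)),
#   )
```

Results come back in the same order as `paths`. If throughput does not improve, the loading step is probably not the bottleneck, or the GIL is serializing the work, in which case loading in separate processes may be worth trying.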
What I Did
Here is my code:
Code Output:
nvidia-smi Output:
GPU memory usage starts at 195 MB, climbs to 2005 MB, and then the run finishes.
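Memory usage alone does not show how busy the GPU is. A small sketch for polling utilization alongside memory through nvidia-smi's CSV query mode follows; the helper names are invented for this example, and it assumes the driver reports numeric values for both fields (some GPUs report "[N/A]", which this parser does not handle).

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(text):
    """Parse lines such as '37, 2005' into one dict per GPU."""
    stats = []
    for line in text.strip().splitlines():
        util, mem = (int(field) for field in line.split(","))
        stats.append({"util_pct": util, "mem_used_mib": mem})
    return stats


def read_gpu_stats():
    """Run nvidia-smi once and return the parsed stats (requires an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(out)


# Usage while the detection job is running:
#   print(read_gpu_stats())
```

If utilization stays well below 100% with one process and rises with five, that would point to the GPU waiting on the host side rather than to a limit of the card itself.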