davisking / dlib

A toolkit for making real world machine learning and data analysis applications in C++
http://dlib.net
Boost Software License 1.0

Speed using cnn_face_detector #891

Closed christinabo closed 6 years ago

christinabo commented 7 years ago

Hello community! First, I would like to thank you for this helpful library.

I'm using the cnn_face_detector with Python 3.5.2, but it takes 48-60 seconds to perform face detection on a single image. Is this a normal speed for the CNN implementation? My images are 2592x1936. I'm also calling cnn_face_detector(image, 0) with 0 instead of 1 to avoid upscaling (I hope that makes sense).

I'm using Ubuntu 16.04 and I tried to compile dlib using avx instructions and Cuda. My steps were:

  1. Download the dlib repo
  2. Run: python setup.py install --yes USE_AVX_INSTRUCTIONS --yes DLIB_USE_CUDA
  3. Move my .py file into python-examples and run it there

No errors/problems with these. Are these steps sufficient to compile it and run my python code with avx/cuda? I feel that I might be missing something, as running my code there drops the time to 46-60 seconds, just by two seconds, which could actually be a chance event.
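To tell a real speed-up from run-to-run noise (the "chance event" concern above), it can help to time the detector several times and keep the best run. The following is a minimal sketch, not part of the original post: `time_call` and `time_cnn_detector` are hypothetical helper names, and the model and image paths are placeholders supplied by the caller.

```python
import time


def time_call(fn, *args, repeats=3):
    """Return the best wall-clock time in seconds of fn(*args) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def time_cnn_detector(model_path, image_path, repeats=3):
    """Time dlib's CNN face detector on one image (assumes dlib is installed)."""
    import dlib  # imported lazily so time_call stays usable without dlib

    detector = dlib.cnn_face_detection_model_v1(model_path)
    image = dlib.load_rgb_image(image_path)
    # The second argument 0 means "do not upsample", as in the post above.
    return time_call(detector, image, 0, repeats=repeats)
```

Taking the best of a few runs also hides one-off start-up costs, such as GPU initialisation on the first call, that would otherwise inflate a single measurement.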

Thank you in advance for any help!

davisking commented 7 years ago

You should either install the Intel MKL (or OpenBLAS) to run on the CPU or install CUDA. You definitely aren't using CUDA if you are getting this speed. Read the output of the install step. It will tell you what's happening and say something about how it didn't find CUDA because it's not installed or something to that effect.

christinabo commented 7 years ago

I have CUDA installed and it's found by the system. However, OpenBLAS was not installed, and once I installed it (and re-ran the python setup), the time dropped to 16 secs/image. I guess this is a significant difference. Thank you for helping!

davisking commented 7 years ago

It's obviously not using CUDA. Read the output of the python setup step. It will say it's not using CUDA. That's what you need to address.

zzw1123 commented 7 years ago

@davisking How can I check whether the program is using CUDA? And how can I tell the program which GPU to run on?
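One possible way to answer both questions from Python is sketched below. It assumes a dlib build that exposes the `DLIB_USE_CUDA` attribute and the `dlib.cuda.get_num_devices()` function, as recent Python builds are understood to do; older versions may lack them, so the lookups are written defensively. `describe_cuda_support` and `select_gpu` are hypothetical helper names. `CUDA_VISIBLE_DEVICES` is the standard CUDA environment variable for restricting which GPUs a process can see.

```python
import os


def select_gpu(index):
    """Restrict this process to one GPU; call this before importing dlib."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


def describe_cuda_support(dlib_module):
    """Report whether a dlib build was compiled with CUDA and how many GPUs it sees."""
    compiled = bool(getattr(dlib_module, "DLIB_USE_CUDA", False))
    info = {"compiled_with_cuda": compiled, "num_devices": 0}
    cuda = getattr(dlib_module, "cuda", None)  # assumption: absent in older builds
    if compiled and cuda is not None:
        info["num_devices"] = cuda.get_num_devices()
    return info
```

A typical use would be to call `select_gpu(0)`, then `import dlib`, then print `describe_cuda_support(dlib)`. If `compiled_with_cuda` is false, the build itself did not enable CUDA, regardless of what is installed on the machine.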

zzw1123 commented 7 years ago

@christinabo Could you please tell me why you moved your .py file into python-examples and ran it there? I have the same problem, even though I installed CUDA successfully, so I don't know how to increase the speed or how to use CUDA. Thanks a lot!

christinabo commented 7 years ago

Hi @zzw1123, first of all just to note that this is my very first try to work with cuda, so not sure exactly how all this works. Can I ask, what is the speed you achieve on your images? Previously, I've installed dlib with pip but as far as I understand to use it with cuda, you need to compile it by yourself. I imagine that this is kind of a local installation and that the example files should work with it, so I just moved it there for testing purposes. I guess there should be a way to install it for the whole environment though.

When I run the setup, I get this message:

Found CUDA: /usr/local/cuda (found suitable version "9.0", minimum required is "7.5")

so I assume cuda is here. I don't really understand where to look. So, if I don't have cuda according to @davisking, what made the speed drop from 48secs to 16secs? And what is the expected speed I should have?

davisking commented 7 years ago

What else does it say? Post the entire output. It should contain some very clear language telling you about its use of CUDA.
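When the build output is long, one way to find the relevant lines is to save it to a file and filter for mentions of CUDA or cuDNN. This is a small sketch with a hypothetical helper name, `cuda_lines`; the sample text reuses lines quoted elsewhere in this thread rather than a complete real log.

```python
def cuda_lines(build_output):
    """Return the lines of a build log that mention CUDA or cuDNN (case-insensitive)."""
    keywords = ("cuda", "cudnn")
    return [
        line.strip()
        for line in build_output.splitlines()
        if any(word in line.lower() for word in keywords)
    ]


# Sample input assembled from lines quoted in this thread.
sample = (
    'Found CUDA: /usr/local/cuda (found suitable version "9.0", minimum required is "7.5")\n'
    "Processing dependencies for dlib==19.7.99\n"
    "Finished processing dependencies for dlib==19.7.99\n"
)

for line in cuda_lines(sample):
    print(line)
```

Note that a "Found CUDA" line only shows that the toolkit was located; the lines that follow it in the real output are the ones that say whether dlib will actually be built with CUDA.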

zzw1123 commented 7 years ago

@davisking After I type the command "python setup.py install --yes DLIB_USE_CUDA", it seems that CUDA is installed successfully, and the end of the output shows the following:

Installed /home/zzw/.local/lib/python2.7/site-packages/dlib-19.7.99-py2.7-linux-x86_64.egg
Processing dependencies for dlib==19.7.99
Finished processing dependencies for dlib==19.7.99

I use dlib for a face_recognition project whose model is dlib.cnn_face_detection_model_v1. The code is at https://github.com/ageitgey/face_recognition/blob/master/examples/find_faces_in_batches.py but, as I posted before, the speed is slow and I don't know whether CUDA is working. The code at the link above also doesn't print anything about CUDA being used, which confuses me. Can you please help me?

zzw1123 commented 7 years ago

@christinabo Did the time for your code drop from 48s to 16s after OpenBLAS was installed? Mine may be about 2 fps in terms of speed, but I haven't measured it precisely with any timing functions.

zzw1123 commented 7 years ago

@davisking @christinabo Oh, I used the command "htop" to check whether the program is running on our server, and I can see my program running there, so CUDA is being used. Thanks anyway!
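Note that htop lists CPU processes only, so seeing the program there does not by itself show GPU use. On machines with the NVIDIA driver tools installed, `nvidia-smi` reports per-GPU utilisation and memory. The sketch below wraps it in a hypothetical helper, `gpu_usage`, and returns `None` when the tool is missing or fails.

```python
import shutil
import subprocess


def gpu_usage():
    """Return nvidia-smi's per-GPU utilisation report as text, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # NVIDIA driver tools are not on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=index,utilization.gpu,memory.used", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print(gpu_usage() or "nvidia-smi not available")
```

Running this while the detector is working should show non-zero utilisation and memory use on the GPU in question if CUDA is really in use.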

christinabo commented 7 years ago

@davisking Here is the whole output

@zzw1123 Yes, it happened after installing OpenBLAS. My output is similar to yours.

davisking commented 7 years ago

Oops, the error messages from cmake aren't printing in your case. Try pulling the latest dlib from github. It should give a much clearer picture of what's going on :)

dlib-issue-bot commented 6 years ago

Warning: this issue has been inactive for 320 days and will be automatically closed on 2018-09-07 if there is no further activity.

If you are waiting for a response but haven't received one it's likely your question is somehow inappropriate. E.g. you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's documentation, or a Google search.

doublefish23 commented 6 years ago

@zzw1123, right now I am also struggling with offloading computations from the CPU to the GPU. However, as I was looking through the source code, it appeared to me that GPU computations are not supported for processing in batches. In other words, when you are doing face recognition on a video file, you have to do it on the CPU for now.

davisking commented 6 years ago

You can definitely use a GPU for video processing.
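As an illustration of how video frames might be fed to the detector in batches, here is a minimal sketch. It assumes the detector is callable as `detector(list_of_images, upsample, batch_size=...)` and returns one result per image, which is how dlib's Python `cnn_face_detection_model_v1` is understood to behave when all frames in the list have the same size. `batched` and `detect_in_batches` are hypothetical helper names.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def detect_in_batches(detector, frames, batch_size=32, upsample=0):
    """Run `detector` over `frames` one batch at a time; return one result per frame."""
    results = []
    for chunk in batched(frames, batch_size):
        # Assumption: the detector accepts a list of same-sized images
        # and a batch_size keyword, and returns one entry per image.
        results.extend(detector(chunk, upsample, batch_size=len(chunk)))
    return results
```

Smaller batches use less GPU memory, while larger ones usually keep the GPU busier, so the batch size is something to tune for the card and frame size at hand.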

dlib-issue-bot commented 6 years ago

Warning: this issue has been inactive for 20 days and will be automatically closed on 2018-10-14 if there is no further activity.

dlib-issue-bot commented 6 years ago

Warning: this issue has been inactive for 28 days and will be automatically closed on 2018-10-14 if there is no further activity.

tapas commented 6 years ago

@davisking What about using the face detector on the first frame and a correlation tracker (for further face detection) on the remaining frames?

davisking commented 6 years ago

Yes, you could do something like that. Or just don't run every frame, or use face landmarks between frames for a little while. There are many options.
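The "detect occasionally, track in between" idea could be sketched as follows. The loop itself is library-agnostic: `detect` and `start_tracker` are caller-supplied callbacks, and every function name here is hypothetical. `make_dlib_tracker` shows one way to back it with `dlib.correlation_tracker`, assuming dlib is installed and `box` is a `dlib.rectangle` (for the CNN detector, the `.rect` of a detection).

```python
def detect_or_track(frames, detect, start_tracker, detect_every=10):
    """Yield one box per frame, calling `detect` only every `detect_every` frames."""
    tracker = None
    for index, frame in enumerate(frames):
        if tracker is None or index % detect_every == 0:
            box = detect(frame)  # expensive: full detector
            tracker = start_tracker(frame, box) if box is not None else None
        else:
            box = tracker(frame)  # cheap: follow the last detection
        yield box


def make_dlib_tracker(frame, box):
    """Start a dlib correlation tracker on `box`; return a per-frame update function."""
    import dlib  # imported lazily so detect_or_track works without dlib

    tracker = dlib.correlation_tracker()
    tracker.start_track(frame, box)

    def step(next_frame):
        tracker.update(next_frame)
        return tracker.get_position()

    return step
```

This version follows a single face. If detection finds nothing, the detector simply runs again on the next frame until a face appears.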

dlib-issue-bot commented 6 years ago

Warning: this issue has been inactive for 30 days and will be automatically closed on 2018-11-13 if there is no further activity.

dlib-issue-bot commented 6 years ago

Notice: this issue has been closed because it has been inactive for 36 days. You may reopen this issue if it has been closed in error.