HOG detect face is more accurate than CNN

davisking / dlib

A toolkit for making real world machine learning and data analysis applications in C++

http://dlib.net

Boost Software License 1.0


HOG detect face is more accurate than CNN #878

Closed. starzxzx closed this issue 7 years ago

starzxzx commented 7 years ago

When I test both the HOG and CNN face detectors on the picture at the link below, the CNN detects the hand as a face, while HOG gets it right. The face region detected by the CNN also leaves out the chin.

https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec=1507896274082&di=824f7f59943a71e2e9904d22175ce92c&imgtype=0&src=http%3A%2F%2Fwww.moontalk.com.tw%2Fupload%2Fimages%2F20160606angelina-03.jpg

davisking commented 7 years ago

When I run the CNN detector on that image it works perfectly. What CNN are you talking about? I ran this example program on that image http://dlib.net/dnn_mmod_face_detection_ex.cpp.html.

starzxzx commented 7 years ago

I run the detector in a Python environment. I installed dlib via pip, and the version is 19.7.0. The same problem occurs when I call cnn_face_detector. I call dlib through this project: https://github.com/ageitgey/face_recognition/blob/master/examples/find_faces_in_picture_cnn.py

davisking commented 7 years ago

Use the python examples that come with dlib.

starzxzx commented 7 years ago

OK, I will try.

starzxzx commented 7 years ago

I ran the cnn_face_detector example, http://dlib.net/cnn_face_detector.py.html, and the result is correct. Thank you.

davisking commented 7 years ago

@ageitgey you might want to look at this. There might be something wrong with your face detection wrapper.

ageitgey commented 7 years ago

@davisking Thanks. I'll take a look over at https://github.com/ageitgey/face_recognition/issues/209 and update it there.

ageitgey commented 7 years ago

@davisking If you change the dlib example http://dlib.net/cnn_face_detector.py.html to upsample 0 times (instead of the default 1), it detects the hand as an additional face with a very low confidence:

    # Pass in 0 instead of 1 for upsampling
    dets = cnn_face_detector(img, 0)

Result:

    Processing file: testface.jpg
    Number of faces detected: 2
    Detection 0: Left: 258 Top: 211 Right: 494 Bottom: 446 Confidence: 1.072353720664978
    Detection 1: Left: 357 Top: 558 Right: 521 Bottom: 722 Confidence: 0.042258620262145996

The demo file he was using, https://github.com/ageitgey/face_recognition/blob/master/examples/find_faces_in_picture_cnn.py, defaulted to zero upsampling instead of 1, which explains the different results you were seeing.

So dlib really is detecting the hand as a face with low confidence with this specific image and upsampling setting. I don't think that's a 'bug' per se, but just sharing the result.
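Since the spurious detection comes back with a much lower confidence than the real face, one way to drop it is to filter on the confidence score. The following is only a minimal sketch: the `Detection` tuple is a hypothetical stand-in for dlib's detection objects, the two entries are the values printed above, and the 0.5 cutoff is an arbitrary illustrative choice, not a value recommended by dlib.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects returned by the CNN detector,
# holding the same fields that the example prints for each detection.
Detection = namedtuple("Detection", "left top right bottom confidence")


def keep_confident(detections, min_confidence):
    """Return only the detections whose confidence is at least min_confidence."""
    return [d for d in detections if d.confidence >= min_confidence]


# The two detections reported above for upsampling = 0.
dets = [
    Detection(258, 211, 494, 446, 1.072353720664978),
    Detection(357, 558, 521, 722, 0.042258620262145996),
]

# 0.5 is an assumed cutoff chosen only for this illustration.
faces = keep_confident(dets, min_confidence=0.5)
for d in faces:
    print(d)
```

With the real detector, the same test would read the score from `d.confidence` and the box from `d.rect`, as the dlib example does when printing its results.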

davisking commented 7 years ago

Yeah, in that case then it's all good. Thanks for checking.

ageitgey commented 7 years ago

Sure, no prob.

7thstorm commented 5 years ago

Does dlib return the actual number of landmarks detected? For example, if a face is slightly turned, it will probably not be able to detect all 68 and will detect, say, 50. Will it tell me how many landmarks it found? My code ignores any detection with 45 or fewer landmarks, but dlib always returns 69 landmarks, even on sideways faces. Is this a dlib issue? Has anyone else faced this?

davisking commented 5 years ago

The shape_predictor always outputs the same number of landmarks, no matter what. It isn't detecting if they are there or not, it's finding the best way to deform the shape to the object present in the image. If you want to know if a particular landmark is occluded or not you need to make some additional model to do that test.

It should be noted that you can easily do this by training a linear svm on the same features the shape_predictor uses. They are available in the feats output of the shape_predictor. That is, you train a binary classifier for each landmark to predict if it is occluded or not. I've done this and it works fine, which is why the feats output is available in the API.
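The per-landmark classifier idea described above could be sketched roughly as follows. This is an illustration under stated assumptions, not dlib's implementation: the SVM is a small hand-written hinge-loss trainer, the two landmark names are hypothetical, and the 2-D feature vectors are made-up toy values standing in for the real feats output of the shape_predictor.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(samples, labels, epochs=50, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss.

    labels are +1 (occluded) or -1 (visible). Returns (weights, bias).
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (dot(w, x) + b)
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin move the model.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def is_occluded(model, feats):
    """Classify one feature vector with a trained (weights, bias) pair."""
    w, b = model
    return dot(w, feats) + b > 0.0


# Toy training data (assumed, not real dlib features): one list of
# (feature_vector, label) pairs per hypothetical landmark.
training_data = {
    "landmark_a": [
        ([-2.0, -1.5], -1), ([-1.5, -2.0], -1), ([-2.5, -2.0], -1), ([-2.0, -2.5], -1),
        ([2.0, 1.5], 1), ([1.5, 2.0], 1), ([2.5, 2.0], 1), ([2.0, 2.5], 1),
    ],
    "landmark_b": [
        ([-2.0, 1.5], -1), ([-1.5, 2.0], -1), ([-2.5, 2.0], -1), ([-2.0, 2.5], -1),
        ([2.0, -1.5], 1), ([1.5, -2.0], 1), ([2.5, -2.0], 1), ([2.0, -2.5], 1),
    ],
}

# Train one independent binary classifier per landmark.
models = {}
for name, rows in training_data.items():
    xs = [feats for feats, _ in rows]
    ys = [label for _, label in rows]
    models[name] = train_linear_svm(xs, ys)

print(is_occluded(models["landmark_a"], [3.0, 3.0]))
print(is_occluded(models["landmark_b"], [-3.0, 3.0]))
```

In practice the training rows would come from images with hand-labeled occlusion flags for each landmark, and an established SVM library would be a more sensible choice than this hand-rolled trainer.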