Failing to detect faces in selfies

takuya-takeuchi / FaceRecognitionDotNet

The world's simplest facial recognition api for .NET on Windows, MacOS and Linux

MIT License


Failing to detect faces in selfies #183

Open gillonba opened 3 years ago

gillonba commented 3 years ago

Is there a maximum image size to reliably detect faces? I have an application that has been working quite well, but I recently tried to start processing selfies captured from a phone camera and I can't seem to detect any faces, and I wonder if the face is just too large. I don't recall the exact resolution of the images being captured, but it would definitely be much larger than what I had been using before. Is there a known maximum image size to reliably identify faces? I am using HOG, would CNN work better? I hope that isn't the solution because I am running it on a machine that doesn't support CUDA. Resizing the images would be a much better solution for me I think. I'm really liking this library otherwise!

takuya-takeuchi commented 3 years ago

@gillonba

> Is there a maximum image size to reliably detect faces?

No. Detection depends on the size of the face within the image rather than on the image size itself, so a large face that occupies most of the image should be easy to detect.

HOG should not depend on image size or face size. HOG stands for Histogram of Oriented Gradients and, unlike CNN, it does not resize the image while processing it. See http://dlib.net/face_detector.py.html

If you only want to detect faces, without recognizing them, you can use

These libraries depend on NcnnDotNet. They can detect small faces.

gillonba commented 2 years ago

Ok. I wondered if there was something specific to phone selfies because I have had pretty good luck otherwise. Ultimately the point of the project is face recognition, so just detection is not good enough. Would it work to look into an improved face detection algorithm? If I remember correctly, I call FaceLocations() followed by FaceEncodings(), so would it maybe work to use some other library to detect the face and still use FaceEncodings() to generate an encoding? Or is it too fragile to work if the bounding rectangle isn't just right?

takuya-takeuchi commented 2 years ago

@gillonba

> If I remember correctly, I call FaceLocations() followed by FaceEncodings(), so would it maybe work to use some other library to detect the face and still use FaceEncodings() to generate an encoding?

You are correct. FRDN can accept a face location when extracting a face encoding, so you can pass in a face location retrieved from another library. I have not tried this myself yet, though.

However, we must keep both FAR (false acceptance rate) and FRR (false rejection rate) low. An encoding generated from a small face is of little use, even if the library is able to produce one.
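To make the FAR/FRR trade-off concrete, here is a minimal sketch of how the two rates could be measured for a given distance tolerance. It is not part of FRDN: the function name `far_frr`, the sample distances, and the tolerance of 0.6 are all assumptions chosen for illustration, and real numbers would come from comparing encodings of labeled image pairs.

```python
from typing import Sequence, Tuple


def far_frr(
    genuine_distances: Sequence[float],
    impostor_distances: Sequence[float],
    tolerance: float,
) -> Tuple[float, float]:
    """Return (FAR, FRR) when a pair is accepted if distance <= tolerance.

    genuine_distances: distances between encodings of the same person.
    impostor_distances: distances between encodings of different people.
    """
    if not genuine_distances or not impostor_distances:
        raise ValueError("both distance lists must be non-empty")
    # False accept: two different people judged to be the same person.
    false_accepts = sum(1 for d in impostor_distances if d <= tolerance)
    # False reject: the same person judged to be two different people.
    false_rejects = sum(1 for d in genuine_distances if d > tolerance)
    far = false_accepts / len(impostor_distances)
    frr = false_rejects / len(genuine_distances)
    return far, frr


# Made-up distances, for illustration only.
genuine = [0.30, 0.42, 0.55, 0.65]
impostor = [0.58, 0.70, 0.81, 0.92]
far, frr = far_frr(genuine, impostor, tolerance=0.6)
print(f"FAR={far:.2f} FRR={frr:.2f}")
```

Lowering the tolerance reduces FAR at the cost of a higher FRR, and raising it does the opposite. Measuring both rates separately for small and large faces would show whether small faces are degrading the encodings.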

You can integrate a CustomFaceDetector into FRDN by implementing a class derived from https://github.com/takuya-takeuchi/FaceRecognitionDotNet/blob/master/src/FaceRecognitionDotNet/Extensions/FaceDetector.cs. That makes it easy to try other libraries.
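When feeding boxes from another detector into an encoder, one practical detail is the box format. The sketch below assumes the external detector reports `(x, y, width, height)` and the consumer expects `(left, top, right, bottom)` edges clamped to the image. Both conventions and the helper name `xywh_to_ltrb` are assumptions for illustration, so check what each library actually uses.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def xywh_to_ltrb(
    x: int, y: int, w: int, h: int, image_width: int, image_height: int
) -> Box:
    """Convert an (x, y, width, height) box to clamped edge coordinates."""
    # Clamp each edge so the box never extends past the image borders.
    left = max(0, x)
    top = max(0, y)
    right = min(image_width, x + w)
    bottom = min(image_height, y + h)
    if right <= left or bottom <= top:
        raise ValueError("box lies entirely outside the image")
    return left, top, right, bottom


# A detection that spills over the left border of a 640x480 image.
print(xywh_to_ltrb(-10, 20, 110, 80, image_width=640, image_height=480))
```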

gillonba commented 2 years ago

Thanks for the suggestion. I bit the bullet and got CUDA working and CNN seems to be working quite well, though I'm having trouble with high-resolution images. I guess I need to scale them down myself before passing them in to the locator?

I also tried UltraFace and while the speed is good and I have no problems getting the encodings from the locations it provides, I am getting a ton of false positives. I assume I can play with the ScoreThreshold but at this point it seems to have little advantage for me over CNN. It is really good to know that I can use other libraries though, and I'll keep it in mind in case I ever start playing with mobile. I'll post a test comparing all three methods once I have it refined a bit.
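The effect of a score threshold can also be reproduced after detection. The following is a minimal sketch that assumes each detection carries a confidence score between 0 and 1. The `Detection` type, `filter_by_score`, and the sample values are hypothetical and are not the UltraFace API.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int
    score: float  # detector confidence, assumed to lie in [0, 1]


def filter_by_score(
    detections: List[Detection], score_threshold: float
) -> List[Detection]:
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d.score >= score_threshold]


# Made-up detections: two plausible faces and one weak candidate.
candidates = [
    Detection(100, 80, 220, 200, 0.95),
    Detection(400, 90, 500, 190, 0.72),
    Detection(10, 300, 40, 330, 0.41),
]
kept = filter_by_score(candidates, score_threshold=0.7)
print(len(kept))
```

Raising the threshold removes more false positives but can also drop real faces, so it is worth tuning against the same clips used to compare the detectors.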

Have you looked into Microsoft's ML.NET? It would be great to use a library built for C# and officially supported instead of dealing with a wrapper.

takuya-takeuchi commented 2 years ago

good!!

> I guess I need to scale them down myself before passing them in to the locator?

A CNN has convolution layers, so we do not need to take care of scaling the image ourselves. However, small faces in the picture can disappear when the image is fed to the CNN, so we should crop the relevant area and pass the cropped image to the CNN. That may give a better result.
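Cropping before detection means the detector reports boxes in crop coordinates, which then have to be translated back to the full image. The sketch below shows only that bookkeeping. It assumes a rough region of interest is already known (for example from a faster first-pass detector), and the helper names and the margin value are hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def expand_and_clamp(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a region by `margin` pixels per side, staying inside the image."""
    left, top, right, bottom = box
    return (
        max(0, left - margin),
        max(0, top - margin),
        min(width, right + margin),
        min(height, bottom + margin),
    )


def to_full_image(box_in_crop: Box, crop: Box) -> Box:
    """Translate a box found inside `crop` back to full-image coordinates."""
    dx, dy = crop[0], crop[1]  # the crop's top-left corner in the full image
    left, top, right, bottom = box_in_crop
    return (left + dx, top + dy, right + dx, bottom + dy)


# Region of interest inside a 4000x3000 photo, padded by 100 pixels.
crop = expand_and_clamp((1800, 900, 2000, 1100), 100, 4000, 3000)
# Suppose the detector, run on the crop alone, reports this box.
face_in_crop = (100, 100, 300, 300)
print(crop, to_full_image(face_in_crop, crop))
```

The translated box can then be used with the original full-resolution image, so the encoding step still sees the face at its native size.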

> UltraFace and while the speed is good and I have no problems getting the encodings from the locations it provides, I am getting a ton of false positives

With a small face, the encoding may fail to capture important features. This is a common problem, and it is hard to resolve.

gillonba commented 2 years ago

So as promised, here is a test video showing the results from HOG, CNN, and UltraFace. In this case, CNN (with GPU) is the clear winner, being slightly faster than UltraFace and with fewer false positives. Now that the test program is set up, I can easily run it against other video clips if anyone wants to see it run against anything else.

https://rumble.com/vpghpz-alita-trailer-1-with-face-recognition.html