jakowenko / double-take

Unified UI and API for processing and training images for facial recognition.
https://hub.docker.com/r/jakowenko/double-take
MIT License

How much do you train? #65

Closed: alekslyse closed this issue 3 years ago

alekslyse commented 3 years ago

Not an issue, just a question. When I started out I trained on every photo, even the tiny ones, and I've now started retraining. Does detection get better if I only train on pictures where the face is clearly visible in the frame (front and side) and skip the small ones?

jakowenko commented 3 years ago

I would suggest only using high-quality images for training. On mobile, the upload button on the train tab lets you access your phone's camera. I've been taking selfies for images of myself and using other high-quality photos for friends.

Try not to use ones with tiny faces; those will probably lead to more false positives.

alekslyse commented 3 years ago

That is interesting, as 99% of the photos taken by a surveillance camera can't really be called high-res. I would suggest adding this as a tip in the documentation for initial training. Maybe it would be possible to implement video training, where you walk 360 degrees around a person with a high-res camera (e.g. a new phone) and Double Take extracts every frame and trains on it for that person; that might train very well.
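
A rough sketch of that frame-extraction step, assuming OpenCV; the video path, output directory, and sampling interval are placeholders, and the saved stills would still be uploaded through the train page by hand:

```python
# Sketch only: sample every Nth frame from a short walk-around clip with
# OpenCV so the stills can be uploaded on the train page. Paths and the
# sampling interval are placeholders, not anything Double Take provides.
import cv2

def extract_frames(video_path: str, out_dir: str, every_n: int = 15) -> int:
    cap = cv2.VideoCapture(video_path)
    saved = index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % every_n == 0:
            cv2.imwrite(f"{out_dir}/frame_{index:05d}.jpg", frame)
            saved += 1
        index += 1
    cap.release()
    return saved

print(extract_frames("walkaround.mp4", "training-frames"))
```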

So you don't generally train from the surveillance footage? In my experience only a very few frames show the face clearly, and those do get recognized, though they are also very easy to mismatch with, for example, my brother, who looks similar.

I'm running all three detection services, so it's a bit hit and miss between them.

ozett commented 3 years ago

Maybe a more precise question would be: what training photos are needed for the best recognition results? Size, quality, orientation, lighting, count/number...?

jakowenko commented 3 years ago

Hey @ozett, it's hard to say exactly because all of that depends on the detectors you are using.

Someone in a previous issue told me they read the following about CompreFace, so it's probably a good general guideline.

I read a statement from one of the CompreFace developers on Reddit (can't find the source now, though) that the neural network, at least in CompreFace, uses a resolution of 160x160, so the face should be at least that size; the higher the better, up to about 640x480 as a maximum, if I recall correctly.
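
If you want to enforce that rule when picking training images, here's a minimal sketch assuming OpenCV and its bundled Haar cascade; the cascade is just a stand-in for whatever face detector you actually run, and the 160-pixel threshold is the figure from the statement above:

```python
# Sketch: keep only images whose detected face is at least 160x160 px,
# per the CompreFace statement above. The Haar cascade is a stand-in
# for whatever face detector you actually use.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def face_big_enough(image_path: str, min_size: int = 160) -> bool:
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        return False
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return any(w >= min_size and h >= min_size for (_, _, w, h) in faces)
```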

I personally have around 30 images of myself. I use the train page in the UI and upload selfies taken with my phone. This way I get high-resolution images of my face, which should provide the best results. All the detectors should crop and orient the photo, but using photos with as much of your face visible as possible is probably best.

ozett commented 3 years ago

Thanks for the reply.

I will check for faces (not whole images) bigger than 160x160. What is the minimum count for good training results: 30 images per person?

Is there also best-practice advice on face orientation for better training results, i.e. different shooting angles (up/down, side, front, slightly inclined) and black/white?

What helps the CompreFace detector? Or do I have to experiment a bit?

ozett commented 3 years ago

Previously I had some simple custom code running MediaPipe detection for a face mesh. It was triggered by motion detection from the camera for a fixed time period. Now I have all these short videos with people in them. Could I use the face-mesh hits to extract faces and use them for training?

Edit: (green confirmation... may I go down this road?)

[attached screenshot]
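
For the extraction step, a minimal sketch assuming the legacy MediaPipe face-detection solution rather than the face mesh; paths are placeholders, and the 160-pixel minimum echoes the CompreFace note above:

```python
# Sketch: crop faces from a short clip with MediaPipe face detection so the
# crops can be reviewed and uploaded for training. Paths are placeholders;
# the 160 px minimum follows the CompreFace note above.
import cv2
import mediapipe as mp

def extract_faces(video_path: str, out_dir: str, min_size: int = 160) -> int:
    saved = index = 0
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.face_detection.FaceDetection(
        model_selection=1, min_detection_confidence=0.6
    ) as detector:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = detector.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            h, w = frame.shape[:2]
            for det in results.detections or []:
                box = det.location_data.relative_bounding_box
                x, y = max(int(box.xmin * w), 0), max(int(box.ymin * h), 0)
                bw, bh = int(box.width * w), int(box.height * h)
                if bw >= min_size and bh >= min_size:
                    cv2.imwrite(f"{out_dir}/face_{index:05d}.jpg",
                                frame[y:y + bh, x:x + bw])
                    saved += 1
            index += 1
    cap.release()
    return saved
```

Reviewing the crops by hand before uploading would avoid training on blurry or misdetected frames.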

georgbachmann commented 8 months ago

I also wanted to know if there is a way to tell e.g. CompreFace that when it is 98% sure it's me, it actually isn't. Like train it on faces that I am NOT?