ageitgey / face_recognition

The world's simplest facial recognition api for Python and the command line
MIT License
53.12k stars 13.46k forks

Anyone using this with a lot of known faces? #187

Closed AlainPilon closed 7 years ago

AlainPilon commented 7 years ago

By a lot, I mean over 1,000 known faces. Any advice?

I am about to start a prototype using this library so any insight about using this at scale would be great!

lolstatsguy commented 7 years ago

Yes, I use it with 10k pictures. Speed is not an issue; accuracy is. I am trying various clustering options at the moment. If you only use a single picture per person, it is impossible to get above 85%. Also, pictures tend to be most unreliable if the person is between 18 and 24 years old.

AlainPilon commented 7 years ago

Thanks,

Obviously, you are right, accuracy is the most important, but I was worried about the memory needed to store thousands of encoded faces. I checked after posting this ticket, and at about 1120 bytes per face, it should not become an issue anytime soon. Thx
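For a rough sense of scale, the arithmetic can be sketched as below. This is only an estimate under stated assumptions: it assumes each encoding is a vector of 128 float64 values, and it takes the 1120 bytes per face figure from the comment above at face value (the gap between the raw payload and that figure is presumably per-object overhead). The function names are made up for illustration.

```python
import struct

# Assumption: one face encoding is a vector of 128 float64 values.
ENCODING_DIMS = 128
BYTES_PER_FLOAT64 = struct.calcsize("d")  # size of a C double, 8 bytes


def raw_encoding_bytes(num_faces: int) -> int:
    """Raw payload size of the encodings, excluding any container overhead."""
    return num_faces * ENCODING_DIMS * BYTES_PER_FLOAT64


def estimated_bytes(num_faces: int, bytes_per_face: int = 1120) -> int:
    """Estimate based on the per-face figure reported in this thread."""
    return num_faces * bytes_per_face


if __name__ == "__main__":
    for n in (1_000, 10_000, 1_000_000):
        megabytes = estimated_bytes(n) / 1e6
        print(f"{n:>9} faces: about {megabytes:.1f} MB")
```

Even at a million known faces, this estimate stays in the low gigabytes, which matches the conclusion that storage is unlikely to be the bottleneck.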

arasharchor commented 7 years ago

@AlainPilon @lolstatsguy For me, storage is not a problem either. The problem is that mean squared error alone does not give good results. How would clustering help in differentiating unknown people from known people? I mean, how do you put unknown faces in an "unknown" cluster? In this library, only one image is used per person, although we could get more images of known people, and I still do not know how to recognize a new image of a known person.

AlainPilon commented 7 years ago

I am curious, how bad are your results? Because for me, I am clearly over 90% with my initial tests. Do you have an example dataset?

You can add the same person to your list of known faces as many times as you want. That is what I do: whenever an unknown face is later manually identified, I append the encoding to the list of known faces with the same user_id. I doubt that averaging or applying any other aggregation function to the encodings would yield anything good.
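A minimal sketch of that approach, not the exact code used by anyone in this thread: keep one entry per known photo, match a new encoding to the nearest stored one by Euclidean distance, and treat anything farther than a tolerance as unknown. The helper names `add_known_face` and `identify` are hypothetical, the encodings are assumed to have been computed already, and the 0.6 tolerance mirrors the library's default for face comparison; it would likely need tuning on real data.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Hypothetical in-memory store: one (user_id, encoding) pair per known photo,
# so the same user_id may appear many times.
KnownFace = Tuple[str, Tuple[float, ...]]


def add_known_face(known: List[KnownFace], user_id: str,
                   encoding: Sequence[float]) -> None:
    """Append another encoding for user_id (e.g. after a manual identification)."""
    known.append((user_id, tuple(encoding)))


def identify(known: List[KnownFace], encoding: Sequence[float],
             tolerance: float = 0.6) -> Optional[str]:
    """Return the user_id of the nearest known encoding, or None if unknown.

    A face counts as unknown when even the closest stored encoding is
    farther away than `tolerance` (Euclidean distance).
    """
    best_id: Optional[str] = None
    best_dist = float("inf")
    for user_id, known_encoding in known:
        dist = math.dist(known_encoding, encoding)
        if dist < best_dist:
            best_id, best_dist = user_id, dist
    return best_id if best_dist <= tolerance else None


if __name__ == "__main__":
    # Toy 3-dimensional vectors stand in for real face encodings.
    known: List[KnownFace] = []
    add_known_face(known, "alice", [0.0, 0.0, 0.0])
    add_known_face(known, "alice", [0.1, 0.0, 0.0])  # second photo, same person
    add_known_face(known, "bob", [1.0, 1.0, 1.0])
    print(identify(known, [0.05, 0.0, 0.0]))
    print(identify(known, [5.0, 5.0, 5.0]))
```

Keeping every encoding, rather than averaging them, means a new photo only has to resemble one of the stored photos of a person. The linear scan is fine for a few thousand entries; larger sets might call for a vectorized or indexed search.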

ageitgey commented 7 years ago

@smajida It's entirely possible that you'll get great results for one data set and worse results for another, depending on the age of the people in the photos, how similar the people are to the people in the original model training set, etc. So without knowing more about the specific data you are classifying, it's hard to say anything helpful.