ageitgey / face_recognition

The world's simplest facial recognition api for Python and the command line
MIT License
52.79k stars 13.42k forks

Knn classifier #655

Open RachelBaird opened 5 years ago

RachelBaird commented 5 years ago

Let's say I am using face recognition for security purposes. Then, if someone wants access, we take his picture, compute its encoding, and have to match it against all of the encodings in the database?

If we have a KNN classifier trained on the encodings in the database, shouldn't we retrain it every time a new user registers?

How fast is KNN compared to doing all n comparisons (if there are n encodings in the database)?

ageitgey commented 5 years ago

> Then, if someone wants access, we take his picture, compute its encoding, and have to match it against all of the encodings in the database?

Right.
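As a rough illustration of that brute-force matching, here is a minimal sketch using only the standard library (Python 3.8+ for `math.dist`). The `find_best_match` helper, the toy 3-dimensional vectors, and the names are hypothetical; real encodings have 128 dimensions. The 0.6 cutoff is the same value as the default `tolerance` of `face_recognition.compare_faces`. In practice, `face_recognition.face_distance` computes the same distances in one vectorized call.

```python
import math

def find_best_match(known, probe, threshold=0.6):
    """Compare `probe` against every (name, encoding) pair in `known`.

    Returns (name, distance) for the closest entry, or (None, distance)
    when even the closest entry is farther away than `threshold`.
    """
    best_name, best_dist = None, float("inf")
    for name, encoding in known:
        dist = math.dist(encoding, probe)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist

# Toy 3-dimensional stand-ins for real 128-dimensional encodings (made up).
known = [
    ("alice", [0.10, 0.20, 0.30]),
    ("bob", [0.90, 0.80, 0.70]),
]
match, distance = find_best_match(known, [0.12, 0.21, 0.29])
```

The cost of this scan grows linearly with the number of stored encodings, which is the baseline any classifier or index would be compared against.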

> If we have a KNN classifier trained on the encodings in the database, shouldn't we retrain it every time a new user registers?

Yes. You can also use any other kind of classifier model, like a Support Vector classifier instead. Different models may give you better or worse performance for your specific use case.

> How fast is KNN compared to doing all n comparisons (if there are n encodings in the database)?

You would just have to test it. There are too many variables with how you set up and design the database, what kind of database service you are using, how efficient your code is, what implementation of the machine learning model you are using, how many users you have, etc, to tell you for sure.

For example, you might prototype a solution using scikit-learn's KNN model and later move to a large-scale nearest-neighbor implementation like Annoy, depending on your exact needs. That switch would change the performance characteristics.
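One way to "just test it" is to time the plain linear scan on synthetic data of the size you expect, then time whichever model you are considering under the same conditions. The sketch below is only a baseline measurement under assumptions: the encodings are random 128-dimensional vectors rather than real faces, everything lives in memory (no database round trip), and the `linear_scan` and `time_linear_scan` helpers are hypothetical names.

```python
import math
import random
import time

def linear_scan(encodings, probe):
    """Return the index of the stored encoding closest to `probe`."""
    best_index, best_dist = -1, float("inf")
    for i, encoding in enumerate(encodings):
        dist = math.dist(encoding, probe)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

def time_linear_scan(n_encodings, dims=128, repeats=5, seed=0):
    """Average seconds per lookup over `repeats` scans of random data."""
    rng = random.Random(seed)
    encodings = [[rng.random() for _ in range(dims)] for _ in range(n_encodings)]
    probe = [rng.random() for _ in range(dims)]
    start = time.perf_counter()
    for _ in range(repeats):
        linear_scan(encodings, probe)
    return (time.perf_counter() - start) / repeats

# Measure two database sizes; the numbers depend entirely on your machine.
timings = {n: time_linear_scan(n) for n in (100, 1000)}
```

Running the same harness against a fitted model's prediction call would give a like-for-like comparison for your own data sizes.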

andyrey commented 5 years ago

Hi, dear participants, continuing this topic. As I understand the process: my database has N person folders with 5 face encodings in each. When someone tries to get access, his face image is encoded, and this face vector (128 numbers) is compared to each of the N*5 face vectors in the database.
Then, using the KNN principle with, say, k=3, I accept the result (person folder) that has the most encodings among the closest ones (with distance < threshold = 0.6), right? OK, could you please explain why we need to train a KNN model? What difference or advantage does it give over the approach I described above?
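To make the scheme described above concrete, here is a minimal sketch of one reading of it: take the k=3 nearest stored encodings, discard any at or beyond the 0.6 threshold, and accept the person with the most remaining votes. The `knn_vote` function and the toy 2-dimensional vectors are hypothetical; real encodings have 128 dimensions.

```python
import math
from collections import Counter

def knn_vote(known, probe, k=3, threshold=0.6):
    """Pick a person by majority vote among the k nearest encodings.

    `known` is a list of (person, encoding) pairs. Returns the winning
    person, or None if none of the k nearest is closer than `threshold`.
    """
    # Distance from the probe to every stored encoding, nearest first.
    ranked = sorted((math.dist(enc, probe), person) for person, enc in known)
    # Keep the k nearest, then drop any that are not under the threshold.
    votes = Counter(person for dist, person in ranked[:k] if dist < threshold)
    if not votes:
        return None
    # On a tie, the person whose encoding was nearest wins.
    return votes.most_common(1)[0][0]

# Two person folders with three toy encodings each (values are made up).
known = [
    ("alice", [0.10, 0.20]), ("alice", [0.12, 0.18]), ("alice", [0.09, 0.22]),
    ("bob", [0.90, 0.80]), ("bob", [0.88, 0.83]), ("bob", [0.91, 0.79]),
]
accepted = knn_vote(known, [0.11, 0.20])
rejected = knn_vote(known, [5.0, 5.0])
```

This version does a full scan and sort on every lookup, so it involves no training step at all.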