Similarity is above 50% with 2 different persons

luicfrr commented 7 months ago

Issue Description I've created this codesandbox with my current workspace showing my problem. If you look into the selfies folder you'll see 2 selfies of very different persons. When I compare them, the similarity result is above 50%, and as the docs say:

... results to similarity above 0.5 can be considered a match

In my real app I save every face descriptor whose similarity is above a minimum of 65%. While doing some detection tests I tried comparing person2 with all descriptors saved from person1 and the similarity was above 77%. I'm using the human.match.find method to compare all saved descriptors.
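The gallery comparison described above can be sketched in plain JavaScript. This is only an illustration and not the implementation of human.match.find: the names findBestMatch and similarityFn and the threshold argument are hypothetical. The point it shows is that the best score across many saved descriptors is what gets compared to the threshold, so a larger gallery gives a different person more chances to score high.

```javascript
// Hypothetical helper, not part of Human: find the best-scoring stored descriptor.
// `similarityFn` is any function (a, b) => number where higher means more alike.
// `threshold` is the minimum score that is accepted as a match.
function findBestMatch(probe, stored, similarityFn, threshold) {
  let best = { index: -1, similarity: -Infinity };
  stored.forEach((descriptor, index) => {
    const similarity = similarityFn(probe, descriptor);
    if (similarity > best.similarity) best = { index, similarity };
  });
  // An empty gallery never matches; otherwise the best score must reach the threshold.
  return { ...best, isMatch: best.index >= 0 && best.similarity >= threshold };
}
```

With a stricter threshold passed in, the same best score can flip from a match to a non-match, which is one way to trade false accepts for false rejects.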

This is not an error, but maybe you can help me increase similarity confidence because, as you can see, these are really different persons.

Steps to Reproduce On codesandbox, type yarn start in the terminal and read the logs

Expected Behavior Similarity should be something like 30% or less

Environment

Human library version? 3.1.2
Built-in demo or custom code?
Type of module used (e.g. js, esm, esm-nobundle)? js
TensorFlow/JS version (if not using bundled module)? 4.13.0
Browser or NodeJS and version (e.g. NodeJS 14.15 or Chrome 89)? NodeJS 20
OS and Hardware platform (e.g. Windows 10, Ubuntu Linux on x64, Android 10)? macOS 12.7 and CodeSandbox
Packager (if any) (e.g, webpack, rollup, parcel, esbuild, etc.)?
Framework (if any) (e.g. React, NextJS, etc.)?

Thanks for your help

Donymak commented 7 months ago

In this comment I want to share what I found out about face matching in Human and in general.

To be honest, the faces are similar even though the persons are different. Models usually do not take into account gender, hair or anything else except the FACE itself, so getting high match scores is ok in this case, taking into account that the model for creating face embeddings is not the strongest or the most precise in the world.

The face matching can be affected by:

Face angle
Emotions
Image scale and resolution (dimensions)

The last factor can slightly affect the creation of face embeddings in your case, but usually by no more than 0.05 in my tests on different images.

Also, I noticed that cropping an image to the bounding box of the face and using the cropped image to get embeddings can sometimes increase accuracy, although the library should already do this under the hood.
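As a rough sketch of the cropping idea, the rectangle to cut out can be computed from a detected face box before handing the pixels to whatever image tool is in use. The helper name paddedCropBox, the margin parameter, and the [x, y, width, height] box layout are assumptions made for illustration, and the actual pixel cropping is left out.

```javascript
// Hypothetical helper: expand a face box [x, y, width, height] by a relative
// margin on each side and clamp the result to the image bounds.
function paddedCropBox(box, imageWidth, imageHeight, margin = 0.2) {
  const [x, y, width, height] = box;
  const padX = width * margin;
  const padY = height * margin;
  const left = Math.max(0, Math.round(x - padX));
  const top = Math.max(0, Math.round(y - padY));
  const right = Math.min(imageWidth, Math.round(x + width + padX));
  const bottom = Math.min(imageHeight, Math.round(y + height + padY));
  return { left, top, width: right - left, height: bottom - top };
}
```

A small margin keeps some context around the face. Since models differ in how much crop they prefer, the margin is worth testing per model rather than fixing once.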

Human: Have a look at the different options for face comparison using Human in the extended code sandbox fork.

As you can see there, removing the order option from the match function changes the similarity score to 0.45, and it is not a match anymore.
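To illustrate why an order setting can move the score, here is a minimal sketch of a Minkowski distance, where the order selects between Manhattan (1), Euclidean (2) and higher-order distances. This is not Human's code: minkowskiDistance and distanceToSimilarity are hypothetical names, and the mapping from distance to similarity is an assumption chosen only to make the effect visible.

```javascript
// Minkowski distance of order p between two equal-length vectors.
// p = 1 is Manhattan distance, p = 2 is Euclidean distance.
function minkowskiDistance(a, b, p = 2) {
  if (a.length !== b.length) throw new Error('descriptors must have the same length');
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]) ** p;
  return sum ** (1 / p);
}

// Hypothetical mapping from a distance to a score in (0, 1].
// This is NOT the normalization Human applies; it only shows the principle.
function distanceToSimilarity(distance) {
  return 1 / (1 + distance);
}
```

For the same pair of vectors, a lower order gives a distance that is at least as large, so the resulting similarity is at most as high. A threshold tuned for one order therefore does not carry over to another.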

For my project, I considered using cosine similarity to compare face embeddings instead of Euclidean or Manhattan distance based similarity metrics. In the code sandbox I installed the package with this function.
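For reference, cosine similarity itself is small enough to write by hand. The sketch below is a generic version and not the package used in the sandbox; the function name cosineSimilarity is an assumption, and returning 0 for a zero-length vector is a convention chosen here.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns a value in [-1, 1]: 1 for the same direction, 0 for orthogonal vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('descriptors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine is undefined for a zero vector; this sketch treats it as no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because cosine similarity ignores vector length and only compares direction, its values are not directly comparable with distance-based scores, so any match threshold has to be chosen again for this metric.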

In general: It was also interesting for me to test the images using different tech. Here is what I got.

AWS Rekognition: Uncompressed: "Similarity": 0.7741245627403259, Compressed: "Similarity": 0.8348438739776611. In both cases Amazon does not report a match, but the similarity value is pretty high.

Python FaceNet with vggface2 model:

Cosine similarity between img1 and img2: 0.12417466938495636
Cosine similarity between img1 and img3: 0.12017473578453064

The similarity metrics from this model are precise, as it was trained on 3.3M images, and it is the best free open source model I found.

The most "precise", however, is the Regula Face SDK Demo, with a score of "similarity": 0.0084031. But it may use additional normalisation to push results toward the extremes, such as 0-0.1 for no match and 0.9-0.999 for a match, based on the raw similarity value that comes out of its private model.

luicfrr commented 7 months ago

@Donymak Thanks for your amazing explanation.

While doing more internal tests I decided to try vlad's face-api package and, to my surprise, the similarity level dropped to 20% with all default settings and exactly the same images.

I don't know if this happens because of the models used in face-api or because it uses only a 68-point face mesh while human uses a 468-point one, but comparing both results, face-api seems to be more precise at comparing two faces.

vladmandic commented 7 months ago

@Donymak thanks for the write-up, it's pretty much spot on.

one correction - number of face-points in mesh (68 vs 468) is not related to face recognition at all, those are separate models.

one thing to add is that different face recognition models have different levels of sensitivity to crop factor and face rotation angle and also prefer very different values to start with. in some cases, those can be quite significant.

@nonam4 did you try using different face recognition modules in human? there are several supported. default one is just default because it's light enough to run without major impact, but there are better ones - it's always a tradeoff between performance and precision.

luicfrr commented 7 months ago

@vladmandic No, I haven't tested other models. Based on your knowledge, do you have any suggestions on which model I should test first?

vladmandic commented 7 months ago

i believe insightface is most precise, but also most sensitive to cropping and preprocessing in general.

vladmandic / human

Similarity is above 50% with 2 different persons #401