matiasdelellis / facerecognition

Nextcloud app that implements a basic facial recognition system.
GNU Affero General Public License v3.0
514 stars 45 forks

Seems that the detector can return a duplicate face #271

Open matiasdelellis opened 4 years ago

matiasdelellis commented 4 years ago

Expected behaviour

The application should detect one face for each real face. Missing a face is acceptable, but the same face should never be detected twice.

Actual behaviour

Very rarely, the detector detects the same face twice.

Steps to reproduce

  1. Add photos.
  2. Run background task.
  3. Evaluate faces.

Example

[image]

Rajesh is detected twice. :disappointed:

I added the photo again several times and could not reproduce it. At first I assumed it was an error in the SQL queries, but by checking the database directly I can confirm that all these faces come from a single analysis of this photo.

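| id | image | person | left | right | top | bottom | confidence |
|---|---|---|---|---|---|---|---|
| 80695 | 68612 | 110996 | 925 | 1003 | 291 | 369 | 1.0780631303787 |
| 80696 | 68612 | 109205 | 520 | 598 | 196 | 274 | 1.0709439516068 |
| 80697 | 68612 | 110790 | 211 | 289 | 307 | 385 | 0.61323189735413 |
| 80698 | 68612 | 109543 | 909 | 1045 | 250 | 385 | 0.33338499069214 |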
80698 68612 109543 909 1045 250 385 0.33338499069214

Specifically, these two rows are both Rajesh:

| id | image | person | left | right | top | bottom | confidence |
|---|---|---|---|---|---|---|---|
| 80695 | 68612 | 110996 | 925 | 1003 | 291 | 369 | 1.0780631303787 |
| 80698 | 68612 | 109543 | 909 | 1045 | 250 | 385 | 0.33338499069214 |

You can see that the two rectangles cover nearly the same region, but the confidence of the second one is much lower.

Well, this does not block the next release, because the duplicate face has low confidence and is therefore not grouped, but we should consider this problem.

matiasdelellis commented 4 years ago

[image]

The face with the highest confidence is the small one.

matiasdelellis commented 4 years ago

Landmarks of the face with high confidence:

[image]

The same face with lower confidence:

[image]

stalker314314 commented 4 years ago

It is interesting to note that Rajesh 80695 is completely inside Rajesh 80698. If it is always like this, maybe we can exclude those faces immediately after face detection returns from dlib.
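A minimal sketch of that containment check, written in Python for illustration rather than in the app's own code. The `Face` tuple and the `contains` helper are hypothetical names, not part of facerecognition or dlib; the coordinates are the two Rajesh rows from the table above.

```python
from typing import NamedTuple


class Face(NamedTuple):
    """One detection row: pixel bounds plus the detector confidence."""
    id: int
    left: int
    right: int
    top: int
    bottom: int
    confidence: float


def contains(outer: Face, inner: Face) -> bool:
    """Return True when inner's rectangle lies entirely within outer's."""
    return (
        outer.left <= inner.left
        and outer.right >= inner.right
        and outer.top <= inner.top
        and outer.bottom >= inner.bottom
    )


# The two Rajesh rows from the database dump.
small = Face(80695, 925, 1003, 291, 369, 1.0780631303787)
large = Face(80698, 909, 1045, 250, 385, 0.33338499069214)

print(contains(large, small))  # True
print(contains(small, large))  # False
```

A check like this only catches full containment; partially overlapping duplicates would need an overlap measure instead.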

matiasdelellis commented 4 years ago

In principle, yes. It should be checked right when the faces are detected, and the duplicate discarded, but I think it is unlikely that one will always be inside the other. Surely there are cases where the two rectangles share only 50% of their area, and those are probably duplicates too.

Since the duplicate face with the bad landmarks also has low confidence, I am not worried for now, at least with this single example.

I will try to do something automatic to find all the cases and then evaluate them, but that will be for another day. :wink:

P.S.: This also opens the door to reanalyzing the photos: first with small images to get results quickly, then gradually increasing the image size to find new faces and get higher-quality descriptors. Or, when you change models, taking the names from the previous model. :open_mouth:

matiasdelellis commented 4 years ago

These are all the cases where two faces overlap by at least one pixel. The first number is the overlap percentage, and the other two are the confidences of both faces.

There are only two cases where one face is completely inside the other. In these cases, it seems the duplicate can be discarded directly (overlap > 30% and low confidence).

When the overlap is less than 10 percent, both faces seem reliable, but we must still consider the confidence when clustering.

The other cases are harder to discern. However, the confidence there is also below 0.99, so they will not be clustered.
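For reference, one common way to express the overlap of two rectangles is intersection over union. The thread does not say which definition the evaluation code uses, so the sketch below is an assumption, and `area` and `overlap_ratio` are hypothetical helper names. The boxes are rows from the database dump earlier in the issue.

```python
from typing import Tuple

# (left, right, top, bottom), the same order as the database columns.
Box = Tuple[int, int, int, int]


def area(box: Box) -> int:
    """Area in pixels; an empty or inverted box counts as zero."""
    left, right, top, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection over union of two rectangles, from 0.0 to 1.0."""
    inter: Box = (max(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), min(a[3], b[3]))
    inter_area = area(inter)
    union_area = area(a) + area(b) - inter_area
    return inter_area / union_area if union_area else 0.0


rajesh_small = (925, 1003, 291, 369)  # face 80695
rajesh_large = (909, 1045, 250, 385)  # face 80698
other_face = (520, 598, 196, 274)     # face 80696

print(round(overlap_ratio(rajesh_small, rajesh_large), 2))  # 0.33
print(overlap_ratio(rajesh_small, other_face))              # 0.0
```

With this definition a small face fully inside a larger one can still score well below 100%, because the union is dominated by the larger rectangle.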

| Faces | With landmarks |
|---|---|
| [image] | [image] |
matiasdelellis commented 4 years ago

Code used: https://github.com/matiasdelellis/facerecognition/compare/evaluate-overlap

matiasdelellis commented 4 years ago

[image]

In my case, I have a single photo with these conditions (let's ignore that I look crazy; it lacks context, haha). But here, unlike the previous cases, the most reliable face is the largest one.

stalker314314 commented 4 years ago

Seems like there is no easy pattern (at least I cannot think of any) to find it heuristically. But why does this bother you? Can't you just brush it off as "neural network intricacies"? What is the worst case? You get the same face twice, let's say even with high confidence both times (high enough to be included in grouping). After grouping is done, you can just ignore one of these faces in JS (maybe pick the one with higher confidence). Other than being a weird and shameful bug (even that is too much of a stretch to say :), is there any other logical problem with this (breaks clustering logic, requires rethinking cluster-person relations...)?
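The "pick the one with higher confidence" idea can be sketched as a greedy filter in the style of non-maximum suppression. This is an illustration in Python, not the app's code: `Face`, `overlap_ratio` and `drop_duplicates` are hypothetical names, the overlap measure is assumed to be intersection over union, and the 0.3 cutoff simply mirrors the "overlap > 30%" observation earlier in the thread.

```python
from typing import List, NamedTuple


class Face(NamedTuple):
    id: int
    left: int
    right: int
    top: int
    bottom: int
    confidence: float


def overlap_ratio(a: Face, b: Face) -> float:
    """Intersection over union of the two face rectangles."""
    width = min(a.right, b.right) - max(a.left, b.left)
    height = min(a.bottom, b.bottom) - max(a.top, b.top)
    if width <= 0 or height <= 0:
        return 0.0
    inter = width * height
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    return inter / (area_a + area_b - inter)


def drop_duplicates(faces: List[Face], max_overlap: float = 0.3) -> List[Face]:
    """Keep the most confident face of every heavily overlapping group."""
    kept: List[Face] = []
    # Visit faces from most to least confident, so a kept face always
    # outranks any later face that overlaps it.
    for face in sorted(faces, key=lambda f: f.confidence, reverse=True):
        if all(overlap_ratio(face, k) <= max_overlap for k in kept):
            kept.append(face)
    return kept


# The four rows of image 68612 from the database dump.
faces = [
    Face(80695, 925, 1003, 291, 369, 1.0780631303787),
    Face(80696, 520, 598, 196, 274, 1.0709439516068),
    Face(80697, 211, 289, 307, 385, 0.61323189735413),
    Face(80698, 909, 1045, 250, 385, 0.33338499069214),
]
print([f.id for f in drop_duplicates(faces)])  # [80695, 80696, 80697]
```

Because the filter ranks by confidence and not by size, it keeps the right face both when the smaller rectangle is the reliable one and when the larger one is.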

Also, do you see this in python dlib too (asking just to see if there is a bug in pdlib)?

matiasdelellis commented 4 years ago

It doesn't bother me, but it's something to improve. :wink:

I was worried when I saw it the first time, but now that I have checked that it is very rare and that, in any case, the second face is not clustered due to its low confidence, I think we can live with it.

> Other than being a weird and shameful bug (even that is too much of a stretch to say :), is there any other logical problem with this (breaks clustering logic, requires rethinking cluster-person relations...)?

It does not break absolutely anything because the second face has low confidence and is never clustered.

It is just confusing to see the same face twice in the sidebar.

> Also, do you see this in python dlib too (asking just to see if there is a bug in pdlib)?

No, but the PDlib code is practically a copy of the python bindings. It should be similar.

> After grouping is done, you can just ignore one of these faces in JS (maybe pick the one with higher confidence).

Following the principle that we don't try to be better than Google: Google Photos only shows the most reliable faces, and allows you to add others.

[image]

In this example, Google Photos shows only 5 faces by default and allows adding Howard, but it does not detect Penny.

[image]

We, instead, are showing 7, and one is garbage.

This is a different problem (and, again, it causes no clustering problem), but the resolution is the same: show only trusted faces, and allow adding others.
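A minimal sketch of that "show trusted faces, offer the rest" split, in Python for illustration only. The `split_by_confidence` helper is a hypothetical name, and reusing the 0.99 value mentioned earlier in the thread as the display cutoff is an assumption; the real threshold would be a design decision for the app.

```python
from typing import Dict, List, Tuple

Face = Dict[str, float]


def split_by_confidence(
    faces: List[Face], threshold: float = 0.99
) -> Tuple[List[Face], List[Face]]:
    """Split faces into those shown by default and those offered as optional."""
    shown = [f for f in faces if f["confidence"] >= threshold]
    optional = [f for f in faces if f["confidence"] < threshold]
    return shown, optional


# Ids and confidences of the four faces of image 68612.
faces = [
    {"id": 80695, "confidence": 1.0780631303787},
    {"id": 80696, "confidence": 1.0709439516068},
    {"id": 80697, "confidence": 0.61323189735413},
    {"id": 80698, "confidence": 0.33338499069214},
]

shown, optional = split_by_confidence(faces)
print([f["id"] for f in shown])     # [80695, 80696]
print([f["id"] for f in optional])  # [80697, 80698]
```

With this split, the duplicate Rajesh (80698) would land in the optional list and would not appear in the sidebar by default.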

amo13 commented 1 year ago

Also noticed that a person can be recognized twice in a single image. Happened quite a few times on my family photos...