matiasdelellis / facerecognition

Nextcloud app that implements a basic facial recognition system.
GNU Affero General Public License v3.0

Clustering doesn't seem to be working #551

Open KoMa1012 opened 2 years ago

KoMa1012 commented 2 years ago

Hey Guys,

I'm running into an issue with clustering (at least it seems so). I've got a lot of pictures. Faces are being detected, but in most cases the app does not recognize that the person has already been seen.

Expected behaviour

Detect "all" (at least 90%) of my friends' faces and cluster them under the correct person.

Actual behaviour

2331 faces with 1530 persons (I don't recall that many people at my birthday party). Often a face is detected but ends up as a cluster of its own (so no other similar face was matched), even when the same picture was taken twice from the same angle with only minimal change.

When running occ face:background_job there is a message I would not expect to see:

Skipping cluster creation, not enough data (yet) collected. For cluster creation, you need either one of the following:
        * have 1000 faces already processed
        * or you need to have 95% of you images processed

There are more than 1000 faces (for one user; the other user of course has 0) and more than 95% of my images are processed. It also seems to me that some clustering has already been done, since I find e.g. ~100 pictures of myself, ~200 pictures of my wife, ~30 of my sister etc. (but I always had to combine some faces myself; for my own face I had to tell the app that this is me probably 20 times).
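To make the reasoning above concrete, here is a minimal sketch of the precondition as the log message words it. It is not the app's actual code (the app is written in PHP); the function name and parameters are hypothetical, and it assumes the 3694 images reported by occ face:stats have all been processed.

```python
def has_enough_data(faces_processed, images_processed, images_total,
                    min_faces=1000, min_ratio=0.95):
    """Return True if either condition from the log message is met.

    Hypothetical helper: thresholds are taken from the message text,
    not from the app's source.
    """
    if faces_processed >= min_faces:
        return True
    if images_total == 0:
        # No images at all: the ratio is undefined, treat as not enough data.
        return False
    return images_processed / images_total >= min_ratio


# Numbers from the occ face:stats table in this report.
print(has_enough_data(2331, 3694, 3694))  # my_user -> True
print(has_enough_data(0, 0, 0))           # admin   -> False
```

With these numbers both conditions hold for my_user, which is why the "not enough data" message looks wrong here.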

Steps to reproduce

  1. Upload my 3694 pictures to nextcloud
  2. occ face:background_job....
  3. the long wait
  4. see persons in the UI
  5. combine persons (i.e. pictures of the same person that were not clustered together) in the UI
  6. occ face:stats

Server configuration

Client configuration

Logs

Background task log with debug.

occ face:background_job

1/10 - Executing task CheckRequirementsTask (Check all requirements)
2/10 - Executing task CheckCronTask (Check that service is started from either cron or from command)
3/10 - Executing task LockTask (Acquire lock so that only one background task can run)
4/10 - Executing task DisabledUserRemovalTask (Purge all the information of a user when disable the analysis.)
5/10 - Executing task StaleImagesRemovalTask (Crawl for stale images (either missing in filesystem or under .nomedia) and remove them from DB)
6/10 - Executing task CreateClustersTask (Create new persons or update existing persons)
        Clusters already exist, estimated there is no need to recreate them
        Skipping cluster creation, not enough data (yet) collected. For cluster creation, you need either one of the following:
        * have 1000 faces already processed
        * or you need to have 95% of you images processed
        Use stats command to track progress
7/10 - Executing task AddMissingImagesTask (Crawl for missing images for each user and insert them in DB)
        Skipping image scan for user admin that has disabled the analysis
8/10 - Executing task EnumerateImagesMissingFacesTask (Find all images which don't have faces generated for them)
9/10 - Executing task ImageProcessingTask (Process all images to extract faces)
        NOTE: Starting face recognition. If you experience random crashes after this point, please look FAQ at https://github.com/matiasdelellis/facerecognition/wiki/FAQ
10/10 - Executing task UnlockTask (Release obtained lock)

occ face:stats

+--------+--------+-------+---------+
| User   | Images | Faces | Persons |
+--------+--------+-------+---------+
| admin  | 0      | 0     | 0       |
| my_user| 3694   | 2331  | 1530    |
+--------+--------+-------+---------+

matiasdelellis commented 2 years ago

Hi @KoMa1012

When running occ face:background_job there is a message I would not expect to see: Skipping cluster creation, not enough data (yet) collected. For cluster creation, you need either one of the following:

  • have 1000 faces already processed
  • or you need to have 95% of you images processed

You are right about this message. You have already generated the initial clusters, so it doesn't make any sense in this case.

I will improve this. Thanks! 😉

Regarding your other doubts (having to assign yourself 20 times, single-face clusters, etc.): this is the way to ensure good quality in the result. By increasing the sensitivity/clustering thresholds you could reduce your own face clusters to maybe 3, but some errors can be introduced. On the other hand, very small faces (less than 50px) are never clustered, again to avoid unwanted errors. Also, depending on the model you use, side faces are automatically excluded from clustering, again to improve the quality of the result.

See:

well, these articles need to be updated... 😞
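As a rough illustration of the threshold trade-off described above, here is a toy sketch. It uses a simple connected-components grouping on made-up 2D "descriptors", which is a stand-in and not the algorithm the app actually runs; the function name, the data, and the `threshold` parameter are all hypothetical. Only the 50px minimum face size comes from this thread.

```python
from itertools import combinations
from math import dist


def cluster_faces(faces, threshold, min_size=50):
    """Group faces whose descriptors lie within `threshold` of each other.

    faces: list of (descriptor, size_px) tuples (toy data, hypothetical).
    Faces smaller than `min_size` pixels are left out of clustering.
    Returns a sorted list of clusters, each a list of face indices.
    """
    eligible = [i for i, (_, size) in enumerate(faces) if size >= min_size]
    parent = {i: i for i in eligible}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of eligible faces that is close enough.
    for a, b in combinations(eligible, 2):
        if dist(faces[a][0], faces[b][0]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i in eligible:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Faces 0-2: same person, slightly different shots. Face 3: someone else.
# Face 4: too small (30px), so it is never clustered.
faces = [((0.0, 0.0), 120), ((0.3, 0.0), 110), ((0.6, 0.0), 90),
         ((5.0, 5.0), 100), ((0.1, 0.1), 30)]

print(cluster_faces(faces, 0.2))   # strict: every face is its own cluster
print(cluster_faces(faces, 0.4))   # looser: the same person is merged
print(cluster_faces(faces, 10.0))  # too loose: a different person is merged in
```

A strict threshold produces many single-face clusters that have to be combined by hand, while a loose one merges different people into one person, which is the kind of error the defaults try to avoid.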

KoMa1012 commented 2 years ago

Hello @matiasdelellis ,

Thank you for looking into my report! I've been trying some adjustments over the last few days, but I'm mostly happy with the performance of Model 4 with the default settings.

I would love to help with the articles, but unfortunately my knowledge of this app is way too small. I do have an idea for a feature (using Google-detected faces as input) that I would love to see in this app, so I might now need to learn how to develop Nextcloud apps and deliver said idea as a pull request in probably 2-3 years... :-)

Since I'm not sure about the netiquette here, I'll leave this issue open. I assume you will close it once you have fixed the message in background_job.