nextcloud / recognize

👁 👂 Smart media tagging for Nextcloud: recognizes faces, objects, landscapes, music genres
https://apps.nextcloud.com/apps/recognize
GNU Affero General Public License v3.0

Face detection threshold too high #588

Closed: farhills closed this issue 1 year ago

farhills commented 1 year ago

Which version of recognize are you using?

3.3.3

Enabled Modes

Object recognition, Face recognition

TensorFlow mode

WASM mode

Which Nextcloud version do you have installed?

25.0.2

Which Operating system do you have installed?

Unraid 6.10

Which Docker container are you using to run Nextcloud? (if applicable)

Linuxserver : latest (v??)

How much RAM does your server have?

92G

What processor Architecture does your CPU have?

x86_64 (old dual Xeon)

Describe the Bug

Picking up the discussion from #475, the face detection threshold in 3.3.3 seems too conservative. I have just shy of 31,000 files inside my photos directory (see below). Even after excluding landscapes and movies, a significant fraction of those will have people. I would have expected the sum of all clusters to be thousands of detected faces. My three largest clusters contain 69, 33 and 11 photos. The remaining few dozen clusters are in the single digits. That is fewer than 200 face detections in total, spread across ~20 names.

I've also noticed virtually no children and zero baby face detections (I have two young kids, so there should be thousands of photos with them as the subjects).

Lastly, I can't find any group photos with more than one detected face. This could simply be because there are so few detected faces compared to the number of files.

If it's helpful, I can try to restore the DB from a backup made before I started fresh with 3.3.3. I was having issues with mega clusters, so never spent any time merging the clusters to names. But maybe a query to compare the aggregate numbers would be useful? Let me know, happy to help if I can.

File count in my nextcloud data directory, photos subdirectory:

root@xxx: .../farhills/files/Photos# rsync --stats --dry-run -ax . /tmp

Number of files: 31,473 (reg: 30,639, dir: 834)
Number of created files: 31,472 (reg: 30,639, dir: 833)
Number of deleted files: 0
Number of regular files transferred: 30,639
Total file size: 214,089,739,148 bytes
Total transferred file size: 214,089,739,148 bytes
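To put the figures above in proportion, here is a minimal Python sketch of the arithmetic. It uses the 30,639 regular files from the rsync output and treats "fewer than 200" as an upper bound on detections. The function name `detection_rate` is purely illustrative, and not every regular file is necessarily a photo, so this is a rough estimate only.

```python
# Numbers taken from the report above; "fewer than 200" is used as an upper bound.
regular_files = 30_639
face_detections_upper_bound = 200


def detection_rate(detections: int, files: int) -> float:
    """Return detections per file as a fraction (hypothetical helper)."""
    if files <= 0:
        raise ValueError("files must be positive")
    return detections / files


rate = detection_rate(face_detections_upper_bound, regular_files)
print(f"At most {rate:.2%} of files have a detected face")
```

Even under the generous assumption that every detection sits in a different file, well under 1% of the files end up with a detected face.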

Expected Behavior

Detect a significant percentage of faces present in photos.

To Reproduce

Automatic run with standard settings.

Debug log

No response

danyboypremier commented 1 year ago

I'm curious about that too. I notice the same behaviour. 95% of my photos should have faces, but I've only got one or two faces per person.

marcelklehr commented 1 year ago

Unfortunately with a lower threshold a lot of non-faces enter the field. I'm currently sussing out what can be done

shalak commented 1 year ago

@farhills @danyboypremier are you sure that this is a case of a low threshold?

I noticed that when I first uploaded a batch of pictures, I got a lot of new faces. However, at some point, no more new faces appeared. Then, when I uploaded another batch of thousands of photos, some new faces showed up and then nothing more. This happened again with another batch.

I think the recognition problem may be caused by the process crashing (maybe due to running out of memory or encountering a corrupted file). It doesn't restart from the point of failure. But when we have another set of pictures to recognize, it starts fresh and works for a while again.

@marcelklehr I'm sorry to bother you, but I'm really new to the Nextcloud architecture. How could we diagnose the exact process of the face recognition? Does it log anything anywhere? I'm running nextcloud-aio 25.0.3, with app version 3.3.6.

phil-lipp commented 1 year ago

I'm experiencing the same behaviour as you. I even did a full wipe already, after which different faces were classified than in the previous attempt, but at some point in the last couple of weeks it just stopped, and now no new faces are added even though classification seems to be running.

I find the logs that recognize puts out (e.g. by running occ recognize:classify) lackluster when it comes to identifying the problem. The logs don't contain the actual file names, so it's difficult to see whether a certain image file is blocking the classification from progressing, or whether it is progressing at all. If the actual file names were displayed in the logs (maybe as a toggle in the settings), I could look at the pictures and see whether there's actually a face in them or not.

marcelklehr commented 1 year ago

Does it log anything anywhere?

Yep, setting Nextcloud log level to debug should reveal details about the goings-on.

The logs don't contain the actual file names

That's a good point. After the temporary files feature was introduced, the file names are no longer meaningful. I'll fix that.

marcelklehr commented 1 year ago

Face detection stopping inexplicably is also discussed at #562 and #652, fyi

marcelklehr commented 1 year ago

That's a good point. After the temporary files feature was introduced, the file names are no longer meaningful. I'll fix that.

This is fixed now since v3.4.0

shalak commented 1 year ago

Yep, setting Nextcloud log level to debug should reveal details about the goings-on.

So the errors are also on debug log level?

This is fixed now since v3.4.0

How should I re-trigger the classification? Using the "Rescan all files" button or occ recognize:recrawl? What happens if I do it - will it remember my current people names, and the merged people?

marcelklehr commented 1 year ago

So the errors are also on debug log level?

Mh, no. Errors should be at least a warning.
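For anyone who wants to pull those entries out of the log, here is a minimal Python sketch. It assumes the Nextcloud log file is in the default JSON-lines format, with one JSON object per line carrying `app`, `level` and `message` fields, and with numeric levels where 0 is debug and 2 is warning. The function name `recognize_entries` and the sample messages are made up for illustration and are not real Recognize output.

```python
import json
from typing import Iterable, Iterator

# Nextcloud's numeric log levels: 0 debug, 1 info, 2 warning, 3 error, 4 fatal.
WARNING = 2


def recognize_entries(lines: Iterable[str], min_level: int = WARNING,
                      app: str = "recognize") -> Iterator[dict]:
    """Yield parsed log entries for `app` at or above `min_level`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        if entry.get("app") == app and entry.get("level", 0) >= min_level:
            yield entry


# Hypothetical sample lines, only to show the filtering.
sample = [
    '{"level": 0, "app": "recognize", "message": "hypothetical debug line"}',
    '{"level": 3, "app": "recognize", "message": "hypothetical error line"}',
    '{"level": 3, "app": "files", "message": "unrelated app"}',
    'not json',
]
for e in recognize_entries(sample):
    print(e["level"], e["message"])
```

To run it against a real installation, pass an open file handle for the log file instead of `sample`, and use `min_level=0` to include the debug entries as well.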

Using the "Rescan all files" button or occ recognize:recrawl?

They do the same thing.

What happens if I do it - will it remember my current people names, and the merged people?

The recrawl command will re-scan all your files, but will only reprocess faces for files that have no face detections in the database yet. If you want to reprocess those, you'll have to run reset-faces, but then you'll also lose your people names and merged people etc.
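The difference between the two commands can be pictured with a toy Python model. This is only a sketch of the behaviour described above, not Recognize's actual code or database schema. `FaceStore`, its fields, and the file and person names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FaceStore:
    """Toy stand-in for the stored face data; not Recognize's schema."""
    detections: dict[str, list[str]] = field(default_factory=dict)  # file -> face ids
    person_names: dict[str, str] = field(default_factory=dict)      # face id -> name

    def files_to_reprocess(self, all_files: list[str]) -> list[str]:
        # recrawl, as described: only files without stored detections are processed again
        return [f for f in all_files if not self.detections.get(f)]

    def reset_faces(self) -> None:
        # reset-faces, as described: detections and people names are both discarded
        self.detections.clear()
        self.person_names.clear()


store = FaceStore(
    detections={"a.jpg": ["face-1"]},
    person_names={"face-1": "Alice"},
)
files = ["a.jpg", "b.jpg"]

print(store.files_to_reprocess(files))  # only b.jpg, since a.jpg already has a detection
store.reset_faces()
print(store.files_to_reprocess(files))  # both files, but the name "Alice" is gone
```

In this model, a recrawl alone never revisits `a.jpg`, while resetting faces makes every file eligible again but throws away the names.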

marcelklehr commented 1 year ago

Let's continue the conversation about mistakes in face clustering over in the forum: #754

Since v3.3.x was published, we have improved this a lot, though.