immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Spurious face recognition (random photos of random things and people recognized as one face) #12450

Open eacunha opened 2 months ago

eacunha commented 2 months ago

The bug

Immich grouped about 200 completely unrelated photos (different things, people, landscapes, food, game screenshots) into a single person in face recognition. The attached images below make the issue obvious.

image

And here are some of the photos that were associated with this "person":

image

As can easily be seen, the first is a dog against a black background, the second is a screenshot from Genshin Impact, the third is a group of people on grass, the fourth is a completely unrelated group of people in the snow, the fifth is some birds in the rain, and the last is some food in a pan. These photos are 100% unrelated and should not have been bundled together as one person.

Any way I can "delete" this person?

The OS that Immich Server is running on

Raspberry Pi OS 64 bit latest version

Version of Immich Server

v1.113.1

Version of Immich Mobile App

N/A

Platform with the issue

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.transcoding.yml
    #   service: cpu # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /home/edu_adm/EGNAS2_shared_folder:/home/edu_adm/EGNAS2_shared_folder
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always
    healthcheck:
      disable: false

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:e3b17ba9479deec4b7d1eeec1548a253acc5374d68d3b27937fcfe4df8d18c7e
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; >
      interval: 5m
      start_interval: 30s
      start_period: 5m
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always

volumes:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=./library
# The location where your database files are stored
DB_DATA_LOCATION=./postgres

# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
# TZ=Etc/UTC

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
# Please use only the characters `A-Za-z0-9`, without special characters or spaces
DB_PASSWORD=<my_pass_redacted_to_post_here_on_github!>

# The values below this line do not need to be changed
###################################################################################
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

Reproduction steps

I think it might be difficult to reproduce, but once it happens it can be observed by:

  1. Go to "explore"
  2. Click on "view all" of "People"
  3. Scroll to find the affected "Person" (which is a spurious collection of Photos) ...

Relevant log output

No response

Additional information

If you need help to debug this I am available to jump on discord to share the screen/do what is needed to help : )

bo0tzz commented 2 months ago

Have you changed any of your machine learning settings at all?

eacunha commented 2 months ago

no, at least not that I know of... how can this be changed?

eacunha commented 2 months ago

These are the settings there, I don't recall changing anything: image

bo0tzz commented 2 months ago

Those seem like just the defaults, so I really have no idea why it would have done this. @mertalev any ideas?

mertalev commented 2 months ago

That's super weird. What do the bounding boxes for these faces (or "faces") look like? You can open an image's info panel and hover over the person to see it.

alextran1502 commented 2 months ago

Did you change the thumbnail settings? Can you post it?

image

eacunha commented 2 months ago

When I mouse over this "person", it detects the face of the genshin character: image

eacunha commented 2 months ago

.. or the "face" of the duck: image

eacunha commented 2 months ago

or a potato: image

eacunha commented 2 months ago

Did you change the thumbnail settings? Can you post it?

image

I think I have not changed that either, here it is: image

mertalev commented 2 months ago

Could you run a few SQL queries for me and share the output for each?

select * from pg_vector_index_stat;
select count(*) from asset_faces;
select count(*) from face_search;
with
  embeddings as (
    select "originalFileName", embedding
    from
      assets
        inner join asset_faces
          on assets.id = asset_faces."assetId"
        inner join face_search
          on asset_faces.id = face_search."faceId"
    where
        assets."originalFileName" in ('20220408091020.png', '20211030_115938.jpg', '20230306_123821.jpg')
  )
select this."originalFileName" image1, other."originalFileName" image2, this.embedding <=> other.embedding distance
from embeddings this, embeddings other;

eacunha commented 2 months ago

Can you help me with how/where exactly I can do that?

mertalev commented 2 months ago

You can run docker exec -it immich_postgres psql --dbname=immich --username=<DB_USERNAME> to connect to the database via the container directly, where <DB_USERNAME> is the value from your .env file. Then, you can just paste in a query and hit enter.
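
For anyone who prefers to run a single query without an interactive session, here is a minimal sketch that builds the same `docker exec ... psql` invocation with psql's `--command` option. The container name, database name and username are taken from the compose and .env files posted above; the helper name `build_psql_command` is hypothetical, and the script only prints the command rather than running it.

```python
import shlex


def build_psql_command(query, container="immich_postgres",
                       dbname="immich", username="postgres"):
    """Return the argv list that runs one SQL query via psql inside the container.

    Defaults mirror the compose/.env values posted in this issue; adjust them
    if your setup differs.
    """
    return [
        "docker", "exec", "-i", container,
        "psql",
        f"--dbname={dbname}",
        f"--username={username}",
        f"--command={query}",
    ]


command = build_psql_command("select count(*) from asset_faces;")
# shlex.join quotes the arguments so the line can be pasted into a shell.
printable = shlex.join(command)
print(printable)
```

To execute it from Python instead of pasting it, the list can be passed to `subprocess.run(command, check=True)` on the Docker host.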

SerAlbi commented 2 months ago

I have the same issue, unfortunately. At least it only happens a couple of times in the People section, so I just hide the 'fake' person detected.

eacunha commented 1 month ago

immich=# select * from pg_vector_index_stat;

 tablerelid | indexrelid |  tablename   | indexname  | idx_status | idx_indexing | idx_tuples | idx_sealed | idx_growing | idx_write | idx_size  | idx_options
------------+------------+--------------+------------+------------+--------------+------------+------------+-------------+-----------+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      17319 |      17331 | smart_search | clip_index | NORMAL     | t            |     130483 | {130413}   | {70}        |         0 | 285602760 | {"vector":{"dimensions":512,"distance":"Cos","kind":"F32"},"segment":{"max_growing_segment_size":20000,"max_sealed_segment_size":1000000},"optimizing":{"sealing_secs":60,"sealing_size":1,"optimizing_threads":2},"indexing":{"hnsw":{"m":16,"ef_construction":300,"quantization":{"trivial":{}}}}}
      17551 |      17575 | face_search  | face_index | NORMAL     | f            |     116850 | {116752}   | {}          |        98 | 257665816 | {"vector":{"dimensions":512,"distance":"Cos","kind":"F32"},"segment":{"max_growing_segment_size":20000,"max_sealed_segment_size":1000000},"optimizing":{"sealing_secs":60,"sealing_size":1,"optimizing_threads":2},"indexing":{"hnsw":{"m":16,"ef_construction":300,"quantization":{"trivial":{}}}}}
(2 rows)

immich=# select count(*) from asset_faces;
 count
-------
 90480
(1 row)

immich=# select count(*) from face_search;
 count
-------
 90480
(1 row)

immich=# with embeddings as ( select "originalFileName", embedding from assets inner join asset_faces on assets.id = asset_faces."assetId" inner join face_search on asset_faces.id = face_search."faceId" where assets."originalFileName" in ('20220408091020.png', '20211030_115938.jpg', '20230306_123821.jpg') ) select this."originalFileName" image1, other."originalFileName" image2, this.embedding <=> other.embedding distance from embeddings this, embeddings other;
       image1        |       image2        |  distance
---------------------+---------------------+------------
 20211030_115938.jpg | 20211030_115938.jpg |          0
 20211030_115938.jpg | 20220408091020.png  |  0.7072995
 20211030_115938.jpg | 20230306_123821.jpg | 0.68505836
 20220408091020.png  | 20211030_115938.jpg |  0.7072995
 20220408091020.png  | 20220408091020.png  |          0
 20220408091020.png  | 20230306_123821.jpg |  0.7904459
 20230306_123821.jpg | 20211030_115938.jpg | 0.68505836
 20230306_123821.jpg | 20220408091020.png  |  0.7904459
 20230306_123821.jpg | 20230306_123821.jpg |          0
(9 rows)
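
To help read the distance column, here is a rough sketch of what a cosine distance computes, assuming the `<=>` operator is a cosine distance (as the "distance":"Cos" index option suggests). The pairwise values are copied from the query output above. The threshold `ASSUMED_MAX_DISTANCE` and the helper `pairs_within` are hypothetical and only illustrate how such distances could be compared against a cutoff; they do not come from this thread or from Immich's code.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Distances reported in the query output (each unordered pair listed once).
reported = {
    ("20211030_115938.jpg", "20220408091020.png"): 0.7072995,
    ("20211030_115938.jpg", "20230306_123821.jpg"): 0.68505836,
    ("20220408091020.png", "20230306_123821.jpg"): 0.7904459,
}

# Hypothetical cutoff for illustration only; check your own settings page.
ASSUMED_MAX_DISTANCE = 0.5


def pairs_within(distances, max_distance):
    """Return the pairs whose distance is at or below the cutoff."""
    return [pair for pair, d in distances.items() if d <= max_distance]


identical = cosine_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
close_pairs = pairs_within(reported, ASSUMED_MAX_DISTANCE)

print(f"identical vectors:  {identical:.6f}")
print(f"orthogonal vectors: {orthogonal:.6f}")
print(f"pairs within {ASSUMED_MAX_DISTANCE}: {close_pairs}")
```

Under that assumed cutoff, none of the three reported pairs would count as a direct match with each other.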