immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Machine Learning container stopped working on v1.109.0 #11188

Closed JordyEGNL closed 3 months ago

JordyEGNL commented 3 months ago

The bug

immich-machine-learning stopped working with the v1.109.0 release.

What I have tried:

The OS that Immich Server is running on

Ubuntu Server 22.04 LTS

Version of Immich Server

v1.109.0

Version of Immich Mobile App

v1.109.0

Platform with the issue

Your docker-compose.yml content

services:
  immich-server:
    container_name: immich-server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    group_add:
      - "109"
    extends:
      file: hwaccel.transcoding.yml
      service: quicksync
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 3001:3001
    env_file:
      - ./.env
    depends_on:
      - redis
      - database
    labels:
      traefik.enable: true
      traefik.http.routers.immich.entryPoints: https
      traefik.http.services.immich.loadbalancer.server.port: 3001
      traefik.http.routers.immich.rule: Host(`photos.domain.tld`)
      com.centurylinklabs.watchtower.monitor-only: true
    restart: always
    networks:
      - immich
      - proxy
  immich-machine-learning:
    container_name: immich-machine-learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    env_file:
      - ./.env
    volumes:
      - model-cache:/cache
    labels:
      com.centurylinklabs.watchtower.monitor-only: true
    restart: always
    networks:
      - immich
  redis:
    container_name: immich-redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:328fe6a5822256d065debb36617a8169dbfbd77b797c525288e465f56c1d392b
    healthcheck:
      test: redis-cli ping || exit 1
    labels:
      com.centurylinklabs.watchtower.monitor-only: true
    restart: always
    networks:
      - immich
  database:
    container_name: immich-postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - ./.env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: --data-checksums
    labels:
      com.centurylinklabs.watchtower.monitor-only: true
    volumes:
      - /mnt/media/immich/pgdata:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' ||
        exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}'
        --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT
        COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo
        "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1
      interval: 5m
      start_interval: 30s
      start_period: 5m
    command:
      - postgres
      - -c
      - shared_preload_libraries=vectors.so
      - -c
      - search_path="$$user", public, vectors
      - -c
      - logging_collector=on
      - -c
      - max_wal_size=2GB
      - -c
      - shared_buffers=512MB
      - -c
      - wal_compression=on
    restart: always
    networks:
      - immich
networks:
  proxy:
    name: proxy
    external: true
  immich:
    name: immich
    external: true
volumes:
  model-cache:

Your .env content

IMMICH_VERSION=release
DB_HOSTNAME=immich-postgres
DB_USERNAME=postgres
DB_PASSWORD=REDACTED
DB_DATABASE_NAME=immich
REDIS_HOSTNAME=immich-redis
UPLOAD_LOCATION=/mnt/media/immich/uploads
IMMICH_WEB_URL=http://immich-web:3000
IMMICH_SERVER_URL=http://immich-server:3001
IMMICH_MACHINE_LEARNING_URL=http://immich-machine-learning:3003
TYPESENSE_API_KEY=REDACTED
TZ=Europe/Amsterdam
IMMICH_METRICS=true

Reproduction steps

1. From v1.108.0 do `docker compose pull` and `docker compose down`
2. Then do `docker compose up -d`
3. Watch logs with `docker compose logs -f`
4. See that immich-machine-learning stops with code 2

Relevant log output

immich-machine-learning  | usage: gunicorn [OPTIONS] [APP_MODULE]
immich-machine-learning  | gunicorn: error: argument -w/--workers: invalid int value: ''
immich-machine-learning exited with code 2

Additional information

No response

JordyEGNL commented 3 months ago

I see that immich-server was not yet updated (too impatient?). After updating the immich-server container, everything seems to be working again!

JordyEGNL commented 3 months ago

Nope, I was wrong; the issue came back...

JordyEGNL commented 3 months ago

Logs after updating everything correctly this time:

immich-server has been recreated
immich-server exited with code 143
immich-server has been recreated
immich-server            | Detected CPU Cores: 4
immich-server            | Starting api worker
immich-server            | Starting microservices worker
immich-server            | [Nest] 8  - 07/18/2024, 6:43:13 PM     LOG [Microservices:EventRepository] Initialized websocket server
immich-server            | [Nest] 18  - 07/18/2024, 6:43:14 PM     LOG [Api:EventRepository] Initialized websocket server
immich-machine-learning  | usage: gunicorn [OPTIONS] [APP_MODULE]
immich-machine-learning  | gunicorn: error: argument -w/--workers: invalid int value: ''
immich-machine-learning exited with code 2
immich-machine-learning  | usage: gunicorn [OPTIONS] [APP_MODULE]
immich-machine-learning  | gunicorn: error: argument -w/--workers: invalid int value: ''
immich-machine-learning exited with code 2

DocDrydenn commented 3 months ago

Same here.

alextran1502 commented 3 months ago

Hey @mertalev, is this related to the script?

urbantrout commented 3 months ago

In addition to the error message above, I get:

immich_server            | /usr/src/app/node_modules/sharp/lib/sharp.js:114
immich_server            |   throw new Error(help.join('\n'));
immich_server            |   ^
immich_server            |
immich_server            | Error: Could not load the "sharp" module using the linux-arm64 runtime
immich_server            | ERR_DLOPEN_FAILED: libwebpdemux.so.2: cannot open shared object file: No such file or directory
immich_server            | Possible solutions:
immich_server            | - Ensure optional dependencies can be installed:
immich_server            |     npm install --include=optional sharp
immich_server            |     yarn add sharp --ignore-engines
immich_server            | - Ensure your package manager supports multi-platform installation:
immich_server            |     See https://sharp.pixelplumbing.com/install#cross-platform
immich_server            | - Add platform-specific dependencies:
immich_server            |     npm install --os=linux --cpu=arm64 sharp
immich_server            | - Consult the installation documentation:
immich_server            |     See https://sharp.pixelplumbing.com/install
immich_server            |     at Object.<anonymous> (/usr/src/app/node_modules/sharp/lib/sharp.js:114:9)
immich_server            |     at Module._compile (node:internal/modules/cjs/loader:1358:14)
immich_server            |     at Module._extensions..js (node:internal/modules/cjs/loader:1416:10)
immich_server            |     at Module.load (node:internal/modules/cjs/loader:1208:32)
immich_server            |     at Module._load (node:internal/modules/cjs/loader:1024:12)
immich_server            |     at Module.require (node:internal/modules/cjs/loader:1233:19)
immich_server            |     at require (node:internal/modules/helpers:179:18)
immich_server            |     at Object.<anonymous> (/usr/src/app/node_modules/sharp/lib/constructor.js:10:1)
immich_server            |     at Module._compile (node:internal/modules/cjs/loader:1358:14)
immich_server            |     at Module._extensions..js (node:internal/modules/cjs/loader:1416:10)
immich_server            |
immich_server            | Node.js v20.15.1

9k001 commented 3 months ago

I have the same error.

usage: gunicorn [OPTIONS] [APP_MODULE]
gunicorn: error: argument -w/--workers: invalid int value: ''
usage: gunicorn [OPTIONS] [APP_MODULE]
gunicorn: error: argument -w/--workers: invalid int value: ''
usage: gunicorn [OPTIONS] [APP_MODULE]
gunicorn: error: argument -w/--workers: invalid int value: ''
usage: gunicorn [OPTIONS] [APP_MODULE]
gunicorn: error: argument -w/--workers: invalid int value: ''
usage: gunicorn [OPTIONS] [APP_MODULE]
gunicorn: error: argument -w/--workers: invalid int value: ''

sebasptsch commented 3 months ago

Quoting urbantrout: "In addition to the error message above, I get:"

immich_server            | /usr/src/app/node_modules/sharp/lib/sharp.js:114
immich_server            |   throw new Error(help.join('\n'));
immich_server            |   ^
immich_server            |
immich_server            | Error: Could not load the "sharp" module using the linux-arm64 runtime
immich_server            | ERR_DLOPEN_FAILED: libwebpdemux.so.2: cannot open shared object file: No such file or directory
immich_server            | Possible solutions:
immich_server            | - Ensure optional dependencies can be installed:
immich_server            |     npm install --include=optional sharp
immich_server            |     yarn add sharp --ignore-engines
immich_server            | - Ensure your package manager supports multi-platform installation:
immich_server            |     See https://sharp.pixelplumbing.com/install#cross-platform
immich_server            | - Add platform-specific dependencies:
immich_server            |     npm install --os=linux --cpu=arm64 sharp
immich_server            | - Consult the installation documentation:
immich_server            |     See https://sharp.pixelplumbing.com/install
immich_server            |     at Object.<anonymous> (/usr/src/app/node_modules/sharp/lib/sharp.js:114:9)
immich_server            |     at Module._compile (node:internal/modules/cjs/loader:1358:14)
immich_server            |     at Module._extensions..js (node:internal/modules/cjs/loader:1416:10)
immich_server            |     at Module.load (node:internal/modules/cjs/loader:1208:32)
immich_server            |     at Module._load (node:internal/modules/cjs/loader:1024:12)
immich_server            |     at Module.require (node:internal/modules/cjs/loader:1233:19)
immich_server            |     at require (node:internal/modules/helpers:179:18)
immich_server            |     at Object.<anonymous> (/usr/src/app/node_modules/sharp/lib/constructor.js:10:1)
immich_server            |     at Module._compile (node:internal/modules/cjs/loader:1358:14)
immich_server            |     at Module._extensions..js (node:internal/modules/cjs/loader:1416:10)
immich_server            |
immich_server            | Node.js v20.15.1

I believe this is a separate issue for ARM.

ALERTua commented 3 months ago

I fixed it by adding MACHINE_LEARNING_WORKERS=1 in my .env. It seems this variable is empty by default, so gunicorn starts with -w '' instead of something like -w 1.
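
The usage line and exit code 2 in the logs have the shape of a Python argparse failure, which is consistent with the explanation above. The following is a minimal sketch, assuming a stand-in parser: the `demo` program name and the `parse_workers` helper are hypothetical and are not taken from gunicorn or from Immich's start script. It shows that an int-typed `-w` option rejects an empty string with the same message.

```python
import argparse
import contextlib
import io


def parse_workers(argv):
    """Parse a -w/--workers option with an int-typed argparse option.

    Returns (workers, exit_code, stderr_text); workers is None on failure.
    """
    # Hypothetical stand-in parser, not gunicorn's real one.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("-w", "--workers", type=int, default=1)
    stderr = io.StringIO()
    try:
        with contextlib.redirect_stderr(stderr):
            args = parser.parse_args(argv)
    except SystemExit as exc:
        # argparse prints the usage line plus the error, then exits with status 2.
        return None, exc.code, stderr.getvalue()
    return args.workers, 0, stderr.getvalue()


if __name__ == "__main__":
    print(parse_workers(["-w", ""]))   # empty value, as when the variable expands to nothing
    print(parse_workers(["-w", "1"]))  # explicit value
    print(parse_workers([]))           # option omitted, so the parser default applies
```

The third call illustrates why an unset option and an empty one behave differently: omitting `-w` lets the parser fall back to its default, while passing `-w ''` forces a conversion that cannot succeed.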

JordyEGNL commented 3 months ago

Quoting ALERTua: "I fixed it by adding MACHINE_LEARNING_WORKERS=1 in my .env"

Can confirm that this fixes the issue:

immich-machine-learning  | [07/18/24 19:01:53] INFO     Starting gunicorn 22.0.0                           
immich-machine-learning  | [07/18/24 19:01:53] INFO     Listening at: http://[::]:3003 (10)                
immich-machine-learning  | [07/18/24 19:01:53] INFO     Using worker: app.config.CustomUvicornWorker       
immich-machine-learning  | [07/18/24 19:01:53] INFO     Booting worker with pid: 11                        
immich-machine-learning  | [07/18/24 19:01:59] INFO     Started server process [11]                        
immich-machine-learning  | [07/18/24 19:01:59] INFO     Waiting for application startup.                   
immich-machine-learning  | [07/18/24 19:01:59] INFO     Created in-memory cache with unloading after 300s  
immich-machine-learning  |                              of inactivity.                                     
immich-machine-learning  | [07/18/24 19:01:59] INFO     Initialized request thread pool with 4 threads.    
immich-machine-learning  | [07/18/24 19:01:59] INFO     Application startup complete.  

urbantrout commented 3 months ago

immich_server            | /usr/src/app/node_modules/sharp/lib/sharp.js:114
immich_server            |   throw new Error(help.join('\n'));
immich_server            |   ^
immich_server            |
immich_server            | Error: Could not load the "sharp" module using the linux-arm64 runtime
immich_server            | ERR_DLOPEN_FAILED: libwebpdemux.so.2: cannot open shared object file: No such file or directory
immich_server            | Possible solutions:
immich_server            | - Ensure optional dependencies can be installed:
immich_server            |     npm install --include=optional sharp
immich_server            |     yarn add sharp --ignore-engines
immich_server            | - Ensure your package manager supports multi-platform installation:
immich_server            |     See https://sharp.pixelplumbing.com/install#cross-platform
immich_server            | - Add platform-specific dependencies:
immich_server            |     npm install --os=linux --cpu=arm64 sharp
immich_server            | - Consult the installation documentation:
immich_server            |     See https://sharp.pixelplumbing.com/install
immich_server            |     at Object.<anonymous> (/usr/src/app/node_modules/sharp/lib/sharp.js:114:9)
immich_server            |     at Module._compile (node:internal/modules/cjs/loader:1358:14)
immich_server            |     at Module._extensions..js (node:internal/modules/cjs/loader:1416:10)
immich_server            |     at Module.load (node:internal/modules/cjs/loader:1208:32)
immich_server            |     at Module._load (node:internal/modules/cjs/loader:1024:12)
immich_server            |     at Module.require (node:internal/modules/cjs/loader:1233:19)
immich_server            |     at require (node:internal/modules/helpers:179:18)
immich_server            |     at Object.<anonymous> (/usr/src/app/node_modules/sharp/lib/constructor.js:10:1)
immich_server            |     at Module._compile (node:internal/modules/cjs/loader:1358:14)
immich_server            |     at Module._extensions..js (node:internal/modules/cjs/loader:1416:10)
immich_server            |
immich_server            | Node.js v20.15.1

Gonna open a new issue for this.

urbantrout commented 3 months ago

Ah, there already is #11189

gnownairb95 commented 3 months ago

Quoting ALERTua: "I fixed it by adding MACHINE_LEARNING_WORKERS=1 in my .env"

This worked for me too!
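
For anyone applying the workaround quoted above, a quick check that the .env file really carries a usable value can rule out a typo. This is only a sketch under stated assumptions: `env_value` and `has_usable_workers` are hypothetical helper names, and the parser is deliberately simplified (no variable interpolation, no `export` prefixes, no multi-line values), so it does not reproduce Docker Compose's full .env handling. It also keeps the last assignment when a key appears more than once, which is a choice made for this sketch.

```python
def env_value(env_text, key):
    """Return the value assigned to key in .env-style text, or None if absent."""
    value = None
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, rest = line.partition("=")
        if name.strip() == key:
            # Sketch-level choice: the last assignment of the key wins.
            value = rest.strip().strip("'\"")
    return value


def has_usable_workers(env_text, key="MACHINE_LEARNING_WORKERS"):
    """True only if key is set to a positive integer."""
    try:
        return int(env_value(env_text, key)) > 0
    except (TypeError, ValueError):
        # TypeError: key absent (None); ValueError: empty or non-numeric value.
        return False


if __name__ == "__main__":
    before = "IMMICH_VERSION=release\nTZ=Europe/Amsterdam\n"
    after = before + "MACHINE_LEARNING_WORKERS=1\n"
    print(has_usable_workers(before))
    print(has_usable_workers(after))
```

An explicitly empty assignment such as `MACHINE_LEARNING_WORKERS=` is treated the same as a missing key here, since either way no integer reaches the `-w` option.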

RetroZelda commented 3 months ago

Quoting ALERTua: "I fixed it by adding MACHINE_LEARNING_WORKERS=1 in my .env"

This doesn't work for me. I've rolled back to v1.108.0 to work around it. I'm also running immich_machine_learning on a different machine than my Immich instance, so maybe there are other env vars that are now needed when it's standalone.

bo0tzz commented 3 months ago

Will be fixed by #11192