immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

[BUG] Container immich-machine-learning keeps crashing #2858

Closed jpoggi closed 1 year ago

jpoggi commented 1 year ago

Hi,

For the last few updates, one container in the stack keeps crashing in a loop with this log output:

immich-machine-learning  | INFO:     Started server process [1]
immich-machine-learning  | INFO:     Waiting for application startup.
immich-machine-learning  | Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.
immich-machine-learning exited with code 0

Can you point me in the right direction on how to debug this?

Thanks.

The OS that Immich Server is running on

Ubuntu 22.04

Version of Immich Server

v1.62.0

Version of Immich Mobile App

v1.62.0

Platform with the issue

Your docker-compose.yml content

services:
  redis.server.immich:
    container_name: redis.server.immich
    image: redis:7
    restart: unless-stopped
    mem_limit: 512m
    volumes:
      - /srv/immich/redis_01:/data
    healthcheck:
      test: ["CMD", "redis-cli", "--raw", "incr", "ping"]
      interval: 60s
      retries: 5
      start_period: 10s
      timeout: 30s

  postgres.server.immich:
    container_name: postgres.server.immich
    image: postgres:15
    restart: unless-stopped
    mem_limit: 512m
    environment:
      POSTGRES_DB: immich
      POSTGRES_USER: immich
      POSTGRES_PASSWORD: $DB_PASSWORD
      PG_DATA: /var/lib/postgresql/data
    volumes:
      - /srv/immich/db_01:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-q", "-d", "immich", "-U", "immich"]
      interval: 60s
      retries: 5
      start_period: 10s
      timeout: 30s

  immich-server:
    container_name: immich-server
    image: ghcr.io/immich-app/immich-server:v1.62.0
    entrypoint: ["/bin/sh", "./start-server.sh"]
    restart: unless-stopped
    mem_limit: 512m
    volumes:
      - /srv/immich/vol_01:/usr/src/app/upload
    environment:
      TYPESENSE_API_KEY: $TYPESENSE_API_KEY
      DB_HOSTNAME: postgres.server.immich
      DB_DATABASE_NAME: immich
      DB_USERNAME: immich
      DB_PASSWORD: $DB_PASSWORD
      REDIS_HOSTNAME: redis.server.immich
    depends_on:
      - redis.server.immich
      - postgres.server.immich

  immich-microservices:
    container_name: immich-microservices
    image: ghcr.io/immich-app/immich-server:v1.62.0
    restart: unless-stopped
    mem_limit: 1024m
    entrypoint: ["/bin/sh", "./start-microservices.sh"]
    volumes:
      - /srv/immich/vol_01:/usr/src/app/upload
    environment:
      TYPESENSE_API_KEY: $TYPESENSE_API_KEY
      DB_HOSTNAME: postgres.server.immich
      DB_DATABASE_NAME: immich
      DB_USERNAME: immich
      DB_PASSWORD: $DB_PASSWORD
      REDIS_HOSTNAME: redis.server.immich
    depends_on:
      - redis.server.immich
      - postgres.server.immich

  immich-machine-learning:
    container_name: immich-machine-learning
    image: ghcr.io/immich-app/immich-machine-learning:v1.62.0
    restart: unless-stopped
    mem_limit: 512m
    volumes:
      - /srv/immich/vol_01:/usr/src/app/upload
      - /srv/immich/vol_02:/cache

  typesense:
    container_name: typesense
    image: typesense/typesense:0.24.0
    restart: always
    environment:
      TYPESENSE_API_KEY: $TYPESENSE_API_KEY
      TYPESENSE_DATA_DIR: /data
    logging:
      driver: none
    volumes:
      - /srv/immich/vol_03:/data

  immich-web:
    container_name: immich-web
    image: ghcr.io/immich-app/immich-web:v1.62.0
    restart: unless-stopped
    mem_limit: 512m
    environment:
      PUBLIC_LOGIN_PAGE_MESSAGE: Welcome on Immich.
    entrypoint: ["/bin/sh", "./entrypoint.sh"]

  immich-proxy:
    container_name: immich-proxy
    image: ghcr.io/immich-app/immich-proxy:v1.62.0
    restart: unless-stopped
    mem_limit: 512m
    environment:
      IMMICH_WEB_URL: http://immich-web:3000
      IMMICH_SERVER_URL: http://immich-server:3001
      IMMICH_MACHINE_LEARNING_URL: http://immich-machine-learning:3003
    ports:
      - 2283:8080
    logging:
      driver: none
    depends_on:
      - immich-server

Your .env content

NA

Reproduction steps

1. Just reboot the stack.

Additional information

Docker version

docker --version
Docker version 24.0.2, build cb74dfc

Docker compose version:

Docker Compose version v2.18.1

mertalev commented 1 year ago

Does this still happen when mem_limit is removed? In the ML logs, the first three lines are normal and the last one means the process exited without error, so I imagine Docker is shutting the container down because it's using more memory than allowed.

alextran1502 commented 1 year ago

I believe machine-learning needs at least 2~3GB of RAM to load all the models.
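
As a rough sketch of what that implies for the compose file posted above, the machine-learning service could be given a higher limit. The 3g value is only an assumption taken from the 2~3GB estimate, not a tested recommendation; deleting the mem_limit line removes the limit entirely.

```yaml
services:
  immich-machine-learning:
    container_name: immich-machine-learning
    image: ghcr.io/immich-app/immich-machine-learning:v1.62.0
    restart: unless-stopped
    # Assumption: 3g is an illustrative value based on the 2~3GB estimate,
    # replacing the original 512m. Remove this line to run without a limit.
    mem_limit: 3g
    volumes:
      - /srv/immich/vol_01:/usr/src/app/upload
      - /srv/immich/vol_02:/cache
```

The other services in the stack are unchanged in this sketch.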

jpoggi commented 1 year ago

It seems ok now.

The container uses around 2 GB of memory when no limit is set.

2ba47c878de0   immich-machine-learning        0.17%     1.869GiB / 15.31GiB   12.21%    290MB / 719kB     2.6GB / 2.52GB    29

I can't believe it was just a memory issue.

Thanks for the quick and accurate answer.

Regards.