immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

immich v1.120.2 keep restarting and high CPU usage #14123

Closed henryxrl closed 1 week ago

henryxrl commented 1 week ago

The bug

I just updated to v1.120.2, but immich_server keeps restarting and occupies 100% CPU.

The OS that Immich Server is running on

Ubuntu 22

Version of Immich Server

v1.120.2

Version of Immich Mobile App

v1.120.2

Platform with the issue

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.transcoding.yml
    #   service: cpu # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - '2283:2283'
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:2ba50e1ac3a0ea17b736ce9db2b0a9f6f8b85d4c27d5f5accc6a416d8f42c6d5
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1
      interval: 5m
      start_interval: 30s
      start_period: 5m
    command:
      [
        'postgres',
        '-c',
        'shared_preload_libraries=vectors.so',
        '-c',
        'search_path="$$user", public, vectors',
        '-c',
        'logging_collector=on',
        '-c',
        'max_wal_size=2GB',
        '-c',
        'shared_buffers=512MB',
        '-c',
        'wal_compression=on',
      ]
    restart: always

volumes:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=./library
# The location where your database files are stored
DB_DATA_LOCATION=./postgres

# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
TZ=America/New_York

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
# Please use only the characters `A-Za-z0-9`, without special characters or spaces
DB_PASSWORD=xxxxx

# The values below this line do not need to be changed
###################################################################################
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

Reproduction steps

  1. docker rmi xxxxxxx (I first removed all the immich images)
  2. docker compose up -d --pull always (always pull the latest version)
  3. After a while, CPU reaches 100% usage. ...
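
One way to confirm the restart loop from the steps above is to count how often the server prints its startup banner. This is a minimal sketch and not part of the original report: the helper name count_server_starts is hypothetical, and it assumes each server start prints exactly one "Initializing Immich" line, as the logs in this report suggest.

```shell
#!/bin/sh
# Hypothetical helper: count server startup banners in log text read from stdin.
# Assumes one "Initializing Immich" line per (re)start of immich_server.
count_server_starts() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # there are none, so "|| true" keeps the function's exit status clean.
  grep -c 'Initializing Immich' || true
}

# Usage (assumes the compose project from this report is running):
#   docker compose logs immich-server | count_server_starts
# Docker also tracks restarts itself:
#   docker inspect --format '{{.RestartCount}}' immich_server
```

A count that keeps growing while the container is supposedly "up" points to a crash loop driven by the restart: always policy rather than a single slow start.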

Relevant log output

immich_postgres  |
immich_postgres  | PostgreSQL Database directory appears to contain a database; Skipping initialization
immich_postgres  |
immich_postgres  | 2024-11-13 17:01:48.852 UTC [1] LOG:  redirecting log output to logging collector process
immich_postgres  | 2024-11-13 17:01:48.852 UTC [1] HINT:  Future log output will appear in directory "log".
immich_machine_learning  | [11/13/24 12:01:50] INFO     Starting gunicorn 23.0.0
immich_machine_learning  | [11/13/24 12:01:50] INFO     Listening at: http://[::]:3003 (9)
immich_machine_learning  | [11/13/24 12:01:50] INFO     Using worker: app.config.CustomUvicornWorker
immich_redis             | 1:C 13 Nov 2024 17:01:48.615 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
immich_redis             | 1:C 13 Nov 2024 17:01:48.615 # Redis version=6.2.16, bits=64, commit=00000000, modified=0, pid=1, just started
immich_redis             | 1:C 13 Nov 2024 17:01:48.615 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
immich_redis             | 1:M 13 Nov 2024 17:01:48.615 * monotonic clock: POSIX clock_gettime
immich_redis             | 1:M 13 Nov 2024 17:01:48.617 * Running mode=standalone, port=6379.
immich_redis             | 1:M 13 Nov 2024 17:01:48.617 # Server initialized
immich_machine_learning  | [11/13/24 12:01:50] INFO     Booting worker with pid: 10
immich_machine_learning  | [11/13/24 12:02:02] INFO     Started server process [10]
immich_machine_learning  | [11/13/24 12:02:02] INFO     Waiting for application startup.
immich_machine_learning  | [11/13/24 12:02:02] INFO     Created in-memory cache with unloading after 300s
immich_machine_learning  |                              of inactivity.
immich_machine_learning  | [11/13/24 12:02:02] INFO     Initialized request thread pool with 8 threads.
immich_machine_learning  | [11/13/24 12:02:02] INFO     Application startup complete.
immich_machine_learning  | [11/13/24 12:03:20] ERROR    Worker (pid:10) was sent SIGKILL! Perhaps out of
immich_machine_learning  |                              memory?
immich_machine_learning  | [11/13/24 12:03:20] INFO     Booting worker with pid: 28
immich_redis             | 1:M 13 Nov 2024 17:01:48.618 * Ready to accept connections
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 8
immich_server            | Starting api worker
immich_server            | Starting microservices worker
immich_server            | (node:8) [DEP0060] DeprecationWarning: The `util._extend` API is deprecated. Please use Object.assign() instead.
immich_server            | (Use `node --trace-deprecation ...` to show where the warning was created)
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 8
➜  docker-compose docker compose logs
immich_machine_learning  | [11/13/24 12:01:50] INFO     Starting gunicorn 23.0.0
immich_redis             | 1:C 13 Nov 2024 17:01:48.615 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
immich_redis             | 1:C 13 Nov 2024 17:01:48.615 # Redis version=6.2.16, bits=64, commit=00000000, modified=0, pid=1, just started
immich_redis             | 1:C 13 Nov 2024 17:01:48.615 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
immich_redis             | 1:M 13 Nov 2024 17:01:48.615 * monotonic clock: POSIX clock_gettime
immich_redis             | 1:M 13 Nov 2024 17:01:48.617 * Running mode=standalone, port=6379.
immich_redis             | 1:M 13 Nov 2024 17:01:48.617 # Server initialized
immich_redis             | 1:M 13 Nov 2024 17:01:48.618 * Ready to accept connections
immich_machine_learning  | [11/13/24 12:01:50] INFO     Listening at: http://[::]:3003 (9)
immich_machine_learning  | [11/13/24 12:01:50] INFO     Using worker: app.config.CustomUvicornWorker
immich_machine_learning  | [11/13/24 12:01:50] INFO     Booting worker with pid: 10
immich_machine_learning  | [11/13/24 12:02:02] INFO     Started server process [10]
immich_machine_learning  | [11/13/24 12:02:02] INFO     Waiting for application startup.
immich_machine_learning  | [11/13/24 12:02:02] INFO     Created in-memory cache with unloading after 300s
immich_machine_learning  |                              of inactivity.
immich_machine_learning  | [11/13/24 12:02:02] INFO     Initialized request thread pool with 8 threads.
immich_machine_learning  | [11/13/24 12:02:02] INFO     Application startup complete.
immich_machine_learning  | [11/13/24 12:03:20] ERROR    Worker (pid:10) was sent SIGKILL! Perhaps out of
immich_machine_learning  |                              memory?
immich_machine_learning  | [11/13/24 12:03:20] INFO     Booting worker with pid: 28
immich_machine_learning  | [11/13/24 12:05:05] ERROR    Worker (pid:28) was sent SIGKILL! Perhaps out of
immich_machine_learning  |                              memory?
immich_machine_learning  | [11/13/24 12:05:06] INFO     Booting worker with pid: 44
immich_machine_learning  | [11/13/24 12:05:51] INFO     Started server process [44]
immich_machine_learning  | [11/13/24 12:05:51] INFO     Waiting for application startup.
immich_machine_learning  | [11/13/24 12:05:51] INFO     Created in-memory cache with unloading after 300s
immich_machine_learning  |                              of inactivity.
immich_machine_learning  | [11/13/24 12:05:51] INFO     Initialized request thread pool with 8 threads.
immich_machine_learning  | [11/13/24 12:05:51] INFO     Application startup complete.
immich_postgres          |
immich_postgres          | PostgreSQL Database directory appears to contain a database; Skipping initialization
immich_postgres          |
immich_postgres          | 2024-11-13 17:01:48.852 UTC [1] LOG:  redirecting log output to logging collector process
immich_postgres          | 2024-11-13 17:01:48.852 UTC [1] HINT:  Future log output will appear in directory "log".
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 8
immich_server            | Starting api worker
immich_server            | Starting microservices worker
immich_server            | (node:8) [DEP0060] DeprecationWarning: The `util._extend` API is deprecated. Please use Object.assign() instead.
immich_server            | (Use `node --trace-deprecation ...` to show where the warning was created)
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 8
immich_server            | Starting api worker
immich_server            | Starting microservices worker
immich_server            | (node:7) [DEP0060] DeprecationWarning: The `util._extend` API is deprecated. Please use Object.assign() instead.
immich_server            | (Use `node --trace-deprecation ...` to show where the warning was created)
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 8

Additional information

I see an error in the log, "Worker (pid:10) was sent SIGKILL! Perhaps out of memory?", but the system has 16 GB of RAM and less than 10% of it is in use.

I also made sure that the UPLOAD_LOCATION and DB_DATA_LOCATION directories both have 775 permissions.
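
The "Perhaps out of memory?" text is gunicorn's own guess, so the SIGKILL events are worth counting and cross-checking against what Docker and the kernel recorded. This is a minimal sketch under stated assumptions: the helper name count_sigkills is hypothetical, and the commands in the comments assume the container names from the compose file above.

```shell
#!/bin/sh
# Hypothetical helper: count gunicorn worker SIGKILL events in log text
# read from stdin.
count_sigkills() {
  # "|| true" keeps the exit status at 0 when no line matches (grep prints 0).
  grep -c 'was sent SIGKILL' || true
}

# Usage (assumes the compose project from this report is running):
#   docker compose logs immich-machine-learning | count_sigkills
# Cross-checks, which may need elevated privileges:
#   docker inspect --format '{{.State.OOMKilled}}' immich_machine_learning
#   dmesg | grep -i 'out of memory'
```

If the SIGKILL count is non-zero but neither Docker nor the kernel log reports an out-of-memory kill, the worker was probably killed for another reason, which would fit the low memory usage observed here.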

alextran1502 commented 1 week ago

cc @zackpollard

hampta commented 1 week ago

+1 same problem

log

immich_postgres  |
immich_postgres  | PostgreSQL Database directory appears to contain a database; Skipping initialization
immich_postgres  |
immich_postgres  | 2024-11-14 13:15:34.376 UTC [1] LOG:  redirecting log output to logging collector process
immich_postgres  | 2024-11-14 13:15:34.376 UTC [1] HINT:  Future log output will appear in directory "log".
immich_machine_learning  | [11/14/24 13:17:16] INFO     Starting gunicorn 23.0.0
immich_machine_learning  | [11/14/24 13:17:16] INFO     Listening at: http://[::]:3003 (9)
immich_machine_learning  | [11/14/24 13:17:16] INFO     Using worker: app.config.CustomUvicornWorker
immich_machine_learning  | [11/14/24 13:17:16] INFO     Booting worker with pid: 10
immich_machine_learning  | [11/14/24 13:17:39] INFO     Started server process [10]
immich_machine_learning  | [11/14/24 13:17:39] INFO     Waiting for application startup.
immich_machine_learning  | [11/14/24 13:17:39] INFO     Created in-memory cache with unloading after 300s
immich_machine_learning  |                              of inactivity.
immich_machine_learning  | [11/14/24 13:17:39] INFO     Initialized request thread pool with 4 threads.
immich_machine_learning  | [11/14/24 13:17:39] INFO     Application startup complete.
immich_redis             | 1:C 14 Nov 2024 13:15:32.448 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
immich_redis             | 1:C 14 Nov 2024 13:15:32.448 # Redis version=6.2.16, bits=64, commit=00000000, modified=0, pid=1, just started
immich_redis             | 1:C 14 Nov 2024 13:15:32.448 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
immich_redis             | 1:M 14 Nov 2024 13:15:32.451 * monotonic clock: POSIX clock_gettime
immich_redis             | 1:M 14 Nov 2024 13:15:32.453 * Running mode=standalone, port=6379.
immich_redis             | 1:M 14 Nov 2024 13:15:32.453 # Server initialized
immich_redis             | 1:M 14 Nov 2024 13:15:32.453 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
immich_redis             | 1:M 14 Nov 2024 13:15:32.460 * Ready to accept connections
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4
immich_server            | Initializing Immich v1.120.2
immich_server            | Detected CPU Cores: 4

alextran1502 commented 1 week ago

Would it be possible for you to increase the number of CPU cores and test again, to see if it helps?

hampta commented 1 week ago

how

image

alextran1502 commented 1 week ago

@hampta Oh, you are running on an RPi, I assume? A Pi 4? Unfortunately, its resources are not enough to run Immich effectively.