immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0

Immich Unresponsive After Initial Boot #11381

Closed SummitPatel closed 1 month ago

SummitPatel commented 1 month ago

The bug

Hi,

I've been trying to get the Unraid version of Immich up and running for a few days now with no luck. After installing the docker image and running the container, everything seems to boot as expected initially. After a few minutes, however, my CPU cores are maxed out and I am not able to access the webUI.

I've had this issue since v1.108.0, when I initially tried to set this up. After 2 versions with the same behavior, I'm not sure if this is a user issue or something with Immich itself.

The OS that Immich Server is running on

Unraid 6.12.10

Version of Immich Server

v1.110.0

Version of Immich Mobile App

N/A

Platform with the issue

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.transcoding.yml
    #   service: cpu # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:328fe6a5822256d065debb36617a8169dbfbd77b797c525288e465f56c1d392b
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1
      interval: 5m
      # start_interval: 30s
      # start_period: 5m
    command: ["postgres", "-c" ,"shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always

volumes:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/mnt/user/data/media/photos
# The location where your database files are stored
DB_DATA_LOCATION=/mnt/user/appdata/immich/postresql/data

# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
# TZ=Etc/UTC

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=REDACTED

# The values below this line do not need to be changed
###################################################################################
DB_USERNAME=REDACTED
DB_DATABASE_NAME=REDACTED

Reproduction steps

1. I run `Compose Up` through the Unraid GUI
2. Wait for containers to start. Press `Done` once I receive the `Connection Closed` message
3. Try to hit the webUI and am met with a black screen

Relevant log output

immich_postgres

2024-07-26 18:24:00.865 UTC [1] LOG:  redirecting log output to logging collector process
2024-07-26 18:24:00.865 UTC [1] HINT:  Future log output will appear in directory "log".

PostgreSQL Database directory appears to contain a database; Skipping initialization

immich_redis

1:C 26 Jul 2024 18:23:49.558 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 26 Jul 2024 18:23:49.558 # Redis version=6.2.14, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 26 Jul 2024 18:23:49.558 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 26 Jul 2024 18:23:49.559 * monotonic clock: POSIX clock_gettime
1:M 26 Jul 2024 18:23:49.560 * Running mode=standalone, port=6379.
1:M 26 Jul 2024 18:23:49.560 # Server initialized
1:M 26 Jul 2024 18:23:49.560 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 26 Jul 2024 18:23:49.560 * Ready to accept connections

immich_machine_learning

[07/26/24 18:24:04] INFO     Starting gunicorn 22.0.0                           
[07/26/24 18:24:04] INFO     Listening at: http://[::]:3003 (8)                 
[07/26/24 18:24:04] INFO     Using worker: app.config.CustomUvicornWorker       
[07/26/24 18:24:04] INFO     Booting worker with pid: 9

immich_server

Detected CPU Cores: 4

Additional information

Please let me know if there is any other information I can provide to help troubleshoot.

alextran1502 commented 1 month ago

@zackpollard Please correct me if I am wrong; when the server starts up, microservices get the updated information for geocoding data and update the table. Depending on your CPU/Disk setup, the finish time might differ.

zackpollard commented 1 month ago

> @zackpollard Please correct me if I am wrong; when the server starts up, microservices get the updated information for geocoding data and update the table. Depending on your CPU/Disk setup, the finish time might differ.

The server should still boot if I remember correctly, but microservices won't process any exif data until that stuff finishes importing.

alextran1502 commented 1 month ago

Yeah, the server should still boot, unless the CPU of this server is very underpowered so it is using everything to read/write the data to the database and block the requests

SummitPatel commented 1 month ago

Appreciate the quick replies!

> The server should still boot if I remember correctly, but microservices won't process any exif data until that stuff finishes importing.

Is this something that would happen on existing images? Because I should say that this is a brand new server with 0 images inside of it. If not, I'll try and let this run for a few minutes to see if it eventually starts up.

I'm just worried about letting my CPUs run at 100% for an extended period of time 😅

alextran1502 commented 1 month ago

Oh your CPU will max out for a long time for initial ingestion :D

SummitPatel commented 1 month ago

> Yeah, the server should still boot, unless the CPU of this server is very underpowered so it is using everything to read/write the data to the database and block the requests

I'm running an Intel® Celeron® N5105 @ 2.00GHz.

This is all happening on an Asustor Lockerstor 4 Gen2 AS6704T.

I've actually ordered some RAM that I plan to install so maybe that will help with the load, but for right now it's running at the base 4GB of RAM.

SummitPatel commented 1 month ago

> Oh your CPU will max out for a long time for initial ingestion :D

Ok let me let this run for 30 minutes. I'll report back and see what happens

SummitPatel commented 1 month ago

Everything is still running at 100% 😓

Should I expect to see anything new in the logs? Nothing has changed in the machine learning or immich server logs so far

zackpollard commented 1 month ago

I wouldn't expect it to have issues like this on initial boot, not sure what exactly it could be doing to be honest.

zackpollard commented 1 month ago

It looks to me that it never actually created the microservices or server workers for some reason...

SummitPatel commented 1 month ago

Maybe this is helpful, maybe not, but I was also trying to get up and running with the community maintained image last week. The same issue was happening, but I was able to get some more logs to display.

Additionally, I could at least get into the webUI for a few seconds. But then it would hang again and lock up as it's currently doing.

https://github.com/imagegenius/docker-immich/issues/418

SummitPatel commented 1 month ago

Ok I think I can confidently say that this was an issue caused by low RAM. I upgraded from 4GB to 16GB over the weekend and now I am able to get past the initial boot sequence in under 2 minutes and everything remains stable.
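For anyone who lands here with the same symptoms and wants to check for memory pressure before buying RAM, here is a minimal sketch. It assumes a Linux host where `/proc/meminfo` is readable. The function names `parse_meminfo` and `memory_summary` are made up for illustration and are not part of Immich or Unraid. It only reports total memory, available memory, and swap in use. It does not prove that low RAM is the cause, but very little available memory combined with heavy swap use would be consistent with what happened here.

```python
# Illustrative sketch only: summarize memory from /proc/meminfo on a Linux host.
# parse_meminfo and memory_summary are hypothetical helper names.

def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(text: str) -> dict[str, float]:
    """Return total, available, and swap-in-use memory in GiB."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 * 1024
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "total_gib": info.get("MemTotal", 0) / kib_per_gib,
        "available_gib": info.get("MemAvailable", 0) / kib_per_gib,
        "swap_used_gib": swap_used / kib_per_gib,
    }


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            summary = memory_summary(f.read())
    except OSError:
        print("/proc/meminfo is not readable on this system")
    else:
        for key, value in summary.items():
            print(f"{key}: {value:.2f}")
```

Running it once right after `Compose Up` and again a few minutes later, when the CPU is pegged, should show whether available memory collapses as the containers start.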

I see on the Requirements page that 4GB is the minimum, but I would humbly ask to consider bumping that up to 6GB.

If this seems like a good idea, let me know and I can submit a PR to update the docs.

Cheers!