immich-app / immich

High performance self-hosted photo and video management solution.
https://immich.app
GNU Affero General Public License v3.0
47.19k stars, 2.39k forks

Transcoding generates too many files. #13156

Open dotfortun3-code opened 11 hours ago

dotfortun3-code commented 11 hours ago

The bug

I just started using Immich and have an external library with about 75k photos and 5k videos. Transcoding is taking a very long time: I have a Tesla P4 and it has been running for several days. When I checked the jobs, there were over 65k in the queue, and the encoded-video folder holds about 20k files and around 700 GB of data, which seems like too much.

I can't find where to grab the transcoding logs. All I can find are the server logs, which show that it successfully encoded using NVENC. I'm not sure whether there are more informative logs somewhere else.

For comparison, my entire library is around 2 TB.
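The figures above (about 20k files and 700 GB, or roughly 35 MB per file on average) can be re-checked at any time with a small script. This is a minimal sketch, not an Immich tool: `summarize_dir` is a hypothetical helper name, and the path in the usage comment is the host-side `ENCODED_VIDEO_LOCATION` from the .env shown further down.

```python
import os


def summarize_dir(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished mid-scan or is unreadable; skip it.
                continue
            count += 1
    return count, total


# Example usage (the path is an assumption taken from the reporter's .env):
# count, total = summarize_dir("/mnt/local_share/Immich/encoded-video")
# print(f"{count} files, {total / 1e9:.1f} GB, "
#       f"{total / max(count, 1) / 1e6:.1f} MB average")
```

Running it before and after a transcoding pass shows whether the file count tracks the number of videos in the library or grows beyond it.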

The OS that Immich Server is running on

Ubuntu 22.04

Version of Immich Server

1.117.0

Version of Immich Mobile App

1.117.0 build.178

Platform with the issue

Your docker-compose.yml content

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends:
      file: hwaccel.transcoding.yml
      service: nvenc # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      # Do not edit the next line. If you want to change the media storage location on your system, edit the value of UPLOAD_LOCATION in the .env file
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - ${THUMB_LOCATION}:/usr/src/app/upload/thumbs
      - ${ENCODED_VIDEO_LOCATION}:/usr/src/app/upload/encoded-video
      - ${PROFILE_LOCATION}:/usr/src/app/upload/profile
      - ${FAMILY_LIBRARY}:/mnt/media/family
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always
    healthcheck:
      disable: false

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:2d1463258f2764328496376f5d965f20c6a67f66ea2b06dc42af351f75248792
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: docker.io/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      # Do not edit the next line. If you want to change the database storage location on your system, edit the value of DB_DATA_LOCATION in the .env file
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1
      interval: 5m
      #start_interval: 30s
      start_period: 5m
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always

volumes:
  model-cache:

Your .env content

# The location where your uploaded files are stored
UPLOAD_LOCATION=/mnt/family_share/Family/Immich
THUMB_LOCATION=/mnt/local_share/Immich/thumbs
ENCODED_VIDEO_LOCATION=/mnt/local_share/Immich/encoded-video
PROFILE_LOCATION=/mnt/local_share/Immich/profile
FAMILY_LIBRARY=/mnt/family_share/Family/Photos
# The location where your database files are stored
DB_DATA_LOCATION=./db

# To set a timezone, uncomment the next line and change Etc/UTC to a TZ identifier from this list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List
# TZ=Etc/UTC

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
# Please use only the characters `A-Za-z0-9`, without special characters or spaces
DB_PASSWORD=****************

# The values below this line do not need to be changed
###################################################################################
DB_USERNAME=****************
DB_DATABASE_NAME=****************

Reproduction steps

  1. Add external library
  2. Trigger transcoding on entire library

Relevant log output

No response

Additional information

No response

mertalev commented 10 hours ago

I think there's a bug in that the external library scan queues transcoding explicitly, but also queues another job that later queues transcoding as well. It's transcoding each video twice.
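To illustrate the kind of double queueing described above, here is a toy model in Python. It is only a sketch under stated assumptions: `queue_scan_jobs` and the job tuple format are hypothetical and do not reflect Immich's actual job system, and the `dedupe` flag shows just one possible way to avoid the duplicate.

```python
def queue_scan_jobs(video_ids, dedupe=False):
    """Return the transcode jobs queued when a scan reaches each video twice.

    With dedupe=False every video is queued by both paths.
    With dedupe=True a video that was already queued is skipped.
    """
    queued = []
    seen = set()

    def enqueue(asset_id):
        if dedupe:
            if asset_id in seen:
                return
            seen.add(asset_id)
        queued.append(("transcode", asset_id))

    for asset_id in video_ids:
        enqueue(asset_id)  # path 1: the scan queues transcoding directly
        enqueue(asset_id)  # path 2: a follow-up job queues it again later
    return queued


# Example usage with three hypothetical asset IDs:
# len(queue_scan_jobs(["a", "b", "c"]))               -> 6 jobs
# len(queue_scan_jobs(["a", "b", "c"], dedupe=True))  -> 3 jobs
```

In this model the queue length is exactly twice the number of videos unless duplicates are filtered out.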

dotfortun3-code commented 9 hours ago

That seems plausible. I turned off transcoding, let it delete all of the encoded videos, and started it again. This time it started with 24k jobs, which still seems too high, but the first time it was over 60k for some reason. I think it must be live photos, but I will see how much space it takes up when it finishes.
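One way to test the live-photo guess is to count the video files actually present in the external library and compare that with the job count. The following is a rough sketch: the helper names and the extension list are assumptions, and it relies on live photos being stored with a short companion video file (commonly a .mov), which such a count would include.

```python
import os
from collections import Counter

# Assumed set of video extensions; adjust to match the library.
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv", ".m4v", ".3gp", ".webm"}


def count_by_extension(root):
    """Count files under root, keyed by lower-cased file extension."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            counts[os.path.splitext(name)[1].lower()] += 1
    return counts


def count_videos(counts, video_exts=VIDEO_EXTS):
    """Sum the counts for extensions treated as video."""
    return sum(n for ext, n in counts.items() if ext in video_exts)


# Example usage (the path is the reporter's FAMILY_LIBRARY from the .env):
# counts = count_by_extension("/mnt/family_share/Family/Photos")
# print(count_videos(counts), "video files;", counts.most_common(10))
```

If the video count comes out close to 24k, the queue size is explained by the library contents; if it is close to 5k, something else is inflating the queue.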