imagegenius / docker-immich

Monolithic (Single) Docker Container for Immich
GNU General Public License v3.0

ERROR: EADDRINUSE: address already in use :::8080 following v1.106.x updates #374

Closed dbatkinsmith closed 4 months ago

dbatkinsmith commented 5 months ago

Since the latest series of updates, I've had the following errors when running Immich via Community Applications (CA) Docker on Unraid.

Error: listen EADDRINUSE: address already in use :::8080
    at Server.setupListenHandle [as _listen2] (node:net:1898:16)
    at listenInCluster (node:net:1946:12)
    at Server.listen (node:net:2044:7)
    at ExpressAdapter.listen (/app/immich/server/node_modules/@nestjs/platform-express/adapters/express-adapter.js:95:32)
    at /app/immich/server/node_modules/@nestjs/core/nest-application.js:180:30
    at new Promise (<anonymous>)
    at NestApplication.listen (/app/immich/server/node_modules/@nestjs/core/nest-application.js:170:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async bootstrap (/app/immich/server/dist/workers/api.js:53:20) {
  code: 'EADDRINUSE',
  errno: -98,
  syscall: 'listen',
  address: '::',
  port: 8080
}
node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^

Error: listen EADDRINUSE: address already in use :::8080
    at Server.setupListenHandle [as _listen2] (node:net:1898:16)
    at listenInCluster (node:net:1946:12)
    at Server.listen (node:net:2044:7)
    at ExpressAdapter.listen (/app/immich/server/node_modules/@nestjs/platform-express/adapters/express-adapter.js:95:32)
    at /app/immich/server/node_modules/@nestjs/core/nest-application.js:180:30
    at new Promise (<anonymous>)
    at NestApplication.listen (/app/immich/server/node_modules/@nestjs/core/nest-application.js:170:16)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async bootstrap (/app/immich/server/dist/workers/api.js:53:20) {
  code: 'EADDRINUSE',
  errno: -98,
  syscall: 'listen',
  address: '::',
  port: 8080
}

Node.js v20.14.0
api worker exited with code 1

It seems to be connected to the microservices change, but I am unable to find a fix.

I've tried a number of things, none of which resolved it.

One curious note is that no matter the port specified (with Immich confirmed as accessible via that port), the port mapping in Unraid's Docker dashboard always shows 8080.

[Screenshot: Unraid Docker dashboard showing the port mapping as 8080 (2024-06-17)]

At some point it appears to have been working, as ~30 images added after the error began have new thumbnails, but nothing new is being generated. I've left it running overnight at times, with nothing but the same error message appearing in the logs the next morning.

Is this likely to be a limitation of the Community Applications plugin? If so, is there a guide on moving from that to a Docker Stack?
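
For reference, a rough sketch of what the same stack looks like as plain docker run commands outside of Community Applications. Names, paths, and passwords here are hypothetical, and the Postgres image must provide the pgvecto.rs extension Immich requires (which the PostgreSQL_immich container in this thread presumably already does):

    # rough equivalent of the CA stack; adjust names, paths, and passwords
    docker network create immich-net

    docker run -d --name redis --network immich-net redis:alpine

    docker run -d --name postgres_immich --network immich-net \
      -e POSTGRES_USER=postgres \
      -e POSTGRES_PASSWORD=changeme \
      -e POSTGRES_DB=immich \
      tensorchord/pgvecto-rs:pg14-v0.2.0

    docker run -d --name immich --network immich-net -p 8080:8080 \
      -e DB_HOSTNAME=postgres_immich \
      -e DB_USERNAME=postgres \
      -e DB_PASSWORD=changeme \
      -e DB_DATABASE_NAME=immich \
      -e REDIS_HOSTNAME=redis \
      -v /mnt/user/appdata/immich:/config \
      -v /mnt/user/photos:/photos \
      ghcr.io/imagegenius/immich:latest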

hydazz commented 5 months ago

Why are you binding the container to eth0? This is causing most of the issues you are seeing. Set it to bridge.
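
On Unraid this means setting the container's Network Type to Bridge; as a plain docker run it would look roughly like the sketch below, with hypothetical host paths:

    # minimal sketch: default bridge network with an explicit port mapping
    docker run -d --name immich \
      --network bridge \
      -p 8080:8080 \
      -v /mnt/user/appdata/immich:/config \
      -v /mnt/user/photos:/photos \
      ghcr.io/imagegenius/immich:latest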

dbatkinsmith commented 5 months ago

It was set that way for remote use (Tailscale over a 5G router with a NAT issue).

I have followed your advice and set it to bridge, but the same issue persists. Any further thoughts?

Error: listen EADDRINUSE: address already in use :::8080
     at Server.setupListenHandle [as _listen2] (node:net:1898:16)
     at listenInCluster (node:net:1946:12)
     at Server.listen (node:net:2044:7)
     at ExpressAdapter.listen (/app/immich/server/node_modules/@nestjs/platform-express/adapters/express-adapter.js:95:32)
     at /app/immich/server/node_modules/@nestjs/core/nest-application.js:180:30
     at new Promise (<anonymous>)
     at NestApplication.listen (/app/immich/server/node_modules/@nestjs/core/nest-application.js:170:16)
     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
     at async bootstrap (/app/immich/server/dist/workers/api.js:53:20) {
   code: 'EADDRINUSE',
   errno: -98,
   syscall: 'listen',
   address: '::',
   port: 8080
 }
 node:internal/process/promises:289
             triggerUncaughtException(err, true /* fromPromise */);
             ^

 Error: listen EADDRINUSE: address already in use :::8080
     at Server.setupListenHandle [as _listen2] (node:net:1898:16)
     at listenInCluster (node:net:1946:12)
     at Server.listen (node:net:2044:7)
     at ExpressAdapter.listen (/app/immich/server/node_modules/@nestjs/platform-express/adapters/express-adapter.js:95:32)
     at /app/immich/server/node_modules/@nestjs/core/nest-application.js:180:30
     at new Promise (<anonymous>)
     at NestApplication.listen (/app/immich/server/node_modules/@nestjs/core/nest-application.js:170:16)
     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
     at async bootstrap (/app/immich/server/dist/workers/api.js:53:20) {
   code: 'EADDRINUSE',
   errno: -98,
   syscall: 'listen',
   address: '::',
   port: 8080
 }

 Node.js v20.14.0
 api worker exited with code 1

hydazz commented 5 months ago

Interesting. Can you provide the output of printenv (remove sensitive creds)? And does netstat -tulnp | grep :8080 show any other services on port 8080 within the container?
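
From the Unraid host, both diagnostics can be run inside the container with docker exec; this assumes the container is named immich, as in the template above:

    # dump the container's environment (redact credentials before posting)
    docker exec immich printenv

    # list listeners inside the container and filter for port 8080
    docker exec immich netstat -tulnp | grep :8080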

You can also try setting IMMICH_PORT to something else; just be aware that the port you set is the port used to access the web UI, and if using bridge you need to update the port mapping to match. Leave the mapping as-is if using eth0.
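
As a sketch, moving the web UI to an arbitrary example port (2283 here) in bridge mode means changing both the variable and the mapping together; paths are hypothetical:

    # IMMICH_PORT and the published port must stay in sync in bridge mode
    docker run -d --name immich \
      -e IMMICH_PORT=2283 \
      -p 2283:2283 \
      -v /mnt/user/appdata/immich:/config \
      -v /mnt/user/photos:/photos \
      ghcr.io/imagegenius/immich:latest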

dbatkinsmith commented 5 months ago

Output from printenv, netstat and docker inspect follow.

I've now changed all other containers to bridge (will sort out Tailscale after this is resolved). The only other Docker containers are from Community Applications on Unraid (PostgreSQL_immich and Redis), and both are using default ports. The same issue persists.

Tried changing ports; it made no difference. Adding the IMMICH_PORT variable made the instance inaccessible; I didn't bother trying to resolve that and just removed the variable.

PRINTENV

DB_PASSWORD=[PASSWORD]
IMMICH_WEB_ROOT=/app/immich/server/www
REDIS_PASSWORD=[PASSWORD]
PUID=99
MACHINE_LEARNING_HOST=0.0.0.0
HOSTNAME=21d9a781a418
S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0
LANGUAGE=en_US.UTF-8
DB_HOSTNAME=[BRIDGED]
MACHINE_LEARNING_GPU_ACCELERATION=
HOST_CONTAINERNAME=immich
UMASK=022
PWD=/
DB_DATABASE_NAME=immich
NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
NODE_ENV=production
TZ=Europe/London
DB_PORT=5433
TRANSFORMERS_CACHE=/config/machine-learning/models
HOME=/root
LANG=en_US.UTF-8
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=00:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.avif=01;35:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:*~=00;90:*.bak=00;90:*.old=00;90:*.orig=00;90:*.part=00;90:*.rej=00;90:*.swp=00;90:*.tmp=00;90:*.dpkg-dist=00;90:*.dpkg-old=00;90:*.ucf-dist=00;90:*.ucf-new=00;90:*.ucf-old=00;90:*.rpmnew=00;90:*.rpmorig=00;90:*.rpmsave=00;90:
PGID=100
VIRTUAL_ENV=/lsiopy
IMMICH_MEDIA_LOCATION=/photos
S6_VERBOSITY=1
S6_STAGE2_HOOK=/docker-mods
IMMICH_PORT=8080
MACHINE_LEARNING_PORT=3003
DB_USERNAME=postgres
MACHINE_LEARNING_WORKERS=1
REDIS_HOSTNAME=192.168.0.69
TERM=xterm
HOST_OS=Unraid
IMMICH_MACHINE_LEARNING_URL=http://127.0.0.1:3003
SHLVL=1
MACHINE_LEARNING_WORKER_TIMEOUT=120
REDIS_PORT=6379
IMMICH_ENV=production
MACHINE_LEARNING_CACHE_FOLDER=/config/machine-learning/models
PATH=/lsiopy/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTHOSTNAME=[SERVER]
_=/usr/bin/printenv

NETSTAT

tcp        0      0 0.0.0.0:8088            0.0.0.0:*               LISTEN      23026/docker-proxy
tcp6       0      0 :::8088                 :::*                    LISTEN      23034/docker-proxy

DOCKER INSPECT

[ { "Id": "sha256:551199a62274b1dae740922a214519dfabb7c8cb6136401d95ffa78f66343228", "RepoTags": [ "ghcr.io/imagegenius/immich:latest" ], "RepoDigests": [ "ghcr.io/imagegenius/immich@sha256:63d924937847e690b8588f2d25ca98c0431504f222ffc7f6d499f69cf2288a57" ], "Parent": "", "Comment": "buildkit.dockerfile.v0", "Created": "2024-06-14T09:14:22.108289257+10:00", "Container": "", "ContainerConfig": { "Hostname": "", "Domainname": "", "User": "", "AttachStdin": false, "AttachStdout": false, "AttachStderr": false, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": null, "Cmd": null, "Image": "", "Volumes": null, "WorkingDir": "", "Entrypoint": null, "OnBuild": null, "Labels": null }, "DockerVersion": "", "Author": "", "Config": { "Hostname": "", "Domainname": "", "User": "", "AttachStdin": false, "AttachStdout": false, "AttachStderr": false, "ExposedPorts": { "8080/tcp": {} }, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": [ "PATH=/lsiopy/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "HOME=/root", "LANGUAGE=en_US.UTF-8", "LANG=en_US.UTF-8", "TERM=xterm", "S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0", "S6_VERBOSITY=1", "S6_STAGE2_HOOK=/docker-mods", "VIRTUAL_ENV=/lsiopy", "IMMICH_ENV=production", "IMMICH_MACHINE_LEARNING_URL=http://127.0.0.1:3003", "IMMICH_MEDIA_LOCATION=/photos", "IMMICH_PORT=8080", "IMMICH_WEB_ROOT=/app/immich/server/www", "MACHINE_LEARNING_CACHE_FOLDER=/config/machine-learning/models", "NVIDIA_DRIVER_CAPABILITIES=compute,video,utility", "TRANSFORMERS_CACHE=/config/machine-learning/models", "NODE_ENV=production" ], "Cmd": null, "Image": "", "Volumes": { "/config": {}, "/import": {} }, "WorkingDir": "/", "Entrypoint": [ "/init" ], "OnBuild": null, "Labels": { "build_version": "ImageGenius Version:- v1.106.4-ig290 Build-date:- 2024-06-14T09:02:07+10:00", "maintainer": "hydazz, martabal", "org.opencontainers.image.authors": "imagegenius.io", "org.opencontainers.image.created": "2024-06-14T09:02:07+10:00", "org.opencontainers.image.description": "Immich is a high performance self-hosted photo and video backup solution.", "org.opencontainers.image.licenses": "GPL-3.0-only", "org.opencontainers.image.ref.name": "93592b2fe8f661220fb63a6daad08401afff477c", "org.opencontainers.image.revision": "93592b2fe8f661220fb63a6daad08401afff477c", "org.opencontainers.image.source": "https://github.com/imagegenius/docker-immich", "org.opencontainers.image.title": "Immich", "org.opencontainers.image.url": "https://github.com/imagegenius/docker-immich/packages", "org.opencontainers.image.vendor": "imagegenius.io", "org.opencontainers.image.version": "v1.106.4-ig290" } }, "Architecture": "amd64", "Os": "linux", "Size": 2841076045, "VirtualSize": 2841076045, "GraphDriver": { "Data": null, "Name": "btrfs" }, "RootFS": { "Type": "layers", "Layers": [ "sha256:9c4a468b5b82aa23e55fd1e60030cc3b02e739e651591e8f389bc732696b1db1", "sha256:c2b4ff0f7a07f4699d17ad14ac97fa1521e15e064680592920a231afd880dbe6", "sha256:01a2f38cb83afa854a59b7318a7d5dd7709c69db6e0a00f6ec5c8d6eea19afad", "sha256:4b27f7d17a016be6900d876a6a8616408cbb96628d0067f173a4a20ae8f3bb8f", "sha256:696b38ec20323304de887b4046f8390e7e67cdbbcb5a0549810f375f3e6fbfc8", "sha256:aa01c8405b6085428ee881fe7516942190933bea5093325f087da93f32b38693", "sha256:6059d93496c84f14ccff238bd2fa4af811d8ac1bda893f59368cee7aa74c7b6f", "sha256:a02e8195f10c396b51c29f154d0e77334fb4294b5480218f0d3f8523326f8c9d", "sha256:0ee1f8ec56bab12ff62a9beff541e65d87274eee1114648d80fcc6c4e7a59c49" ] }, "Metadata": { "LastTagTime": "0001-01-01T00:00:00Z" } } 
]

martabal commented 4 months ago

Have you set a memory usage limit?

dbatkinsmith commented 4 months ago

Not to my knowledge. Is that necessary or recommended?

martabal commented 4 months ago

Absolutely not, but someone already reported that kind of behavior, and in their case the issue was a memory limit.
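
One quick way to rule this out is to ask Docker directly; assuming the container is named immich, a zero here means no memory limit is set:

    # prints the container's memory limit in bytes; 0 means unlimited
    docker inspect --format '{{.HostConfig.Memory}}' immich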

dbatkinsmith commented 4 months ago

I have noticed out-of-memory errors recently; they're pretty infrequent and random. Here's a sample from the last one, earlier this morning.

Jun 19 03:12:36 WhiteWitch kernel: CPU: 3 PID: 8989 Comm: emhttpd Tainted: P           O       6.1.79-Unraid #1
Jun 19 03:12:36 WhiteWitch kernel: Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Z77X-UP4 TH, BIOS F9 10/24/2012
Jun 19 03:12:36 WhiteWitch kernel: Call Trace:
Jun 19 03:12:36 WhiteWitch kernel: <TASK>
Jun 19 03:12:36 WhiteWitch kernel: dump_stack_lvl+0x44/0x5c
Jun 19 03:12:36 WhiteWitch kernel: dump_header+0x4a/0x211
Jun 19 03:12:36 WhiteWitch kernel: oom_kill_process+0x80/0x111
Jun 19 03:12:36 WhiteWitch kernel: out_of_memory+0x3b3/0x3e5

And

Jun 19 03:12:36 WhiteWitch kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/docker/21d9a781a418ac0acd9940020633d9b6c4a4a1a993afe444d8ade5a84595fd53,task=immich,pid=6477,uid=99
Jun 19 03:12:36 WhiteWitch kernel: Out of memory: Killed process 6477 (immich) total-vm:19278796kB, anon-rss:12143268kB, file-rss:0kB, shmem-rss:0kB, UID:99 pgtables:65064kB oom_score_adj:0

Can provide more if it's useful?
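
(A sketch of how such OOM records can be pulled out of the host log after the fact, assuming Unraid's default syslog location:)

    # show recent OOM-killer activity from the host log
    grep -iE 'oom|out of memory' /var/log/syslog | tail -n 50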

martabal commented 4 months ago

Can provide more if it's useful?

I think that's enough. The EADDRINUSE issue is caused by your system running out of memory. My advice would be to lower your job concurrency or disable heavy tasks like machine-learning.
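
To see whether the container is actually approaching the host's limits while jobs run, something like the following can be watched from the host. Job concurrency itself is set in the Immich admin UI; upstream Immich also documents an IMMICH_MACHINE_LEARNING_ENABLED variable for turning machine learning off entirely, though whether this image honors it is an assumption:

    # snapshot of the container's live memory/CPU usage
    docker stats --no-stream immich

    # overall host memory, including buffers/cache
    free -h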

dbatkinsmith commented 4 months ago

Dropping all concurrency to 1 and disabling ML tasks results in the same error, with the same OOM issue.

Jun 19 21:15:24 WhiteWitch kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/docker/21d9a781a418ac0acd9940020633d9b6c4a4a1a993afe444d8ade5a84595fd53,task=immich,pid=31522,uid=99
Jun 19 21:15:24 WhiteWitch kernel: Out of memory: Killed process 31522 (immich) total-vm:8535200kB, anon-rss:6196128kB, file-rss:0kB, shmem-rss:0kB, UID:99 pgtables:51588kB oom_score_adj:0

martabal commented 4 months ago

How much RAM does your system have?

dbatkinsmith commented 4 months ago

16 GB, and it has been running for more than a year without issue.

[BTW, I was unable to reopen, so I've linked this issue to a new one; please close if you are happy to continue here.]

martabal commented 4 months ago

And how much RAM is available when this issue occurs?

martabal commented 4 months ago

We moved from having two processes, one for the API and one for the microservice, to a single process where the microservice runs as a node worker.

My theory is that before this change, if the microservice crashed, it was restarted by s6. Now if the microservice crashes, it's detected by s6 and restarted, but somehow the Node.js process with the API is not stopped, so the restarted service finds port 8080 still bound by the old API process and fails with EADDRINUSE.
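
If that theory holds, it should be visible as a stale node process still bound to 8080 after a crash; a sketch of how to check, assuming netstat and ps are available inside the container (netstat was used earlier in this thread):

    # after an EADDRINUSE burst, look for a leftover node process on the port
    docker exec immich netstat -tulnp | grep :8080
    docker exec immich ps -ef | grep '[n]ode'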

dbatkinsmith commented 4 months ago

And how much RAM is available when this issue occurs?

Usage is typically 20-30%, but I'm not monitoring constantly. I can see a new OOM error, but the dashboard doesn't show a rise above ~32%.

[Screenshot: Unraid dashboard showing RAM usage around 32% (2024-06-19)]

The processor has been pegged at times but I haven't seen RAM usage rise.

[Screenshot: Unraid dashboard showing CPU usage pegged (2024-06-19)]

hydazz commented 4 months ago

Can you try ghcr.io/imagegenius/igpipepr-immich:1.106.4-pr-379 and see if it fixes the issue?

dbatkinsmith commented 4 months ago

Can you try ghcr.io/imagegenius/igpipepr-immich:1.106.4-pr-379 and see if it fixes the issue?

This is working well, with concurrency back to defaults and no OOM errors that I can observe.

dbatkinsmith commented 4 months ago

After leaving it running for almost 8 hours, I'm getting more OOM errors, but the system is working as expected with default concurrency settings.

Upmacc commented 4 months ago

Hey all, I came across this thread after hitting a similar issue on Unraid 6.12.10, with the same error as dbatkinsmith: 'ERROR: EADDRINUSE: address already in use :::8080'.

Bit of context around my scenario: Immich is bound to bridge, no OOM issues, no limits set. Tried adding IMMICH_PORT, but that makes the web UI inaccessible. Changed to ghcr.io/imagegenius/igpipepr-immich:1.106.4-pr-379 per hydazz; without IMMICH_PORT it's the same error, and with it there is the different error below (the UI isn't available either way):

Node.js v20.14.0
/app/immich/server/node_modules/reflect-metadata/Reflect.js:984
                    providerMap.set(P, provider);
                                ^

TypeError: Method Map.prototype.set called on incompatible receiver #<Map>
    at Map.set (<anonymous>)
    at Object.setProvider (/app/immich/server/node_modules/reflect-metadata/Reflect.js:984:33)
    at GetMetadataProvider (/app/immich/server/node_modules/reflect-metadata/Reflect.js:1163:38)
    at OrdinaryDefineOwnMetadata (/app/immich/server/node_modules/reflect-metadata/Reflect.js:611:28)
    at decorator (/app/immich/server/node_modules/reflect-metadata/Reflect.js:195:17)
    at DecorateConstructor (/app/immich/server/node_modules/reflect-metadata/Reflect.js:549:33)
    at Reflect.decorate (/app/immich/server/node_modules/reflect-metadata/Reflect.js:143:24)
    at Object.__decorate (/app/immich/server/node_modules/tslib/tslib.js:105:96)
    at Object.<anonymous> (/app/immich/server/node_modules/@nestjs/core/discovery/discovery-service.js:95:55)
    at Module._compile (node:internal/modules/cjs/loader:1358:14)

After reverting to the standard image repo ghcr.io/imagegenius/immich, no errors while I was logged in to Immich. Only this while logged in:

2024-06-29 01:09:54.259524361 [E:onnxruntime:Default, env.cc:228 ThreadMain] pthread_setaffinity_np failed for thread: 1190, index: 0, mask: {1, 7, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.

Then I noticed this after being idle, during the 'Shutting down due to inactivity' sequence:

[06/29/24 01:15:28] INFO Shutting down due to inactivity.  
[06/29/24 01:15:28] INFO Shutting down  
[06/29/24 01:15:28] INFO Error while closing socket [Errno 9] Bad file descriptor  
[06/29/24 01:15:28] INFO Waiting for application shutdown  
[06/29/24 01:15:28] INFO Application shutdown complete  
[06/29/24 01:15:28] INFO Finished server process [1165]  
[06/29/24 01:15:28] ERROR Worker (pid:1165) was sent SIGINT!  
[06/29/24 01:15:28] INFO Booting worker with pid: 1362  
[06/29/24 01:15:31] INFO Started server process [1362]  
[06/29/24 01:15:31] INFO Waiting for application startup.

Hope some/any of this may be helpful, cheers.