Closed thariq-shanavas closed 1 year ago
Same here, got this message at least twice at initial startup, but seems to have resolved itself after some retries. Face recognition also seems to be working, the other options are disabled just like for OP.
I noticed it a couple hours after the update, and it had not resolved itself. It probably restarted hundreds of times in that time frame. A reboot did not fix it either.
I had the same issue (also noticed after hours), but in my case running `docker compose down --remove-orphans` followed by `docker compose up -d` solved it for me.
Cc @mertalev
Looks like gunicorn gives workers 30s to start and terminates them if they don't start within this time. It might take longer than this for a worker to start on very slow CPUs. Setting `--timeout` to a higher number should fix it, maybe 120?
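For anyone who prefers a config file over a command-line flag, here is a minimal sketch of the same change. `timeout` is a real gunicorn setting, and gunicorn config files are plain Python. The file name and the value 120 are assumptions taken from the suggestion above, and this thread does not show how the Immich image actually invokes gunicorn.

```python
# gunicorn.conf.py (hypothetical file name; load with: gunicorn -c gunicorn.conf.py <app>)

# Seconds a worker may stay silent before the master process kills and
# restarts it. gunicorn's default is 30; 120 is the value floated above,
# not a tested recommendation.
timeout = 120
```

Passing `--timeout 120` on the gunicorn command line should have the same effect, since command-line options override the config file.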
Not sure if this is the same issue, but just did a clean install (v1.77.0) on a clean docker container with the stack file from the site. The machine-learning container won't finish the download and is stuck in a loop downloading over and over again.
There's plenty of CPU and mem for the container, but it's cutting off the download after 29 seconds.
```
/usr/local/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
[09/08/23 15:15:14] ERROR Worker (pid:121) was sent code 134!
[09/08/23 15:15:14] INFO Booting worker with pid: 135
[09/08/23 15:15:21] INFO Created in-memory cache with unloading disabled.
[09/08/23 15:15:21] INFO Initialized request thread pool with 12 threads.
[09/08/23 15:15:21] INFO Downloading facial-recognition model 'buffalo_l'.This may take a while.
[09/08/23 15:15:21] WARNING Failed to load facial-recognition model 'buffalo_l'.Clearing cache and retrying.
[09/08/23 15:15:21] INFO Cleared cache directory for model 'buffalo_l'.
[09/08/23 15:15:21] INFO Downloading facial-recognition model 'buffalo_l'.This may take a while.
Downloading /cache/facial-recognition/buffalo_l/buffalo_l.zip from https://github.com/deepinsight/insightface/releases/download/v0.7/buffalo_l.zip...
18%|█▊ | 50581/281857 [00:05<00:26, 8850.62KB/s]
```
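A rough back-of-the-envelope check, using only the numbers visible in the log above, suggests why the download cannot finish if the worker is killed after gunicorn's default 30 seconds. The 7-second startup figure is read off the timestamps (15:15:14 to 15:15:21). This assumes the download rate stays constant and that the worker timeout is what interrupts it.

```python
# Figures read off the progress bar and timestamps in the log above.
total_kb = 281857          # size of buffalo_l.zip as reported by the progress bar
rate_kb_per_s = 8850.62    # observed download rate
startup_s = 7              # 15:15:14 (worker booted) to 15:15:21 (download started)
worker_timeout_s = 30      # gunicorn's default worker timeout

download_s = total_kb / rate_kb_per_s   # time for the download alone
needed_s = startup_s + download_s       # time from worker boot to finished download

print(f"download alone: {download_s:.1f}s, boot + download: {needed_s:.1f}s")
print("fits in timeout:", needed_s <= worker_timeout_s)
```

At this rate the download alone needs roughly 32 seconds, so on a fresh cache the worker would be restarted before the model is ever saved, which would explain the loop.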
OK, my bad. I thought this fix was already published, but it wasn't.
I edited the start.sh file to add the timeout and it started working.
The bug
I upgraded Immich and the machine learning container fails to start. Output of `sudo docker logs -f immich_machine_learning`:
[09/06/23 10:35:59] INFO Booting worker with pid: 4585
[09/06/23 10:36:29] CRITICAL WORKER TIMEOUT (pid:4585)
[09/06/23 10:36:31] ERROR Worker (pid:4585) was sent SIGKILL! Perhaps out of memory?
The container tries to restart, then fails with the same timeout error. I suspect a bug from https://github.com/immich-app/immich/pull/3934
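To check whether a kill lines up with gunicorn's default 30-second worker timeout, the gap between the two log lines can be measured. The sketch below is only an illustration: the helper names are made up, and the bracketed timestamp is assumed to be month/day/year followed by a 24-hour time.

```python
from datetime import datetime

# The two relevant lines from the log above.
LOG = """\
[09/06/23 10:35:59] INFO Booting worker with pid: 4585
[09/06/23 10:36:29] CRITICAL WORKER TIMEOUT (pid:4585)
"""


def timestamp(line: str) -> datetime:
    # Assumed format: "[MM/DD/YY HH:MM:SS]" at the start of the line.
    return datetime.strptime(line[1:18], "%m/%d/%y %H:%M:%S")


def seconds_until_timeout(log: str) -> float:
    """Return the seconds between the first worker boot and the timeout that follows it."""
    booted = killed = None
    for line in log.splitlines():
        if "Booting worker" in line and booted is None:
            booted = timestamp(line)
        elif "WORKER TIMEOUT" in line and booted is not None:
            killed = timestamp(line)
            break
    if booted is None or killed is None:
        raise ValueError("log does not contain a boot/timeout pair")
    return (killed - booted).total_seconds()


print(seconds_until_timeout(LOG))
```

A gap of exactly 30 seconds points at the worker timeout rather than an out-of-memory kill, despite the "Perhaps out of memory?" hint in the SIGKILL message.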
I'm running on a system with 2 GB RAM (with 1 GB ZRAM and 1 GB swap), so I've enabled only face recognition among the machine learning features. The processor is an Intel Atom Z8350. It works great in v1.76.1.
In my .env file, I have pinned the version to v1.76.1 until this is resolved. Thank you all so much for this amazing software! I'll be happy to post any other logs as needed.
The OS that Immich Server is running on
Debian 12
Version of Immich Server
v1.77.0
Version of Immich Mobile App
NA
Platform with the issue
Your docker-compose.yml content
Your .env content
Reproduction steps
Additional information
No response