Closed: MegaShinySnivy closed this issue 3 months ago.
I doubt this is it, but just to rule it out, can you try with Redis 6.2 like we use in the default setup? I'd also be interested in the Redis logs.
Here are the logs pre-revert to 6.2...

```
1:C 08 Jun 2024 07:08:16.359 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 08 Jun 2024 07:08:16.359 # Redis version=7.0.11, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 08 Jun 2024 07:08:16.359 # Configuration loaded
1:M 08 Jun 2024 07:08:16.359 * monotonic clock: POSIX clock_gettime
1:M 08 Jun 2024 07:08:16.360 * Running mode=standalone, port=6379.
1:M 08 Jun 2024 07:08:16.360 # Server initialized
1:M 08 Jun 2024 07:08:16.370 * Reading RDB base file on AOF loading...
1:M 08 Jun 2024 07:08:16.370 * Loading RDB produced by version 7.0.11
1:M 08 Jun 2024 07:08:16.370 * RDB age 379632 seconds
1:M 08 Jun 2024 07:08:16.370 * RDB memory usage when created 7.94 Mb
1:M 08 Jun 2024 07:08:16.370 * RDB is base AOF
1:M 08 Jun 2024 07:08:16.439 * Done loading RDB, keys loaded: 69, keys expired: 0.
1:M 08 Jun 2024 07:08:16.439 * DB loaded from base file appendonly.aof.32.base.rdb: 0.075 seconds
1:M 08 Jun 2024 07:08:17.439 * DB loaded from incr file appendonly.aof.32.incr.aof: 1.000 seconds
1:M 08 Jun 2024 07:08:17.439 * DB loaded from append only file: 1.075 seconds
1:M 08 Jun 2024 07:08:17.439 * Opening AOF incr file appendonly.aof.32.incr.aof on server start
1:M 08 Jun 2024 07:08:17.439 * Ready to accept connections
```
And post-downgrade:

```
1:C 11 Jun 2024 20:52:40.055 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 11 Jun 2024 20:52:40.055 # Redis version=6.2.14, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 11 Jun 2024 20:52:40.055 # Configuration loaded
1:M 11 Jun 2024 20:52:40.056 * monotonic clock: POSIX clock_gettime
1:M 11 Jun 2024 20:52:40.057 * Running mode=standalone, port=6379.
1:M 11 Jun 2024 20:52:40.058 # Server initialized
1:M 11 Jun 2024 20:52:40.058 * Ready to accept connections
```
Tested; sadly, no change.
I'm experiencing the same issue. Updated the stack with an image pull for the new version. Whenever I go to queue a job on the server, the 'Waiting' value increments but no action follows. Additionally, the admin panel freezes at times when attempting to queue a job or change a setting, requiring a full node restart or re-compose.
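To see whether queued jobs are actually reaching Redis, you can inspect the BullMQ keys inside the redis container (a diagnostic sketch; the container name matches the compose below, and the exact queue names vary by Immich version, so the second command uses a hypothetical queue name):

```shell
# List BullMQ queue keys inside the redis container (names are version-dependent).
docker exec immich_redis redis-cli --scan --pattern 'bull:*'

# Length of one queue's wait list, e.g. for thumbnail generation (hypothetical queue name);
# a growing number here with no workers draining it matches the 'Waiting' symptom.
docker exec immich_redis redis-cli LLEN 'bull:thumbnailGeneration:wait'
```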
Actions completed in an attempt to correct the issue, to no avail:

- Recreating the `immich_redis` container
- Recreating the `immich_microservices` container instead of relying on the new ad-hoc spawn

Logs from the `immich_redis` container:

```
1:C 11 Jun 2024 21:28:11.223 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 11 Jun 2024 21:28:11.223 # Redis version=6.2.14, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 11 Jun 2024 21:28:11.223 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 11 Jun 2024 21:28:11.223 * monotonic clock: POSIX clock_gettime
1:M 11 Jun 2024 21:28:11.223 * Running mode=standalone, port=6379.
1:M 11 Jun 2024 21:28:11.223 # Server initialized
1:M 11 Jun 2024 21:28:11.223 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 11 Jun 2024 21:28:11.223 * Ready to accept connections
```
Notably different from @MegaShinySnivy's logs, I appear not to have a config file set and don't know where to locate one. I assume the memory overcommit warning relates to the lack of this config file, as that attribute would likely be set by the conf.
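For what it's worth, the overcommit warning describes a kernel setting on the Docker host rather than anything from redis.conf; the log line above gives the fix itself. A sketch of those commands, run on the host:

```shell
# Check the current setting on the Docker host (0 = heuristic overcommit, 1 = always allow).
cat /proc/sys/vm/overcommit_memory

# Enable overcommit immediately, as the Redis warning suggests...
sudo sysctl vm.overcommit_memory=1

# ...and persist it across reboots.
echo 'vm.overcommit_memory = 1' | sudo tee -a /etc/sysctl.conf
```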
@applealias03, are you running this on K8s or docker compose?
> @applealias03, are you running this on K8s or docker compose?
Compose. Here is the file I've been using as of this most recent update:

```yaml
name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ['start.sh', 'immich']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - stack.env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    environment:
      IMMICH_WORKERS_INCLUDE: 'api'
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ['start.sh', 'immich']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - stack.env
    depends_on:
      - redis
      - database
    environment:
      IMMICH_WORKERS_EXCLUDE: 'api'
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - stack.env
    restart: always

  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:84882e87b54734154586e5f8abd4dce69fe7311315e2fc6d67c29614c8de2672
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    restart: always
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]

volumes:
  model-cache:
```
SWAG, but it could be that the immich helm chart sets a configuration that isn't included with the compose.
Noticed a difference between the SHA digests for the redis image in the newest docker-compose.yml and my compose, as well as the new health check. I thought the difference in SHA might relate to a minor update revision, but making this replacement has not affected the output.
I just upgraded from v1.105.1 and am now having the same issue as @applealias03. Nothing unusual at all in the logs, and Redis is ready to accept connections; the job waiting count just increments with no work being performed.
> I just upgraded from v1.105.1 and am now having the same issue as @applealias03. Nothing unusual at all in the logs, and Redis is ready to accept connections; the job waiting count just increments with no work being performed.
Solved: I forgot to remove `command: ['start.sh', 'immich']` from the immich-server service in the compose file. Removing that and recomposing resolved the issue.
I believe I have managed to entirely fix this issue. I now have an active Smart Search job running and all outputs seem to be correct.
Here are the steps I went through in order to correct this:

1. Remove the `immich_microservices` and `immich_redis` containers if they are still present.
2. Replace the compose file with the latest release version:
```yaml
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cpu # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:d6c2911ac51b289db208767581a5d154544f2b2fe4914ea5056443f62dc6e900
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    command: ["postgres", "-c", "shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]
    restart: always

volumes:
  model-cache:
```
The most notable changes in this compose file are:
1. Entire removal of the `immich_microservices` section
2. Removal of the start command for the server environment
3. New redis-alpine image `@sha256:d6c2911ac51b289db208767581a5d154544f2b2fe4914ea5056443f62dc6e900`
4. Addition of the healthcheck for `immich_redis`
After this and `docker compose up` everything seems to now be running smoothly.
> I just upgraded from v1.105.1 and am now having the same issue as @applealias03. Nothing unusual at all in the logs, and Redis is ready to accept connections; the job waiting count just increments with no work being performed.
>
> Solved: I forgot to remove `command: ['start.sh', 'immich']` from the immich-server service in the compose file. Removing that and recomposing resolved the issue.

Didn't see you basically fixed it the same way! Glad this worked out for you too.
Ahh, the container runs start.sh by default. What is it recommended to be set to now?
> Ahh, the container runs start.sh by default. What is it recommended to be set to now?
The changelog recommends omitting the line entirely. See: https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
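For compose users, the relevant change is just dropping the `command:` key so the image's default entrypoint runs; a minimal sketch of the service, based on the release compose linked above:

```yaml
services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # no `command:` key - the image's default entrypoint now starts all workers,
    # so a leftover ['start.sh', 'immich'] override reintroduces this bug
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
    ports:
      - 2283:3001
    restart: always
```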
Strange. I just checked over my pod yaml. There's no command key. At all.
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/restartedAt: "2024-05-09T23:31:09-05:00"
  creationTimestamp: "2024-06-11T20:11:25Z"
  generateName: immich-server-6f946b75d6-
  labels:
    app.kubernetes.io/instance: immich
    app.kubernetes.io/name: server
    pod-template-hash: 6f946b75d6
  name: immich-server-6f946b75d6-5q79z
  namespace: immich
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: immich-server-6f946b75d6
    uid: 2181f8e7-10b8-4250-a2ab-467b3d0420f0
  resourceVersion: "181100313"
  uid: 882598a0-88df-4446-9d4d-923e7aba3842
spec:
  automountServiceAccountToken: true
  containers:
  - env:
    - name: DB_DATABASE_NAME
      value: immich
    - name: DB_HOSTNAME
      valueFrom:
        secretKeyRef:
          key: host
          name: postgres-immich-app
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          key: password
          name: postgres-immich-app
    - name: DB_URL
      valueFrom:
        secretKeyRef:
          key: uri
          name: postgres-immich-app
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          key: user
          name: postgres-immich-app
    - name: IMMICH_LOG_LEVEL
      value: verbose
    - name: IMMICH_MACHINE_LEARNING_URL
      value: http://immich-machine-learning:3003
    - name: IMMICH_METRICS
      value: "true"
    - name: REDIS_HOSTNAME
      value: immich-redis-master
    image: ghcr.io/immich-app/immich-server:v1.106.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /api/server-info/ping
        port: http
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: immich-server
    ports:
    - containerPort: 3001
      name: http
      protocol: TCP
    - containerPort: 8081
      name: metrics-api
      protocol: TCP
    - containerPort: 8082
      name: metrics-ms
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /api/server-info/ping
        port: http
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      requests:
        cpu: 30m
        memory: 600Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /usr/src/app/upload
      name: library
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-wx99g
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: k8s-worker-3
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: library
    persistentVolumeClaim:
      claimName: immich-nfs
  - name: kube-api-access-wx99g
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
```
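Rather than scanning the whole manifest, the container's command override (or its absence) can be queried directly; a sketch using the pod name from the manifest above:

```shell
# Prints nothing if no command override is set on the container,
# i.e. the image's default entrypoint is in use.
kubectl -n immich get pod immich-server-6f946b75d6-5q79z \
  -o jsonpath='{.spec.containers[0].command}'
```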
Update: Hashed some things out over Discord; it was a combination of a custom set of error pages and my WAF interfering.
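For anyone hitting similar 405s behind a proxy or WAF, comparing a direct request with a proxied one can isolate the culprit. A sketch: `immich.mydomain.com` is the placeholder hostname from the report, `<server-ip>` is your server's address, and the ping path is the one used by the probes in the manifest above.

```shell
# Direct to the server, bypassing the proxy/WAF; a 200 here means the server is fine.
curl -i http://<server-ip>:2283/api/server-info/ping

# Through the proxy; a 405 or a block page here points at the proxy/WAF layer.
curl -i https://immich.mydomain.com/api/server-info/ping
```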
### The bug

Attempting to run any jobs via the admin screen or to edit settings causes the server to return a 405. My session also no longer seems to persist properly: if I return to the base URL (immich.mydomain.com instead of immich.mydomain.com/photos) it kicks me back out. In an attempt to debug further, I turned up the logging level; however, this yielded nothing relevant other than ping messages and websocket connects/disconnects.

### The OS that Immich Server is running on

Debian 12

### Version of Immich Server

v1.106.2

### Version of Immich Mobile App

N/A

### Platform with the issue

### Your docker-compose.yml content

### Your .env content

### Reproduction steps

### Relevant log output

### Additional information

As said, this is running on K3s. If you want more information, see https://github.com/MegaShinySnivy/Scaling-Snakes