Open rrrnld opened 2 months ago
@rrrnld honestly, we haven't used the Helm chart with the most recent Saleor versions since we moved to the cloud deployment, but it did work before. I currently don't have the capacity to test this again, though we will most likely try the self-hosted deployment again in the future. We added these liveness checks to make really sure that the workers are alive and the Redis connection is still active, and that used to work fine. What does the worker being stuck look like on your end?
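To clarify what those checks are meant to verify: the worker should still respond over the broker. A check along these lines is what I mean (this is only a sketch, not necessarily the exact command from the chart):

```sh
# Ping only this worker over the broker; assumes Celery's default node
# name celery@<hostname>. Exits non-zero if no reply arrives in time,
# which also catches a dead Redis connection.
celery --app=saleor.celeryconf:app inspect ping -d "celery@$HOSTNAME" --timeout 10
```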
We're self-hosting Saleor and running into issues with our Celery deployment: the worker appears to get stuck after a while. We deploy to k8s and run the Celery workers like this:
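(Roughly the following; the worker flags are from memory, and the `--loglevel` value in particular is an assumption:)

```sh
# Worker start command, roughly as in our deployment (--loglevel=INFO is assumed)
celery -A saleor --app=saleor.celeryconf:app worker --loglevel=INFO
```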
This is taken from the config that was removed here: https://github.com/saleor/saleor/pull/13777
I can see the worker processes are running. It's also the invocation this repo uses to deploy Saleor: https://github.com/trieb-work/helm-charts/blob/fbe6ce6748c449f4a8889fa653063cafad3a4303/charts/saleor/templates/celery_deployment.yaml#L26-L52
Is this the correct way to run the workers? I'm asking because `celery -A saleor --app=saleor.celeryconf:app` is redundant, for example: `-A` is just the short form of `--app`, so the app is specified twice (see the simplified invocation at the end of this post). Also, shelling into the container and trying to inspect the worker via `celery -A saleor --app=saleor.celeryconf:app inspect active` or `celery -A saleor --app=saleor.celeryconf:app status` fails in both cases, and the liveness check here in this repo does not seem to be working at all. Any idea what might be wrong with our health checks / liveness checks?
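As mentioned above, if the duplicated app option is indeed redundant, I would presumably simplify the invocation to something like this (sketch only, keeping the rest of our flags unchanged):

```sh
# Simplified worker invocation without the duplicated app option
# (assuming --app=saleor.celeryconf:app alone is sufficient; --loglevel=INFO is assumed)
celery --app=saleor.celeryconf:app worker --loglevel=INFO
```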