cern-sis / issues-inspire


Investigate how to deal with unresponsive Celery workers #339

Closed MJedr closed 1 year ago

MJedr commented 1 year ago

We often see the next workers become unresponsive (they can't connect to RabbitMQ). We need a way to either recover a worker from this state or crash the container so that k8s restarts it automatically.

MJedr commented 1 year ago

There are multiple ways to handle it; see https://github.com/celery/celery/issues/4079. For now, I added a liveness probe (using celery ping). Let's see if that helps. If not, we can try the other approaches mentioned in that issue.
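
The probe itself was not posted in this thread. As a rough sketch of what a ping-based liveness check could look like, the script below shells out to `celery inspect ping` for the local node and maps the result to an exit code that a k8s exec probe could consume. The app name `proj`, the default `celery@<hostname>` node name, and the 10-second timeout are assumptions for illustration, not the values used in the actual deployment.

```python
import socket
import subprocess


def build_ping_command(app, hostname):
    # Ping only this container's worker node; assumes the default
    # "celery@<hostname>" node name (hypothetical for this deployment).
    return ["celery", "-A", app, "inspect", "ping", "-d", f"celery@{hostname}"]


def worker_is_alive(app, hostname, timeout=10.0, run=subprocess.run):
    # `run` is injectable so the check can be exercised without a broker.
    try:
        result = run(
            build_ping_command(app, hostname),
            capture_output=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung ping or a missing celery binary both count as "not alive".
        return False
    # `celery inspect ping` exits non-zero when no node replies.
    return result.returncode == 0


def probe_exit_code(app="proj", hostname=None):
    # "proj" is a placeholder app name, not the real one.
    hostname = hostname or socket.gethostname()
    return 0 if worker_is_alive(app, hostname) else 1
```

A probe entry point would call `sys.exit(probe_exit_code())`, so a non-zero exit makes k8s restart the container once the probe's failure threshold is reached. Treating a timeout as a failure is deliberate here: a worker that has lost its RabbitMQ connection may hang rather than reply with an error.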