malefice opened this issue 4 years ago
@malefice Hey, I filed an error report about the PID files in celery a while back. Somehow, right after celery v4.4.2, the PID file started being placed incorrectly.
I haven't used celery in a while, but that might be the case. In my opinion, your best bet is to first delete the pid file before running compose up and see if the problem comes back. Otherwise, look in compose/production/django/celery for a start file that should be deleting the pid file... I think, not sure.
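For reference, a minimal sketch of what such a start script looks like (this mirrors the pattern cookiecutter-django uses locally; the app module `config.celery_app` and the pidfile name are template defaults, so treat them as assumptions):

```sh
#!/bin/sh
set -o errexit
set -o nounset

# Drop any pidfile left over from a previous run so that
# celery beat does not refuse to start.
rm -f './celerybeat.pid'

celery -A config.celery_app beat -l info
```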
Found this question which seems to be related?
I'm not sure why we remove it locally but not in production, to be honest. One guess might be that it's because we mount the local directory in /app.

I presume the error happens in production because the stack reuses the same celerybeat container; you'd need to remove the containers between two runs.

Anyway, this answer suggests disabling the pid file by passing an empty value, `--pidfile=`. I have no idea of the implications, but maybe we could bring our dev/prod a bit more in line with that.
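For what it's worth, that would be a one-flag tweak to the beat command (a sketch; `config.celery_app` is the template's default app module):

```sh
# An empty --pidfile= tells celery beat not to write a pidfile at all,
# so a stale file can never block startup.
celery -A config.celery_app beat -l info --pidfile=
```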
From the celery documentation:

> `--pidfile` — File used to store the process pid. Defaults to celerybeat.pid.
> The program won't start if this file already exists and the pid is still alive.
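That "pid is still alive" check also hints at why containers trip over it: in a recycled container the recorded pid can now belong to a different process (often pid 1), so the file looks live even though beat is gone. A hedged shell sketch of what the check amounts to, not celery's actual code:

```sh
# Approximation of the staleness test: does the recorded pid still exist?
if [ -f celerybeat.pid ]; then
  pid="$(cat celerybeat.pid)"
  if kill -0 "$pid" 2>/dev/null; then
    # In a reused container this can be a false positive: some other
    # process may now own this pid.
    echo "pid $pid looks alive; beat will refuse to start"
  else
    echo "pid $pid is gone; the pidfile is stale and safe to delete"
  fi
fi
```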
The only reason this file persists between celerybeat and celeryworker service restarts is the yml templating: the celerybeat and celeryworker services inherit the bind-mounted current directory from django, as seen below: https://github.com/pydanny/cookiecutter-django/blob/9b67d828f68a7d145400f19da119f11bb6830fe3/%7B%7Bcookiecutter.project_slug%7D%7D/local.yml#L19-L20
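For context, a condensed sketch of how that inheritance works (paraphrased from the linked local.yml; the exact mount options may differ):

```yaml
django: &django
  # ...
  volumes:
    - .:/app  # the project directory, celerybeat.pid included, is bind mounted

celerybeat:
  <<: *django  # the merge key copies everything from django, volumes too
```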
The easiest way to solve this is to redefine `volumes` as an empty array, like so:
```yaml
celerybeat:
  <<: *django
  image: {{ cookiecutter.project_slug }}_local_celerybeat
  container_name: celerybeat
  depends_on:
    - redis
    - postgres
    {% if cookiecutter.use_mailhog == 'y' -%}
    - mailhog
    {%- endif %}
  ports: []
  command: /start-celerybeat
  volumes: []
```
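With `volumes: []`, the pidfile beat writes stays inside the container's own filesystem and disappears with it. If you want to sanity-check the override (a sketch, assuming the default local.yml file name):

```sh
# Recreate the service from scratch; beat should start cleanly every time.
docker-compose -f local.yml up -d --force-recreate celerybeat
docker-compose -f local.yml logs celerybeat
```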
**What happened?**

The `celerybeat` service in production can fail to run because of an existing pidfile.

**What should've happened instead?**

The development version does not suffer from this issue, because its start script manually removes the pidfile. I am not sure why the production version doesn't, so if I am missing some details, please weigh in.

Ideally, `celery` should properly clean up after itself, and it does attempt to detect whether the pidfile is stale, but for some reason that detection does not always work when dockerized. I have never encountered this issue in traditional setups, so this is probably an upstream `celery` and/or docker issue. On that note, a quick workaround is to manually remove the pidfile.
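A hedged sketch of that workaround on the production stack (the compose file name production.yml and the service name celerybeat are the cookiecutter-django defaults; adjust as needed):

```sh
# Remove the old celerybeat container so the pidfile stored inside it
# is discarded, then bring the service back up.
docker-compose -f production.yml stop celerybeat
docker-compose -f production.yml rm -f celerybeat
docker-compose -f production.yml up -d celerybeat
```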
**Steps to reproduce**

Tested on Ubuntu 18.04.4 LTS, Docker Engine version 19.03.12, docker-compose version 1.17.1.