celery / django-celery-beat

Celery Periodic Tasks backed by the Django ORM

Multiple instances #337

Open eikebartels opened 4 years ago

eikebartels commented 4 years ago

Behaviour with multiple instances

I could not find anything about this in the documentation, hence this ticket.

I have multiple instances running, and right now CELERY_BEAT_SCHEDULE tasks get scheduled on every one of them. While searching for a solution I came across this repo.

Does django-celery-beat support multi-instance environments? Meaning n instances, but only one of them schedules the beat tasks?

I hope someone can answer this question.

Cheers and stay healthy

Jufik commented 4 years ago

Just tried it locally: running multiple "celery beat" processes will send each task multiple times.

charleswhchan commented 4 years ago

You can add a lock to prevent multiple instances of beat from running in parallel.

Also see https://blog.heroku.com/redbeat-celery-beat-scheduler:

Finally, we added a simple lock that prevents multiple Beat daemons from running concurrently. This can sometimes be a problem for Heroku customers when they scale up from a single worker or during development.
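The lock described above can be sketched with Redis's atomic SET NX EX operation, which is roughly what RedBeat does. This is a minimal illustration, not RedBeat's actual implementation; the key name and TTL are arbitrary, and `client` is assumed to be a redis-py-style client with `set(name, value, nx=..., ex=...)` and `delete(name)`:

```python
import socket


class BeatLock:
    """Single-holder lock with a TTL, in the spirit of RedBeat's beat lock.

    `client` is assumed to expose redis-py's `set(name, value, nx=..., ex=...)`
    and `delete(name)`; the key name and TTL are illustrative.
    """

    def __init__(self, client, key="beat-lock", ttl=60):
        self.client = client
        self.key = key
        self.ttl = ttl
        # Identify this instance so the holder is visible when debugging.
        self.token = socket.gethostname()

    def acquire(self):
        # SET NX EX is atomic: only one instance can create the key, and
        # the TTL guarantees the lock expires if the holder dies silently.
        return bool(self.client.set(self.key, self.token, nx=True, ex=self.ttl))

    def release(self):
        self.client.delete(self.key)
```

An instance that fails `acquire()` would sleep and retry instead of starting beat; the holder would periodically re-set the key to extend the TTL while it keeps running.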

maciejstromich commented 3 years ago

I just stumbled upon a similar issue. We're running our workers with celery worker --beat .., which means that every ECS instance has a beat service up and running. It would be awesome to have a locking mechanism relying on the database. I guess the simplest solution would be to build a lock relying on memcached, as suggested in https://docs.celeryproject.org/en/latest/tutorials/task-cookbook.html#cookbook-task-serial
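The cookbook pattern linked above can be sketched roughly as follows. It is a sketch under assumptions: `cache` stands in for Django's cache backend (`django.core.cache.cache` in a real project), and the lock id and timeout are illustrative. The key idea is that `cache.add()` only stores a key if it is absent, so exactly one caller wins:

```python
from contextlib import contextmanager


@contextmanager
def cache_lock(cache, lock_id, timeout=600):
    """Cooperative lock via cache.add(), per the Celery task cookbook.

    cache.add() stores the key only if it does not exist yet, so exactly
    one caller acquires the lock; `timeout` lets the cache expire a stale
    lock if the holder crashes before releasing it.
    """
    acquired = cache.add(lock_id, "locked", timeout)
    try:
        yield acquired
    finally:
        # Only the holder releases; losers must not delete someone
        # else's lock.
        if acquired:
            cache.delete(lock_id)


# Inside a task body, only the lock holder does the work:
#
# with cache_lock(cache, "my-periodic-task-lock") as got_it:
#     if got_it:
#         do_work()
```

Note this serialises a single task rather than electing a single beat process, which is what the cookbook recipe is actually for.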

crutcha commented 3 years ago

Having a locking mechanism rely on the database can be tricky for a few reasons, most importantly that you're probably relying on timestamps instead of a TTL to handle lock expiration. I have a similar desire for this functionality, though. I'm currently using redbeat, which handles multiple beat workers, but I'd rather have task definitions live in the database, and currently django-celery-results can't handle multiple beat instances. For cases where you're using redis as the broker, it would be nice to also be able to leverage redis for a locking mechanism (like redbeat) while still having tasks defined in the database, giving you the best of both worlds.
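The timestamp-vs-TTL caveat can be made concrete. With a database row there is nothing that expires the lock for you: each contender must compare the stored timestamp against its own clock, so clock skew between instances can cause premature takeovers or a lock that never frees up. A minimal sketch of such a staleness check (the function and field names are hypothetical, not part of django-celery-beat):

```python
from datetime import datetime, timedelta, timezone


def lock_is_stale(acquired_at, ttl_seconds, now=None):
    """Decide whether a DB-held lock row may be stolen.

    Unlike a Redis key with a TTL, the row never expires on its own:
    correctness depends on every instance's clock agreeing with the
    one that wrote `acquired_at`.
    """
    now = now or datetime.now(timezone.utc)
    return now - acquired_at > timedelta(seconds=ttl_seconds)
```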

I've thrown together a quick example of what this may look like (https://github.com/celery/django-celery-beat/compare/master...crutcha:beatlock) but wanted to see if this is a change y'all would be receptive to before continuing any further. Let me know your thoughts.

elieserme commented 6 days ago

Hello! Any development on this?