celery / django-celery-beat

Celery Periodic Tasks backed by the Django ORM
Other
1.7k stars 429 forks source link

Django Celery Beat admin updating Cron Schedule Periodic task not taking effect? #7

Closed kendazheng closed 7 years ago

kendazheng commented 7 years ago

like this http://stackoverflow.com/questions/40579804/django-celery-beat-admin-updating-cron-schedule-periodic-task-not-taking-effect

wen96 commented 7 years ago

Same issue here. Whenever I update or create a PeriodickTask I have to reload beat to get this changes updated on worker. There a process (I guess pulling process) every 3 minutes on beat saying Writing entries... but nothing happens if you don't restart beat. Are we missing something?

Using: celery 4.0.0 Django 1.9.9 django-celery-beat 1.0.1 Ubuntu 14.04 Redis 2.8.4

Configuration:

BROKER_URL = 'redis://localhost:6379/0'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERYBEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
kendazheng commented 7 years ago

I have test other celery version and djcelery, it work well. So, it's a bug ? I have read the source code already. The DatabaseScheduler property method 'schedule' have received the changes, but the method 'reserve' can Not get it. Because I do not know about the project,so I do not know why.

mheppner commented 7 years ago

Not sure if this helps, but with Celery 4, your setting name should be prefixed. BROKER_URL should be CELERY_BROKER_URL and CELERYBEAT_SCHEDULER should be CELERY_BEAT_SCHEDULER. I don't think you need to specify JSON for any of the serializers because it should be default now.

wen96 commented 7 years ago

@mheppner yes I know, config was just a copy paste which contains lines which comes from older Celery versions. The worker and beat process shows config on start which is OK and it reads and writes on Redis and on Django database correctly. DB changes simply don't come to reserve function as @kendazheng says. Anyway, thank you.

mheppner commented 7 years ago

@wen96 that's what I figured, but I thought I'd check anyways. I tried using the django-celery-beat scheduler after I made that comment and I also experienced the same issue. A full restart is needed. I'll look at it tonight and see if I can figure out why.

andrewgleave commented 7 years ago

Same here. Any further developments?

mheppner commented 7 years ago

It doesn't look like the scheduler itself is ever checked again and I'm not sure why. I think the beat process itself needs to access DatabaseScheduler.schedule() on each tick for it to even detect if something changed, but it's never accessed after the first time.

exploreshaifali commented 7 years ago

Exact same issue I am facing, and it is working well with celery==3.1.23 and djcelery.

t-io commented 7 years ago

I had the same issue because I forgot to start the worker with -S django like celery -A proj beat -l info -S django http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html

okfish commented 7 years ago

I had the same issue with all options set to worker properly, but had investigated that the schedule has been updated after adding new record and has not been updated after modifying existing record of intervals

okfish commented 7 years ago

However, it was just a message in logs like DatabaseScheduler: Schedule changed. But nothing was really changed in the schedule, nor new task wasn't added neither old task wasn't modified

roofcat commented 7 years ago

I had the same issue, i use command celery -A app beat -l info -S django, but tasks doesn't run. My celery config file:

""" from future import absolute_import import os from django.conf import settings from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app.settings') app = Celery('track') app.config_from_object('django.conf:settings') app.autodiscover_tasks(lambda: settings.INSTALLED_APPS) app.conf.update( CELERY_BROKER_URL='amqp://guest:guest@localhost//', CELERY_ACCEPT_CONTENT=['json'], CELERY_TASK_SERIALIZER='json', CELERY_RESULT_SERIALIZER='json', ) """ my tasks file: """ import logging from app import celery_app

from .models import Note

logger = logging.getLogger(name)

@celery_app.task def create_note(): logger.info('running tasks') Note.new_note() logger.info('tasks finished') """

The create_note method run if i execute in django shell but celery cannont run staks, why???

lucadelmonte commented 7 years ago

Hi, same issue here, when modifying a task from admin or even via shell nothings happens, when adding or removing celery beat logs "DatabaseScheduler: Schedule changed." and logs the right scheduleed task, but task are note being executed ( if added to db) or keep exececuting ( if removed from db) $ pip freeze amqp==2.1.1
anyjson==0.3.3
Babel==2.3.4
billiard==3.5.0.2
celery==4.0.0
Django==1.10.3
django-celery-beat==1.0.1
django-celery-results==1.0.1
flower==0.9.1
kombu==4.0.0
pkg-resources==0.0.0
pytz==2016.7
tornado==4.2
vine==1.1.3

lucadelmonte commented 7 years ago

@roofcat Are you running the command "celery -A app worker -l info" too, in another shell? Celery beat is the scheduler, celery worker spawns a woker which can process the task scheduled by beat. This issue is different from yours, we manage to get the tasks running, our changes to the table periodictask are not picked up by the beat scheduler unless we restart it.

roofcat commented 7 years ago

@Squalluca ohh thanks, IT WORKS!!!! i have not realized

rjbks commented 7 years ago

Any updates on this, I am also experiencing this issue. Changes after beat has started log a : DatabaseScheduler: Schedule changed. message but the changes do not actually take effect. Newly scheduled tasks dont run and disabled tasks continue to run.

roofcat commented 7 years ago

I have a question with celerybeat daemon doesn't run

My /etc/default/celeryd file: `ENV_PYTHON="/home/crojas/.virtualenvs/az-track/bin/python"

CELERY_BIN="/home/crojas/.virtualenvs/az-track/bin/celery"

CELERYD_CHDIR="/home/crojas/az-track"

CELERY_APP_ARG="app"

export DJANGO_SETTINGS_MODULE="app.settings"

CELERYD_NODES=3

CELERYD_OPTS="--concurrency=3"

CELERYD_LOG_LEVEL="INFO"

CELERY_CONFIG_MODULE="app.celery"

CELERYD_LOG_FILE="/var/log/celery/%n.log"

CELERYD_PID_FILE="/var/run/celery/%n.pid"

CELERYD_USER="celery"

CELERYD_GROUP="celery"

CELERY_CREATE_DIRS=1 `

And my /etc/default/celerybeat file: `CELERY_BIN="/home/crojas/.virtualenvs/az-track/bin/celery"

CELERY_APP_ARG="app"

CELERYBEAT_CONFIG_MODULE="app.celery"

CELERYBEAT_OPTS="-S django"

CELERYBEAT_LOG_LEVEL="INFO"

CELERYBEAT_USER="celery"

CELERYBEAT_GROUP="celery"

CELERY_CREATE_DIRS=1 ` My celery worker daemon it works, but celery beat cannot run, why? thanks all

mheppner commented 7 years ago

@roofcat if you're using celery 4 and django-celery-beat, you need to follow the upgrade guide and change your settings to the new format.

roofcat commented 7 years ago

@mheppner i can run celery beat with manual command successful, but cannot run with daemon :(

the error is: WARNING/MainProcess] DBAccessError WARNING/MainProcess] (13, 'Permission denied')

in my celery.py config: """ from future import absolute_import import os from django.conf import settings from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'app.settings')

app = Celery('test')

app.config_from_object('django.conf:settings') app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

app.conf.update( broker_url='amqp://guest:guest@localhost//', beat_scheduler='django_celery_beat.schedulers.DatabaseScheduler', )

jottunn commented 7 years ago

having the same issue, celery 4.0, django 1.10.2, django_celery_beat 1.0.1

roofcat commented 7 years ago

Is normal celery beat cannot run if celery worker is running as daemon??

my ubuntu journalctl log:

Nov 25 15:06:30 az-track-test sudo[8232]: crojas : TTY=pts/1 ; PWD=/home/crojas/az-track ; USER=root ; COMMAND=/bin/su celery -c /home/crojas/.virtualenvs/az-track/bin/celery beat --app=app.celery -f /var/log/celery/beat.log -l INFO --workdir=/home/crojas/az-track --detach --pidfile=/var/run/celery/beat.pid Nov 25 15:06:30 az-track-test sudo[8232]: pam_unix(sudo:session): session opened for user root by crojas(uid=0) Nov 25 15:06:30 az-track-test su[8233]: Successful su for celery by root Nov 25 15:06:30 az-track-test su[8233]: + /dev/pts/1 root:celery Nov 25 15:06:30 az-track-test su[8233]: pam_unix(su:session): session opened for user celery by crojas(uid=0) Nov 25 15:06:30 az-track-test su[8233]: pam_systemd(su:session): Cannot create session: Already running in a session Nov 25 15:06:30 az-track-test su[8233]: pam_unix(su:session): session closed for user celery Nov 25 15:06:30 az-track-test sudo[8232]: pam_unix(sudo:session): session closed for user root

thanks!

frnhr commented 7 years ago

Same issue as OP, after save in admin, the beat process outputs DatabaseScheduler: Schedule changed. but keeps scheduling the task as if nothing changed. No difference if using crontab or interval config.

Let's keep the discussion focused on the OP's issue, otherwise it might seem solved when it's not.

lucadelmonte commented 7 years ago

hey @ask , sorry to bother, can we have an update on this? are we doing something wrong? is this expected or is this a bug?

vladitot commented 7 years ago

Hi there, I have same problem. My celery beat works correctly, but on updating period of running, celery beat keep previous period, so I need to restart beat to load new settings to beat. I believe anyone has any ideas

jegesh commented 7 years ago

+1

roofcat commented 7 years ago

+1

wzywno commented 7 years ago

Does anyone has workaround or solution for that?

AlwxSin commented 7 years ago

Does anyone know which versions of celery and celery-beat works properly together? Or maybe which commits?

roofcat commented 7 years ago

django-celery-beat works with celery 4, django-celery (djcelery) works with 3.x

mef13 commented 7 years ago

+1

jegesh commented 7 years ago

I switched my project over to django-rq for now. It took me about three hours to learn the technology and make all the necessary changes. It's really simple to get started with.

TomaszKolek commented 7 years ago

Guys, do you have any workaround?

mheppner commented 7 years ago

This is a bug. Please stop adding "+1", suggesting alternative frameworks, asking for working versions, and seeing if people have a workaround. If you are having issues not related to this bug, try posting your question in the user group first.

TomaszKolek commented 7 years ago

Because I needed to do it ASAP and let's say quality of my solution wasn't important - I found a problem in celery not in django-celery-beat. If it will help:

celery/beat.pyclass Service method tick:

H = self._heap
if H is None:
    H = self._heap = [event_t(_when(e, e.is_due()[1]) or 0, 5, e)
                                 for e in values(self.schedule)]
   heapify(H)

It looks H is None only in first execution of this function and never again. So I added in __init__.py one variable old_schedueles (None by default) Next I added to above if H is None additional part if H is None or len(self.old_schedules) != len(values(self.schedule)) - it actualy means: go to code under if if H is None or if new schedules appeared. At the end I added inside the if self.old_schedulers = values(self.schedules)

Because I did some more operations for debugging - is possible that I did something else. I will try to reproduce my investigation and let you know.

Hope it will help.

frnhr commented 7 years ago

Probably related: https://github.com/celery/celery/issues/3493 "How to dynamically add or remove tasks to celerybeat?"

wen96 commented 7 years ago

@frnhr it is not. This is about a bug on "celerybeat-schedule shelve database". The issue you mention is just a question which I think is already answered and should be close.

wzywno commented 7 years ago

In my dirty workaround I override tick method in class DatabaseScheduler(Scheduler): (schedulers.py file) And in tick method I commented out

        if H is None:

from

        if H is None:
            H = self._heap = [event_t(_when(e, e.is_due()[1]) or 0, 5, e) for e in values(self.schedule)]
            heapify(H)

so now heap is built in every single run of tick method.

TomaszKolek commented 7 years ago

so it looks my solution is a little bit better ;-) But its great to hear that its enough.

wzywno commented 7 years ago

I found another way to fix this issue. When DatabaseScheduler decides that sync is required the _heap in Scheduler is set to None. so _heap is rebuilt on next run of tick() method.

So now DatabaseScheduler looks like that:

    @property
    def schedule(self):
        update = False
        if not self._initial_read:
            debug('DatabaseScheduler: initial read')
            update = True
            self._initial_read = True
        elif self.schedule_changed():
            info('DatabaseScheduler: Schedule changed.')
            update = True

        if update:
            self.sync()
            self._schedule = self.all_as_schedule()
            if logger.isEnabledFor(logging.DEBUG):
                debug('Current schedule:\n%s', '\n'.join(
                    repr(entry) for entry in values(self._schedule)),
                )
            # _heap will be rebuilt on the next run
            super(Scheduler, self).__setattr__('_heap', None)
        return self._schedule

super(Scheduler, self).__setattr__('_heap', None) is added in comparison to original version.

I created PR with fix. It is a bit more elegant than workaround :)

razerraz commented 7 years ago

@wzywno Both of your fixes doesn't work for me, but thanks for the effort @TomaszKolek Your fix work for me, but you've written a bunch of typos. To be clear :: in celery/beat.py, class Service, constructor (init) -> add self.old_schedulers = None and method tick becomes

def tick(self, event_t=event_t, min=min,
             heappop=heapq.heappop, heappush=heapq.heappush,
             heapify=heapq.heapify, mktime=time.mktime):
        """Run a tick - one iteration of the scheduler.

        Executes one due task per call.

        Returns:
            float: preferred delay in seconds for next call.
        """
        def _when(entry, next_time_to_run):
            return (mktime(entry.schedule.now().timetuple()) +
                    (adjust(next_time_to_run) or 0))

        adjust = self.adjust
        max_interval = self.max_interval
        H = self._heap
        if H is None or len(self.old_schedulers) != len(values(self.schedule)):
            H = self._heap = [event_t(_when(e, e.is_due()[1]) or 0, 5, e)
                          for e in values(self.schedule)]
            heapify(H)
            self.old_schedulers = values(self.schedule)
        if not H:
            return max_interval

        event = H[0]
        entry = event[2]
        is_due, next_time_to_run = self.is_due(entry)
        if is_due:
            verify = heappop(H)
            if verify is event:
                next_entry = self.reserve(entry)
                self.apply_entry(entry, producer=self.producer)
                heappush(H, event_t(_when(next_entry, next_time_to_run),
                                    event[1], next_entry))
                return 0
            else:
                heappush(H, verify)
                return min(verify[0], max_interval)
        return min(adjust(next_time_to_run) or max_interval, max_interval)
Coread commented 7 years ago

@razerraz The comparison of between old and new schedules is insufficient. If the schedule intervals are changed then the required heapify will still not take place. The below code shows a comparison I used for IntervalSchedules. Crontab schedules would require a more involved check of equality.

def schedules_equal(self, a, b):
    if a.keys() != b.keys():
        return False
    for name, model in a.items():
        b_model = b.get(name)
        if not b_model:
            return False
        if not hasattr(model.schedule, "human_seconds"):
            # Not checking crontabs
            continue
        if model.schedule.human_seconds != b_model.schedule.human_seconds:
            return False
    return True

so instead of the length check

if H is None or not self.schedules_equal(self.old_schedulers,self.schedule):

and then

self.old_schedulers = self.schedule
razerraz commented 7 years ago

seems to make sence, thanks

frnhr commented 7 years ago

This problem is still present on 4.0.2 and on master, tested just now.

bgroff commented 7 years ago

I can also confirm that I am still seeing this issue with 4.0.2.

codefo commented 7 years ago

Me too, both on master and 4.0.2

oortega commented 7 years ago

I have the same issue, fix temporaly with crontab

crontab -e

60 /usr/bin/supervisorctl restart { instance name Beat}

dmitryTsatsarin commented 7 years ago

+1. Celery 4.0.2. Still exists (need restart).

wzywno commented 7 years ago

I forked celery and added @Coread changes. It works with those changes. in requirements.txt

git+git://github.com/HealthTechDevelopers/celery@master#celery
dmitryTsatsarin commented 7 years ago

@wzywno Could you please make pull request to original branch? Thanks

hmphu commented 7 years ago

Hi @wzywno, I have tried your repo, that work extremely well. Could you please make a pull request to the original branch? Thanks