Open btbonval opened 9 years ago
This sort of feature would need to be supported in celery beat somehow. Maybe there's a singleton schedule method or task quota or something.
There is no purpose in queuing these two tasks more than once: https://github.com/FinalsClub/karmaworld/blob/321c9fd8be8eb3776ee56f628d2918d689ac4e2c/karmaworld/settings/prod.py#L92-L99
Periodic task fields: options are anything supported by apply_async().
apply_async() supports an expiry time. We could set the expiration to 1 day (24 hours), which would allow only 1 or 2 instances of any particular daily update task to remain in the queue.
expires must be given "as seconds after task publish" or as a timestamp. A timestamp is not feasible because expires is set just one time at server load.
Something like this for 24 hour expiration after the task is published:
    'update-scoreboard': {
        'task': 'fix_note_counts',
        'schedule': timedelta(days=1),
        'options': {'expires': 86400},
    },
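For reference, here is a minimal sketch of how the same expiry could be applied to every entry in the schedule rather than repeated by hand. The helper name with_expiry is hypothetical (it is not part of Celery), and only the one schedule entry shown above is reproduced; the other periodic tasks would be added the same way.

```python
from datetime import timedelta

# 24 hours expressed in seconds, matching the 'expires': 86400 value above.
ONE_DAY_SECONDS = int(timedelta(days=1).total_seconds())


def with_expiry(schedule, seconds=ONE_DAY_SECONDS):
    """Hypothetical helper: copy a beat schedule, adding an 'expires'
    option (seconds after publish) to any entry that lacks one."""
    result = {}
    for name, entry in schedule.items():
        new_entry = dict(entry)
        options = dict(entry.get('options', {}))
        # Keep an explicit per-task expiry if one was already set.
        options.setdefault('expires', seconds)
        new_entry['options'] = options
        result[name] = new_entry
    return result


CELERYBEAT_SCHEDULE = with_expiry({
    'update-scoreboard': {
        'task': 'fix_note_counts',
        'schedule': timedelta(days=1),
    },
})
```

Because expires is a relative number of seconds, it is safe to compute once at settings load, unlike a fixed timestamp.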
This ticket is an example of how much can be done while waiting for the staging system queue to complete.
still waiting...
... and since this is a second Heroku worker off the side of the main web worker, we're being charged for every minute or hour it runs. So we're wasting money recalculating these update statistics repeatedly and without any merit, since the first run and last run and every run in between will yield the same basic results for update tasks.
Applied expiration to all 3 periodic tasks, since none have any reason to build up a backlog. The code was put in a branch and pushed to beta for testing at the time of this issue comment.
Check back in a few days or a week and see how much has accumulated in the queue backlog. It should just be 3 tasks: one of each.
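To sanity-check the expected backlog, here is a rough model of one daily task while the worker is idle. This is a simplification, not Celery's actual logic: it assumes beat publishes exactly once per period and that a task is still runnable up to and including its expiry time. The function name pending_after_idle is made up for illustration.

```python
def pending_after_idle(idle_seconds, expires_seconds=None, period_seconds=86400):
    """Count queued copies of one periodic task that are still runnable
    when the worker wakes up after being idle for idle_seconds."""
    # Assumed model: one publish at the end of each elapsed period.
    publish_times = range(period_seconds, idle_seconds + 1, period_seconds)
    if expires_seconds is None:
        # Without an expiry, every published copy is still waiting.
        return len(publish_times)
    # With an expiry, only copies published recently enough survive.
    return sum(1 for t in publish_times if idle_seconds - t <= expires_seconds)


DAY = 86400
backlog_without_expiry = pending_after_idle(32 * DAY)
backlog_with_expiry = pending_after_idle(32 * DAY, expires_seconds=DAY)
backlog_mid_day = pending_after_idle(32 * DAY + DAY // 2, expires_seconds=DAY)
```

Under these assumptions, 32 idle days leave 32 copies without an expiry, but only 1 or 2 with a 24 hour expiry, depending on where in the day the worker wakes up.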
There are occasions when tasks do not run for long periods of time as a matter of course. This is typical in a dev environment, but is a constant feature of our staging system.
Certain tasks, like fix_note_counts, are set to run every 24 hours to update the cache. However, running it 32 times because it has been 32 days since the worker ran is not beneficial. Be it 32 days or just 1, running fix_note_counts one time will bring the data to completion.
Certain other tasks, like tweets about a new note, are distinct and should be run.
Is there any way to create classes of tasks that queue in certain ways? If so, this should be implemented. Any update tasks only need to get queued one time; any more is wasteful.