Koed00 / django-q

A multiprocessing distributed task queue for Django
https://django-q.readthedocs.org
MIT License

Run task every 20 seconds? #179

Open frnhr opened 8 years ago

frnhr commented 8 years ago

From reading the docs and a quick look at the code, it seems like 1 minute is the shortest possible interval between tasks. Are there any external obstacles to using seconds instead of minutes as the base time measure for scheduled tasks?

Koed00 commented 8 years ago

The current design has a heartbeat of 30 seconds, which means the schedule table can't hold schedules with intervals below that. Most of this is explained in the architecture docs. Because of the way the internal loop is set up, a resolution under a dozen seconds or so quickly becomes unreliable.

I always imagined that tasks needing accuracy measured in seconds would use a delayed-task strategy, where a delay of a few seconds is added either through the broker or inside the task itself.

The problem with all this is that a task is currently always added to the back of the queue. So even with a 1-second resolution on the schedule, the task still has to wait for its turn to execute, which can of course vary wildly depending on the broker type, worker capacity and current workload.
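As a rough sketch, the "delay inside the task" idea could look something like this (poll_sensor and the offsets are just placeholders, and depending on the django-q version the enqueue function is async or async_task):

```python
# Rough sketch of the "delay inside the task" idea; poll_sensor is a
# placeholder, and on older django-q versions the enqueue function is
# `async` rather than `async_task`.
import time

from django_q.tasks import async_task


def poll_sensor():
    ...  # the actual sub-minute work


def delayed_poll(delay_seconds):
    # the delay lives inside the task, so the schedule itself can stay
    # at minute resolution
    time.sleep(delay_seconds)
    poll_sensor()


def fan_out():
    # scheduled once per minute; spreads three runs across the next minute
    for offset in (0, 20, 40):
        async_task(delayed_poll, offset)
```

The accuracy of the offsets still depends on how quickly the workers pick the tasks up, which is exactly the queueing caveat above.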

frnhr commented 8 years ago

So basically, if I understood correctly, the main concern is that brokers might be too slow. The architecture looks really really nice, but I can't say I fully understand it (yet). For instance, I don't see why adding tasks to the back of the queue would be an issue.

In any case, I have made a few changes to get django-q to use seconds instead of minutes, and it seems to be working well. I'm only trying it with the ORM broker. Here is the PR: https://github.com/Koed00/django-q/pull/181 It is not really a pull request; rather, I'm wondering what might go wrong with these changes. They seem to work quite well for my case (and I intend to keep using the modified django-q), but I'm interested in why this isn't the default behaviour: why limit yourself to 1-minute resolution? Smells of crontab for no reason 😦


BTW, I'll be running it on a Raspberry Pi with an SQLite db. The tasks are going to be quick and need to run about every 20 seconds, so 1 worker should be enough. Avoiding extra processes (e.g. running Redis) seems like a good idea, which is why I'm using the ORM broker.
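For reference, the cluster settings I have in mind look roughly like this (the name and numbers are my own choices, not anything django-q prescribes):

```python
# settings.py -- rough sketch of a single-worker, ORM-broker setup
Q_CLUSTER = {
    'name': 'rpi',
    'workers': 1,        # one worker is enough for quick tasks
    'timeout': 15,       # tasks are short, so fail fast
    'retry': 60,         # should be larger than timeout
    'orm': 'default',    # use the Django ORM (here: SQLite) as broker
}
```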

Urth commented 8 years ago

Regarding the heartbeat: it should be noted that with the introduction of GUARD_CYCLE and the fixed counter in https://github.com/Koed00/django-q/blob/master/django_q/cluster.py#L231, the 30-second heartbeat is not guaranteed (before this patch the heartbeat seemed to be 15 seconds, i.e. 0.5 * 30). It seems scheduled tasks may incur a significant delay if GUARD_CYCLE is set to more than 2 seconds.

tarkatronic commented 7 years ago

Has there been any more work/thought/progress on this? I'm currently evaluating task queues, and django-q is a front runner, but I need a task that polls the database on a very tight schedule, probably around every 10 seconds.

Uninen commented 4 years ago

What's the status of sub-minute tasks with django-q? I'm re-evaluating our tech stack and would love to migrate away from Celery-based tasks, but we need some to run every 15-20 seconds and it seems that this cannot be done with django-q at the moment. Is there a workaround or is this just out of scope here?

I'd also be interested in hearing whether someone has experience with other Django-based solutions that would do the job. I imagine sub-minute (not sub-second or anything crazy like that) repeating tasks wouldn't be that exotic these days.

frnhr commented 4 years ago

Personally I’m just resigned to using Celery, even on projects where it is overkill of criminal proportions. There are some hacks to make the setup simpler, e.g. https://www.distributedpython.com/2018/07/03/simple-celery-setup/

This method is only viable if you have a single worker and your tasks can tolerate the occasional file-access conflict, so YMMV.

I find it useful for running a single task every few seconds on a Raspberry Pi, with the broker directory mounted to a tmpfs path (to reduce wear on the SD card).

For anything larger, Docker is your friend. Or mine, at least :)
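For the record, the gist of that setup is something like the following (I haven't cross-checked the exact recipe in that link; the paths and the task name are placeholders). The worker runs with the embedded beat scheduler, e.g. `celery -A rpi worker -B`, so there's only one extra process:

```python
# celery.py -- minimal sketch of a single-process, filesystem-broker setup;
# paths and task names are made up.
from celery import Celery

app = Celery('rpi')
app.conf.broker_url = 'filesystem://'
app.conf.broker_transport_options = {
    # point both folders at the same tmpfs-backed directory to spare the SD card
    'data_folder_in': '/mnt/tmpfs/celery',
    'data_folder_out': '/mnt/tmpfs/celery',
}
app.conf.beat_schedule = {
    'poll-every-20s': {
        'task': 'myapp.tasks.poll_sensor',
        'schedule': 20.0,  # seconds
    },
}
```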

Stephane-Ag commented 1 year ago

In case anyone is still interested in this later on: why not just have the schedule run every 1/2 min and then, in your code, sleep for 20 sec (or whatever you need) and call the function/script again, repeating as many times as needed before the next scheduled call? Seems simple enough.
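A minimal version of that idea (do_the_thing and the numbers are just placeholders, and the registration call assumes the standard django_q.tasks.schedule API):

```python
# Sketch of the "sleep and repeat within one scheduled slot" suggestion;
# do_the_thing, the interval and the repeat count are placeholders.
import time

from django_q.tasks import schedule
from django_q.models import Schedule


def do_the_thing():
    ...  # the actual job to run every ~20 seconds


def repeat_within_minute(interval=20, repeats=3):
    # run the job a few times inside a single one-minute schedule slot
    for i in range(repeats):
        do_the_thing()
        if i < repeats - 1:
            time.sleep(interval)


# register once, e.g. from a data migration or the Django shell
schedule('myapp.tasks.repeat_within_minute',
         schedule_type=Schedule.MINUTES, minutes=1)
```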