striveforbest opened this issue 5 years ago
This will be a mouthful, but if you want to see the solution, skip the walkthrough below and jump straight to Solution.
RepeatableJobAdmin has, in its fieldsets, an "RQ Settings" section with the field queue. The choices are set in the QueueMixin, from the QUEUES constant.
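For context, django_rq builds that queue list from the RQ_QUEUES setting; a minimal sketch (the host/port values here are just placeholders):

# settings.py
RQ_QUEUES = {
    'default': {'HOST': 'localhost', 'PORT': 6379, 'DB': 0},
    'scheduled': {'HOST': 'localhost', 'PORT': 6379, 'DB': 0},
}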
From here, the RepeatableJob model should have its queue set when we click Save. The save method is also overridden in the BaseJob class (IMHO, this might not be the best idea; I think Django has save_model, but I might be wrong here). It basically does a normal save(), but first it unschedules the task and then schedules it back by calling self.schedule().
In RepeatableJob (models.py, line 147) we can see that schedule() actually builds a set of kwargs and then calls self.scheduler().schedule(**kwargs). This scheduler is defined in the BaseJob class, and it's actually django_rq.get_scheduler(self.queue).
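Paraphrased, that save/schedule plumbing looks roughly like this (a sketch from my reading of the code, not the verbatim django-rq-scheduler source):

import django_rq
from django.db import models

class BaseJob(models.Model):
    queue = models.CharField(max_length=16)  # choices come from QueueMixin

    def scheduler(self):
        return django_rq.get_scheduler(self.queue)

    def save(self, *args, **kwargs):
        self.unschedule()   # remove any stale entry from the scheduler
        self.schedule()     # re-register the job via self.scheduler()
        super(BaseJob, self).save(*args, **kwargs)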
get_scheduler, if no RQ setting with a custom SCHEDULER_CLASS is present, defaults to DjangoScheduler, a class defined just above get_scheduler in django_rq/queues.py. DjangoScheduler overrides the _create_job method from Scheduler to add some checks, but it still calls the base _create_job in the end.
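Put differently, DjangoScheduler is what you get unless your settings define something along these lines (the class path here is hypothetical):

# settings.py -- hypothetical; without this, DjangoScheduler is used
RQ = {
    'SCHEDULER_CLASS': 'myapp.scheduling.CustomScheduler',
}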
Now look at the _create_job method in rq_scheduler/scheduler.py. At the end, we see that it sets job.origin = queue_name or self.queue_name. This records the custom queue you've set (thanks to the django_rq.get_scheduler(self.queue) call from django_rq_scheduler's scheduler()).
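Paraphrased, the relevant tail of _create_job looks roughly like this (simplified signature, not the verbatim source):

def _create_job(self, func, args=None, kwargs=None, commit=True,
                queue_name=None):
    job = self.job_class.create(
        func, args=args, kwargs=kwargs, connection=self.connection)
    job.origin = queue_name or self.queue_name  # the custom queue lands here
    if commit:
        job.save()
    return job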
Since rq_scheduler calls _create_job inside schedule with commit=False, the job won't be saved YET. A few more checks are made, and then we reach job.save() in scheduler.py. This uses the Redis HMSET command to store a hash under that particular job id, which serves as the Redis key. The hash contains our origin (because to_dict in rq/job.py includes it). On the next line, the job is added to the scheduled_jobs_key set. Cool... up to this point, nothing seems wrong.
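If you want to see this for yourself, the stored hash can be inspected with redis-py (assuming rq's standard rq:job:<id> key namespace; substitute a real job id):

import redis

r = redis.Redis()
# substitute the scheduled job's real id for <job-id>
print(r.hget('rq:job:<job-id>', 'origin'))  # e.g. b'scheduled'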
Enter python manage.py rqscheduler. This is the broker that takes the jobs from Redis and puts them on the corresponding queue. It is, of course, a different process, so when it is initialized, its queue is set to default. To see why, look at __init__() in rq_scheduler/scheduler.py: no queue_name is specified, so the default is used. We also see that enqueue_job, the method that consumes the jobs, will at one point do queue = self.get_queue_for_job(job), which returns immediately because self._queue is set to the default queue. So all the jobs go to the default queue.
Solution: run python manage.py rqscheduler --queue=<name> (of course, this is not in the documentation of django-rq).
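A quick way to sanity-check which queue a scheduler is bound to, and what origin its pending jobs carry (get_jobs is part of rq-scheduler's public API):

import django_rq

scheduler = django_rq.get_scheduler('scheduled')
print(scheduler.queue_name)        # 'scheduled'
for job in scheduler.get_jobs():
    print(job.id, job.origin)      # origin should match the intended queue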
Funky solution (I'm almost joking here): fork rq_scheduler and change the get_queue_for_job(self, job) function in scheduler.py... just a little.
From this:
def get_queue_for_job(self, job):
    """
    Returns a queue to put job into.
    """
    if self._queue is not None:
        return self._queue
    key = '{0}{1}'.format(self.queue_class.redis_queue_namespace_prefix,
                          job.origin)
    return self.queue_class.from_queue_key(
        key, connection=self.connection, job_class=self.job_class)
To this:
def get_queue_for_job(self, job):
    """
    Returns a queue to put job into.
    """
    # Honour the queue the job was scheduled for first. Note that we
    # cannot return job.origin directly: it is a plain queue-name
    # string, while callers expect a Queue instance, so we rebuild
    # the queue from its Redis key.
    if job.origin:
        key = '{0}{1}'.format(self.queue_class.redis_queue_namespace_prefix,
                              job.origin)
        return self.queue_class.from_queue_key(
            key, connection=self.connection, job_class=self.job_class)
    return self._queue
@marianstefi20 thanks for digging in.
I am running the scheduler via systemd. I will try updating my systemd service to:
[Unit]
Description=Django-RQ Scheduler Service
After=network.target
[Service]
Environment=DJANGO_SETTINGS_MODULE=proj.settings
Environment=DJANGO_CONFIGURATION=Production
WorkingDirectory=/srv/www/proj/
ExecStart=/srv/www/fuweb/.venv/bin/python /srv/www/proj/manage.py rqscheduler --queue=scheduled
[Install]
WantedBy=multi-user.target
However, I think this should be considered a bug.
On closer inspection, the master branch of rq-scheduler already solves this issue by removing:
if self._queue is not None:
    return self._queue
from get_queue_for_job.
Try updating rq-scheduler to the latest version. If you look at https://github.com/rq/rq-scheduler/commits/master you will see that something was added yesterday ("Add queue_name to enqueue_in and enqueue_at"), but more importantly, they made a new release, 0.9.1. This might do it.
I've updated rq-scheduler and the problem with the queues has been fixed. I can confirm that the admin doesn't report the finished jobs correctly (I'm manually deleting the jobs, but they still show up).
@marianstefi20 thanks for the update. I will upgrade and re-test. Bummer it still doesn't report the finished jobs correctly.
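In the meantime, a way to double-check finished jobs outside the admin (a sketch using rq's FinishedJobRegistry; note that rq expires finished jobs after their result_ttl, which may be why they vanish):

import django_rq
from rq.registry import FinishedJobRegistry

queue = django_rq.get_queue('scheduled')
registry = FinishedJobRegistry(queue.name, connection=queue.connection)
print(registry.get_job_ids())  # finished job ids still within result_ttl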
I have a RepeatableJob created via the admin to run every 15 minutes on a scheduled queue. See screenshot 1.
However, looking at the Queues page in the admin, it's apparent that the job is running on the default queue instead. See screenshot 2.
Separately, the admin doesn't report successfully finished jobs (besides one repeated job). I can confirm many jobs are succeeding but not showing up in the admin. See screenshot 3.