rq / django-rq

A simple app that provides django integration for RQ (Redis Queue)
MIT License

Duplicates of jobs being added to queue when using unique ID #517

Open spetoolio opened 2 years ago

spetoolio commented 2 years ago

I have a recurring job that runs a "check" on a certain object. I enqueue a job with a specific job id, built from a job key and the object's UUID, such as check_object_{object_uuid}.

Then, when the recurring task runs, I make sure the job isn't already in the queue. I only want to enqueue a job if either of these criteria is true:

  1. No job exists with that ID
  2. A job exists with that ID, but its previous execution has finished (either failed or completed)

Here's the code, with different variable/function names:

    from rq.exceptions import NoSuchJobError

    for obj in objects:
        job_id = obj.get_check_job_id()  # returns the unique id for job & object as described above
        try:
            job = get_job_by_id(job_id)  # returns the single job object with that ID; raises NoSuchJobError if none exists
            if job.is_finished is False and job.get_status() != "failed":
                # Failed jobs stay in the failed queue for a while, and
                # is_finished is False for failed jobs, so we check the two
                # conditions separately so that both failed and finished
                # jobs get re-queued.
                continue
        except NoSuchJobError:
            # No existing job means we should go ahead and queue it.
            pass
        scraper_queue.enqueue(
            f=run_job_for_task,
            obj=obj,
            job_id=job_id
        )

However, for a reason I cannot determine, duplicates of the task end up being queued, and then duplicates of the duplicates, growing exponentially. I'll see jobs with the same ID in the queue with both finished and queued statuses.

It seems like a bug that multiple jobs with the same ID can be queued at all. Beyond that, though, when I run this:

        scraper_queue.enqueue(
            f=run_job_for_task,
            obj=obj,
            job_id=job_id
        )

it seems that all jobs with that ID, whether in the queue, the finished queue, or the failed queue, get re-queued. Any advice on how I can manage this?
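
For reference, the variation I've been experimenting with (untested sketch; should_enqueue is just a hypothetical helper, and it relies on Queue.fetch_job returning None when the job hash no longer exists) checks the status explicitly and deletes stale finished/failed jobs before re-enqueueing under the same ID:

    def should_enqueue(queue, job_id):
        # Hypothetical helper: fetch_job returns None if no Redis hash
        # exists for this job ID on the queue's connection.
        job = queue.fetch_job(job_id)
        if job is None:
            return True
        status = job.get_status()
        if status in ("queued", "started", "deferred", "scheduled"):
            # A live job with this ID is still pending or running.
            return False
        # Finished/failed job hashes linger until their TTL expires;
        # delete the stale key so re-enqueueing the same ID starts clean.
        job.delete()
        return True

    for obj in objects:
        job_id = obj.get_check_job_id()
        if should_enqueue(scraper_queue, job_id):
            scraper_queue.enqueue(f=run_job_for_task, obj=obj, job_id=job_id)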

aehlke commented 1 year ago

Did you solve this?

spetoolio commented 1 year ago

Unfortunately not. Are you experiencing it as well?

Dropping the TTL on the jobs to a very small value helped a bit, but it didn't solve it. We just clear the duplicates out of the queue every so often... it sucks
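
For anyone else hitting this, the TTL tweak was just passing shorter TTLs at enqueue time (rough sketch; the values here are arbitrary examples of RQ's result_ttl and failure_ttl kwargs):

    scraper_queue.enqueue(
        f=run_job_for_task,
        obj=obj,
        job_id=job_id,
        result_ttl=60,     # keep finished-job data for 60s (default is 500s)
        failure_ttl=300,   # keep failed-job data for 5 minutes (default is ~1 year)
    )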

aehlke commented 1 year ago

No, just evaluating tech solutions. Thank you.

To be honest, I've decided to push as much queue management as possible out to the clients and to avoid most backend async tasks. It's much easier to ensure end-to-end completion of a task without data loss, with recoverability, back pressure, accurate status, etc.