rails / solid_queue

Database-backed Active Job backend
MIT License

Concurrency Blocked Jobs do not respect queue priority #237

Open pasl opened 1 month ago

pasl commented 1 month ago

Hi

I have jobs with a concurrency limit and a 4-queue setup.

For that job I must ensure only 4 jobs can run simultaneously.

I want to schedule jobs in different queues (background and best_effort).

I found that once the jobs are moved to Blocked Jobs because of their concurrency limit, the queues are no longer prioritized.

Here is my solid queue config file:

default: &default
  dispatchers:
    - polling_interval: 1
      batch_size: 500
      concurrency_maintenance_interval: 30 # wait 30 seconds for concurrency to be maintained (default 600)
  workers:
    - queues: [ real_time, "*", background, best_effort ]
      threads: 5
      processes: 1
      polling_interval: 0.1

Here is my Job class:

class GetPredictionJob < ApplicationJob
  queue_as :background
  limits_concurrency to: 4, key: 'pyseer_app', group: 'pyseer_app', duration: 30.seconds

  # ...
end

To test I do the following:

# Schedule many jobs to the lowest-priority queue (best_effort)
1000.times do
  GetPredictionJob.set(queue: :best_effort).perform_later(...)
end

# Schedule 1 job to a higher-priority queue (background)
GetPredictionJob.perform_later(...)

I thought the job in the background queue would run as soon as a spot opened up in the worker, but all 1000 best_effort jobs were executed first.

Is this by design, or maybe a bad configuration? Can reducing the batch_size help with this specific problem?

Thanks

rosa commented 1 month ago

Hey @pasl, thanks for opening this issue and trying out Solid Queue! I don't think the concurrency controls play a role here. What I think happens is that your worker is set to run all queues, not just background and best_effort, because "*" is included. When "*" is included, your configuration behaves the same way as:

workers:
    - queues: "*"

So the order followed is first numeric priority (which in your case is irrelevant, because you're relying on queue-based priority) and then job_id, so the jobs you enqueued first get executed first.
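That ordering can be illustrated with a plain-Ruby simulation (assumed semantics for illustration only, not Solid Queue's actual code; the job ids are made up). With "*", ready jobs are claimed by (priority, job_id) alone, so the queue name never enters the sort:

```ruby
# Each simulated ready execution: [queue, priority, job_id].
# In Solid Queue, lower priority values run first, and priority defaults to 0.
ready = [
  ["best_effort", 0, 1], # enqueued first
  ["best_effort", 0, 2],
  ["background",  0, 3]  # enqueued last, from the "higher" queue
]

# With queues: "*", the claim order ignores the queue name entirely:
claimed = ready.sort_by { |_queue, priority, job_id| [priority, job_id] }

claimed.each { |queue, _priority, job_id| puts "job #{job_id} from #{queue}" }
# The best_effort jobs come out first simply because they have lower job_ids.
```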

For the order you want, I think you could try with this configuration instead:

workers: 
  - queues: [ real_time, background, best_effort ]

pasl commented 1 month ago

Hi @rosa,

Thank you for your help.

I was under the impression that putting "*" inside a worker's queue configuration means that worker will pick up jobs from any queue.

My goal with this configuration was to ensure all jobs sent to any queue other than real_time, background, or best_effort, like default for example, would be picked up by the worker and executed with middle priority.

Actually, this configuration ~works as expected~ when I am not using the limits_concurrency feature. All jobs are executed in the order of the queue position from real_time to * (anything else) to background, and then best_effort.

[edit] Not true... You were right. After further testing with jobs without concurrency controls, queues are not prioritized when "*" is present. Without the "*", queues now work as expected for normal jobs. [/edit]

However, when I add the limits_concurrency feature, the queue position is not being considered, and jobs are executed in their job ID order.

New test (with priority):

# Schedule many jobs with priority 10 inside the background queue
1000.times do
  GetPredictionJob.set(queue: :background, priority: 10).perform_later(...)
end

# Schedule 1 job with priority 1 inside the background queue
GetPredictionJob.set(queue: :background, priority: 1).perform_later(...)

This time, the last-enqueued job (priority 1) ran before all the others.

So, I think that once a job has been moved to the blocked jobs because of limits_concurrency, the job's priority is kept, but not the queue order.

rosa commented 1 month ago

Thanks for checking this, @pasl, and sorry for the delay. Let me try to understand the issue now.

So, I think that once the job has been placed into the blocked job because of limits_concurrency, the job's priority is being kept, but not the queue order.

Does this still happen when you're using

- queues: [ real_time, background, best_effort ]

as configuration?

pasl commented 1 month ago

Hi, correct.

My configuration is slightly different, but the * is removed.

 - queues: [ real_time, default, background, best_effort ]

And yes, from what I can see, once jobs have been sent to the blocked list because the concurrency rules limited them, queues don't seem to matter in terms of priority. However, job priority is honored.

rosa commented 1 month ago

Oh, I see the problem now, yes. This is correct: only priority is honored when unblocking jobs, because picking the next job to unblock takes only priority into account, not the queue. The queue can't be taken into account because a blocked job doesn't know the worker's queue order; it knows only its own queue and concurrency key, and can only pick a job to unblock based on that concurrency key. Once jobs are unblocked, they're picked from the queues in queue-priority order, but in your case the jobs aren't unblocked in that order, so the net effect is that the queue doesn't matter.
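The effect can be sketched with a small plain-Ruby simulation (again, assumed semantics for illustration, not the actual implementation; names and ids are invented). A blocked job carries only its own queue, priority, and concurrency key, so the next job to unblock is picked by (priority, job_id) within the key, regardless of queue:

```ruby
# A simulated blocked execution knows its queue, priority, job_id and
# concurrency key, but nothing about the worker's configured queue order.
Blocked = Struct.new(:queue, :priority, :job_id, :concurrency_key)

blocked = [
  Blocked.new("best_effort", 0, 10, "pyseer_app"),
  Blocked.new("best_effort", 0, 11, "pyseer_app"),
  Blocked.new("background",  0, 12, "pyseer_app") # "higher" queue, later id
]

# When a slot under "pyseer_app" frees up, the next job to unblock is
# chosen by (priority, job_id) within that key; the queue is never consulted:
next_unblocked = blocked
  .select { |b| b.concurrency_key == "pyseer_app" }
  .min_by { |b| [b.priority, b.job_id] }

puts next_unblocked.job_id # the older best_effort job wins over background
```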

I don't think we can change this without completely changing how concurrency controls work, so I'll document this instead.