contribsys / faktory

Language-agnostic persistent background job server
https://contribsys.com/faktory/

Is there a way to do fair queueing or round robin with Faktory? #392

Closed brandonweiss closed 1 year ago

brandonweiss commented 2 years ago

I might not be using the right terminology, but “fair queuing” or “round robin” is most often what I hear this model called. The idea is that instead of working a queue FIFO (first in, first out), the queue is partitioned in some way, and then workers rotate through the partitions, one at a time. So for example, the most common partition would probably be a user or organization ID.

The idea with this is to prevent one user or organization from hogging the entire queue. This matters in systems where users can intentionally or unintentionally trigger large amounts of work. For example, consider an exception monitoring service or a test runner. If a flood of exceptions/tests suddenly comes in for one user, and then a second user has a very small number of exceptions/tests, you wouldn’t want the second user to wait minutes or hours for all of the first user’s work to finish before their own work can get done.

Does that make sense? Is there some way to do this currently? Perhaps with a middleware or something? Or is this maybe a feature that might be added natively in the future? Thanks!

mperham commented 2 years ago

You can do this without Faktory knowing anything about it. You'd build a higher-level abstraction of a "fair queue" which maps to 8 queues in Faktory. The client, on each call to fetch a job from queue q, would actually call FETCH q0 q1 q2 q3 q4 q5 q6 q7. Now here's the trick: every time the client calls FETCH it would randomize the order of the 8 queues and every time the client pushes a job to a queue, it would select one of the eight randomly to push the job to. In that way, you should see an even load of PUSHes and FETCHes from the underlying queues.
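The trick above can be sketched in plain Ruby. This is a hypothetical client-side abstraction, not part of Faktory's API: it only shows how the queue names would be chosen; a real implementation would issue the actual PUSH/FETCH commands to the Faktory server with these names.

```ruby
# A client-side "fair queue" that fans out over 8 underlying Faktory queues.
# Pushes pick a random shard; fetches randomize the queue order each time,
# so no shard is systematically favored.
class FairQueue
  def initialize(name, shards: 8)
    @shards = (0...shards).map { |i| "#{name}#{i}" } # => ["q0", "q1", ..., "q7"]
  end

  # The queue a PUSH should target: one shard chosen uniformly at random.
  def push_target
    @shards.sample
  end

  # The argument order for FETCH: a fresh random permutation of all shards.
  def fetch_order
    @shards.shuffle
  end
end

fq = FairQueue.new("q")
puts "PUSH to #{fq.push_target}"
puts "FETCH #{fq.fetch_order.join(' ')}"
```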

faktory_worker_ruby can be told to randomly fetch from a set of queues by giving them all the same weight:

faktory_worker_ruby -q q0,1 -q q1,1 -q q2,1 -q q3,1 ...

faktory_worker_go can do the same via the Manager's ProcessWeightedPriorityQueues API. You'd implement the random queue selection when pushing the job yourself.

brandonweiss commented 2 years ago

Thanks for getting back to me! Okay, hmm, let me just make sure I’m understanding this correctly.

So hypothetically if you had 10 queues and Big Customer comes along and enqueues 100 jobs, each of which takes a minute, then each queue would have approximately 10 jobs in it. And then immediately Small Customer comes along and enqueues 1 job, leaving one queue with 11 jobs in it. If you had 10 workers, in theory Small Customer would have to wait 10 minutes before their job runs, right?

Now if there was 1 queue with 101 jobs in it and 10 workers, then Small Customer would still have to wait 10 minutes, right?
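The arithmetic in both scenarios can be checked with a toy model (my own simplification, not Faktory itself): jobs take 1 minute each, queues are FIFO, and 10 workers process 10 jobs per minute overall.

```ruby
# Scenario 1: 10 queues, Big Customer's 100 jobs spread evenly (10 per
# queue), then Small Customer's job lands at the back of one queue. With
# one worker per queue, that worker must finish 10 jobs first.
sharded_queue = Array.new(10, :big) + [:small]
sharded_wait  = sharded_queue.index(:small) # minutes before the job starts
puts sharded_wait                           # => 10

# Scenario 2: 1 queue with 101 jobs and 10 workers draining ~10 jobs per
# minute. The small job is in position 101, so it starts around minute 10.
single_wait = (101 - 1) / 10
puts single_wait                            # => 10
```

Either way the small job waits about 10 minutes, which is the point of the question.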

Am I thinking about this correctly or have I completely misunderstood? It seems like making multiple queues and randomizing the enqueues/dequeues would still allow one customer to potentially hog the system?

mperham commented 2 years ago

Correct. Read this for one solution:

https://www.mikeperham.com/2019/12/17/workload-isolation-with-queue-sharding/
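The post describes isolating workloads by sharding queues. One minimal sketch of such a scheme (the shard count, prefix, and helper name here are illustrative assumptions, not taken from the post): route each customer's jobs to a shard derived from their ID, so a flood from one customer only backs up that customer's shard while workers drain all shards with equal weight.

```ruby
require "zlib"

SHARDS = 16

# Deterministically map a customer to one of SHARDS queues by hashing
# their ID. All of a customer's jobs land in the same shard.
def queue_for(customer_id)
  "work_#{Zlib.crc32(customer_id.to_s) % SHARDS}"
end

puts queue_for(42)
```

Workers would then fetch from all 16 shards with equal weight (e.g. `-q work_0,1 -q work_1,1 ...`), so a single customer can monopolize at most roughly 1/16 of capacity.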

brandonweiss commented 2 years ago

Ah, that might work! Thanks!