Open rayward opened 7 years ago
While an interesting concept, and I'm by no means expressing disapproval of the feature, the existing behavior is standard in both Ruby Resque (of which this is a port, albeit currently mirroring an early version), and in most other queueing engines available for most languages. Though there likely exceptions, of course.
The way to process jobs in multiple queues at more-or-less equal priorities (why multiple queues at all, then, as the only purpose they serve internally is prioritizing work?) is to run multiple workers (which you seem to already be doing) with different orders for their queue lists (which isn't possible with just *
, of course). Or, in many cases, with just a single queue in each list, so each worker is dedicated to a single queue (though you can still have multiple workers per queue, of course).
And again, this is true of most queueing libraries for most languages. Queues exist, in these cases, to prioritize work.
That said, I still find this approach interesting, and it could be used to add further value to this library. I'd be interested in seeing a Reserver that allows queues to be assigned priorities in addition to names, so that queues can still be prioritized higher than others, but so that multiple queues of equal priority have a roughly equal chance of having jobs processed. That way, you can keep the benefits of splitting jobs across queues for reasons other than prioritizing them (though I'm still not clear on what those are), as well as the benefits of higher/lower priority queues. Add that, and I'd be jumping to recommend a merge.
@danhunsaker thanks for the feedback.
Our infrastructure configuration necessitates a complex worker/queue setup and unfortunately we can't prioritise by just adding more workers for each individual queue (due to the load it would add to other infrastructure components).
I'd be interested in seeing a Reserver that allows queues to be assigned priorities in addition to names
Yeh I also considered that sort of implementation and I agree that it probably would supersede the RandomQueueOrderReserver
. It would however require a more complex environmental configuration of the queues to be able to specify those priorities and I feel that sort of change is a bit outside the scope of this PR. The foundations are here though for others to add those new behaviours.
From a first pass at this - the changes look good, and I can't pick any backwards incompatibility issues we should be aware of.
These changes also address a common message I'm hearing - usually via email from someone using php-resque who doesn't fully understand how dynamic queues work. They'll typically ask my why they have jobs in queues that aren't being processed yet, and it's because they have a queue in their list sooner that has a lot of backed up work. It's pretty simple to say "break that out into its own worker", but there are use cases like we've uncovered where you really do what a single worker (or multiple) to be working a whole list of queues, "fairly" - prioritization of what's in those queues is less important, but sharding and distribution is.
I'm okay with a change like this because it's not breaking any compatibility with Resque itself. The web UI, etc all still work as expected, as does the ability to enqueue a job or process a job in another language runtime. The default reservation behaviour still matches that which Resque guarantees (ordered emptying and FIFO job processing).
I see this as a good start for other possible reservation/prioritization strategies as both of you guys have already mentioned.
I need to put a more thorough set of eyes over this before we bring it down, but I think I want that to be soon.
@rayward: can we get this branch of this deployed into our environment to give it a good beating before I merge this down?
I've also included a fix for autoloading.
Composer prepends autoloaders such that the last autoloader always takes precedence. If APP_INCLUDE
is used and that app itself requires php-resque, then the app's autoloader will result in the classes being loaded from the app rather than from here.
If the version of resque included by the app is older than this version, then many unintended problems can occur, such as:
The change in the last commit causes php-resque's autoaloader to be re-registered so that when Resque_Worker
and other classes are instantiated, they will use the correct definitions.
Also looks like there's a pre-existing issue with hhvm that's causing the travis build to fail - https://github.com/facebook/hhvm/issues/6286
Will see if I can implement a work-around.
@rayward The changes here are too extensive for me to comfortably merge into the new upstream directly. Is there any chance you could open a new PR against resque/php-resque with these changes, so I can get more eyes on it in the new context?
Currently php-resque is capable of reserving jobs in two ways:
BLOCKING=1
is configured in the environment: The redisblpop
command is used to do a blocking wait for a job to be enqueued on the configured queues. The order of processing is the same as above.The behaviour of processing a queue entirely before moving to the next is restrictive. Additionally, if resque is configured to dynamically retrieve the available queues (by specifying
QUEUE=*
), then there is even less capability to prioritise as the retrieved queues are sorted alphabetically.We have resque workers configured to process multiple queues but we consider them all equal priority. There is no way to configure this right now and it means that a single queue can dominate all the workers entirely, ignoring all others until it is processed. We occasionally experience surges where a queue may be loaded with 10's of thousands (or even more) of jobs which be result in a substantial delay on being able to process the other queues.
This PR introduces the concept of a Reserver (via
ReserverInterface
). A reserver exposes areserve()
method which is called by theResque_Worker
. Various behaviours can then be implemented in thereserve()
method, such as to allow more fair queue processing for example.Three Reservers are implemented:
QueueOrderReserver
- This reimplements the default behaviour described above, processing a higher priority queue entirely before moving to the next.BlockListPopReserver
- This reimplements the behaviour that is toggled by specifyingBLOCKING=1
.RandomQueueOrderReserver
- Shuffles the list of queues on each call toreserve()
. This is a simple way to allow an equal amount of processing on all the queues.New and interesting ways of reservation could be implemented in the future such as:
enqueue_time
ReserverInterface
is now a required dependency ofResque_Worker
and is supplied in its constructor. The$blocking
parameter to thework()
andreserve()
methods are removed.A reserver is chosen in
bin/resque
based off the environment configuration:BLOCKING
- A non empty value will select theBlockListPopReserver
RESERVER
- The name of any implemented reserver can be specified in snake case format, eg.queue_order
,random_queue_order
,blocking_list_pop
If neither is specified, the default
QueueOrderReserver
is used.While I've tested this a fair bit myself, it would be great if others could give this some verification also.