grantr opened this issue 9 years ago
Ah ha! I will take a look @mauricio
:+1:
@mauricio Here's what I would do differently:
- `BatchPayload` object that contains multiple payloads. The rest of Qu can remain blissfully unaware of batches (for now).
- `BatchJob` class specifically for handling multiple payloads.

`BatchPayload` is an array wrapper that knows how to dispatch multiple payloads. For standard `Qu::Job` classes it processes each payload one at a time. Payloads for `BatchJob` classes are bundled up and sent to the job as an array. The job implements `each` to iterate through the payload list.
```ruby
class BulkWrite < Qu::BatchJob
  batch_size 50

  def perform
    set_up_batch_write
    each do |arg1, arg2|
      do_something_with_payload(arg1, arg2)
    end
    complete_batch_write
  end
end
```
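To make the dispatch behavior concrete, here is a minimal sketch of the `BatchPayload` idea described above. The `Payload` struct and the `batch?` class predicate are illustrative assumptions, not Qu's actual API:

```ruby
# Hypothetical sketch of BatchPayload: an array wrapper that dispatches
# all payloads to a batch-aware job at once, or one at a time otherwise.
# `Payload` and `batch?` are illustrative names, not Qu's real API.
Payload = Struct.new(:klass, :args)

class BatchPayload
  def initialize(payloads)
    @payloads = payloads # assumed to all target the same job class
  end

  def perform
    klass = @payloads.first.klass
    if klass.respond_to?(:batch?) && klass.batch?
      # Batch-aware job: hand over the whole args array in one perform.
      klass.new(@payloads.map(&:args)).perform
    else
      # Standard job: perform each payload individually.
      @payloads.each { |p| p.klass.new(p.args).perform }
    end
  end
end
```

The key design point is that the fallback path is identical to non-batched processing, so ordinary jobs never need to know batches exist.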
@grantr that would simplify the implementation a lot.
Another problem is, how do you know if a queue produces batch jobs or not?
This was one of the main complications in my implementation: having to pull jobs and push them back to the queue when they are "single" jobs instead of batch jobs. My use case at the time was a single-purpose queue, so I didn't have to worry about this much, but if you're running on top of a general queue this could cause trouble, as clients mix batch and non-batch jobs in the same place.
Why would you implement batch pushes separately?
Seems like a very simple feature to provide given a backend that supports it: you just push many messages at once instead of one at a time.
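That push-side simplicity could be sketched like this; `push_all` is a hypothetical name for a backend's multi-message enqueue, not something Qu is known to define:

```ruby
# Sketch of batch push: use the backend's multi-message enqueue
# (hypothetically named push_all) for one round trip when available,
# otherwise fall back to pushing one message at a time.
def push_batch(backend, payloads)
  if backend.respond_to?(:push_all)
    backend.push_all(payloads)
  else
    payloads.each { |p| backend.push(p) }
  end
end
```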
> how do you know if a queue produces batch jobs or not?
IMO this is the backend's responsibility. The backend can decide whether it will pull payloads in bulk from the queue service. If it so decides, then the `BatchPayload` it creates may contain payloads for multiple jobs. When processing the payloads, it can look at each job to see whether it accepts batches or not. If so, it groups all the like payloads into a single job and performs once. If not, it performs each job individually.
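The grouping step described above could look roughly like this. The `(klass, args)` pairs and the `batch?` predicate are stand-ins for whatever payload representation Qu actually uses:

```ruby
# Hypothetical sketch of the grouping step: payloads pulled in bulk may
# target different job classes. Group by class; batch-aware classes get
# all their args in one perform, the rest are performed individually.
def process_batch(pairs)
  pairs.group_by(&:first).each do |klass, group|
    args_list = group.map(&:last)
    if klass.respond_to?(:batch?) && klass.batch?
      klass.new(args_list).perform   # one job instance, performed once
    else
      args_list.each { |args| klass.new(args).perform }
    end
  end
end
```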
This decouples batch pop from batch process, and keeps the perform logic in the `*Payload` classes. There's never a need to return jobs to the queue, because the fallback is to perform all payloads in sequence as if they were not batched.
Batch push is separate because it doesn't have anything to do with batch processing (IMHO). Consumers don't need to know if producers have batch push, and producers don't need to know if consumers have batch pop.
Some jobs would benefit from being able to process a bunch of payloads at once. Specifically, a job that writes documents to a data store could see a dramatic performance increase by sending 50 or 100 documents per request instead of 1. Since it may not be possible for the producer to enqueue entire batches in a single payload, the queue framework needs to handle popping multiple payloads and collecting them into a batch job.
This could work well with batch pop support in the backend, but doesn't require it. Even if the backend only supports popping one payload at a time, a job might still want a batch of payloads.
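A worker assembling a batch from single pops might look like the following sketch; `pop` and `batch_size` are assumed interfaces for the backend and job, not confirmed Qu method names:

```ruby
# Sketch: the worker assembles a batch even when the backend can only pop
# one payload at a time. A nil from pop (empty queue) ends the batch early;
# batch_size caps how many payloads one job receives.
def collect_batch(backend, batch_size)
  batch = []
  while batch.size < batch_size && (payload = backend.pop)
    batch << payload
  end
  batch
end
```

Note that a short or empty batch is still valid here, so a slow queue degrades gracefully to near-single-payload processing instead of blocking until a full batch arrives.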