The current parallelization implementation means that (assuming no errors), each serial batch takes as long as the longest parallel sub-step. For example, if our $Step is 1 hour and our $Parallelism is 24, each serial batch is 1 day. If most hour sub-steps are quite small but there's regularly a spike from 12PM-1PM, then the long midday batch will cause all other parallel threads to wait until it has completed before the next serial batch can begin processing.
Implement a materialized queue system such that a thread receiving a batch can immediately proceed to another without waiting on a serial partner.
The current parallelization implementation means that (assuming no errors), each serial batch takes as long as the longest parallel sub-step. For example, if our
$Step
is 1 hour and our$Parallelism
is 24, each serial batch is 1 day. If most hour sub-steps are quite small but there's regularly a spike from 12PM-1PM, then the long midday batch will cause all other parallel threads to wait until it has completed before the next serial batch can begin processing.Implement a materialized queue system such that a thread receiving a batch can immediately proceed to another without waiting on a serial partner.