Currently, Pipeliner is greedy: as soon as the Router announces a new pipeline, Pipeliner takes it on.
At first glance this approach may seem to yield greater throughput, but on closer analysis it is not a good idea, because:
- Pipeliner's process will crash with a MemoryError exception as soon as it holds enough pipelines to exhaust the maximum amount of memory the operating system allows it;
- Since the list of pipelines inside Pipeliner will be huge, it will slow down Pipeliner's operation and can delay starting new jobs, sending 'finished pipeline' messages, etc.;
- We may also run into problems with huge ZeroMQ queues (think of 1M pipelines finishing at once: the amount of memory used by that queue, and the time Pipeliner will spend doing nothing else but draining it).
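To make the memory concern concrete, here is a minimal sketch of how a dynamic limit could be derived from currently available physical memory. The helper name `dynamic_pipeline_limit` and the per-pipeline cost are assumptions for illustration (the real cost would have to be measured), and `os.sysconf` with these keys is POSIX-specific:

```python
import os

def dynamic_pipeline_limit(per_pipeline_bytes=1024 * 1024, floor=1, ceiling=1000):
    """Hypothetical helper: derive how many pipelines Pipeliner should
    hold at once from the machine's currently free physical memory.

    `per_pipeline_bytes` is an assumed rough cost per tracked pipeline.
    POSIX-only: relies on os.sysconf keys available on Linux.
    """
    page_size = os.sysconf('SC_PAGE_SIZE')
    free_pages = os.sysconf('SC_AVPHYS_PAGES')
    available = page_size * free_pages
    # Use only a fraction of free memory, leaving room for other processes.
    budget = available // 4
    return max(floor, min(ceiling, budget // per_pipeline_bytes))
```

A limit computed this way would shrink under memory pressure instead of letting the pipeline list grow until the process is killed.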
So, the new approach should be like the Broker's: limit the number of pipelines that Pipeliner takes care of at a time. This limit can be fixed or dynamic (dynamic is better, I think) and will probably be based on:
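The limiting approach described above can be sketched roughly as follows. This is not the actual Pipeliner implementation (which talks to the Router over ZeroMQ); the class and method names are hypothetical, and the idea is simply that Pipeliner declines announcements once it is at capacity, so the Router can re-announce later:

```python
class BoundedPipeliner:
    """Sketch of a Pipeliner that caps how many pipelines it manages
    at once, instead of greedily accepting every announcement."""

    def __init__(self, max_active=100):
        self.max_active = max_active
        self.active = set()

    def on_new_pipeline(self, pipeline_id):
        """Accept the announced pipeline only if below the cap.

        Returns True if accepted; False means the Router should keep
        the pipeline and announce it again later.
        """
        if len(self.active) >= self.max_active:
            return False
        self.active.add(pipeline_id)
        return True

    def on_pipeline_finished(self, pipeline_id):
        # Freeing a slot lets the next announcement be accepted.
        self.active.discard(pipeline_id)
```

With a fixed `max_active`, memory use stays bounded regardless of how many pipelines the Router announces; a dynamic variant would recompute the cap periodically from available resources.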