Open sebhoerl opened 4 years ago
I managed that in one of my forked branches: https://github.com/eqasim-org/synpp/compare/develop...ainar:synpp:multiprocessing
I wrapped each stage execution in a `multiprocessing.Process`. Each process is launched as soon as enough resources are available and all of its dependencies have finished executing.
In my fork, the total amount of resources and the amount each stage needs are not configurable yet; that still has to be added.
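To make the idea concrete, here is a minimal sketch of that scheduling loop. It is not the code from the fork: the names `ready_stages` and `run_pipeline`, the dictionaries passed in, and the modelling of resources as a single integer are all assumptions made for illustration.

```python
import multiprocessing as mp
import time


def ready_stages(pending, done, requirements, dependencies, free):
    """Pick pending stages whose dependencies are done and that fit into `free`."""
    selected = []
    for name in pending:
        deps_done = all(dep in done for dep in dependencies[name])
        if deps_done and requirements[name] <= free:
            selected.append(name)
            free -= requirements[name]  # reserve resources for this stage
    return selected


def run_pipeline(stages, dependencies, requirements, total):
    """Run each stage (a callable) in its own Process, respecting the budget.

    Hypothetical interface: `stages` maps names to callables, `dependencies`
    maps names to lists of names, `requirements` maps names to integers.
    """
    pending, running, done = list(stages), {}, []
    free = total
    while pending or running:
        # Start everything that is currently allowed to run.
        for name in ready_stages(pending, done, requirements, dependencies, free):
            process = mp.Process(target=stages[name])
            process.start()
            running[name] = process
            pending.remove(name)
            free -= requirements[name]
        if pending and not running:
            # Nothing runs and nothing can start: the budget is too small
            # or the dependencies can never be satisfied.
            raise RuntimeError("Cannot schedule stages: %s" % pending)
        # Collect finished stages and give their resources back.
        for name, process in list(running.items()):
            if not process.is_alive():
                process.join()
                if process.exitcode != 0:
                    raise RuntimeError("Stage %s failed" % name)
                del running[name]
                done.append(name)
                free += requirements[name]
        time.sleep(0.05)  # simple polling; good enough for a sketch
    return done
```

The selection logic is kept in a separate pure function so it can be tested without starting any process. When calling `run_pipeline` from a script, the call should sit under an `if __name__ == "__main__":` guard so that it also works with the `spawn` start method.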
Often, the tree structure of a pipeline allows stages to run in parallel. Right now, the pipeline runs one stage at a time. To make use of parallel computing power, a couple of steps are necessary:
1) Let the user define resource availability via configuration, e.g.
2) Let the user define resource requirements, e.g.
3) Run stages in parallel in the pipeline. There is a caveat: we cannot start slave processes from within slave processes. This means that if a stage makes use of the `parallel()` context, it must not already be running in a child process! Therefore, we need to put some thought into intelligent management of the process pool. (In particular, it would need to be managed centrally by the pipeline instead of per `ParallelMasterContext` object.)
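As a rough illustration of the caveat in step 3: in Python's `multiprocessing`, a daemonic process (which is what `Pool` workers are) is not allowed to create child processes. Assuming that is the restriction meant here, a stage-level helper could check for it and fall back to serial execution. The names `can_spawn_children` and `parallel_map` are hypothetical and not part of synpp; this is only a guard, not the central pool management described above.

```python
import multiprocessing as mp


def can_spawn_children():
    """Daemonic processes (e.g. Pool workers) may not start children."""
    return not mp.current_process().daemon


def parallel_map(func, items, processes):
    """Map `func` over `items`, using a pool only where that is allowed."""
    if processes > 1 and can_spawn_children():
        with mp.Pool(processes) as pool:
            return pool.map(func, items)
    # Fallback: already inside a daemonic worker, or only one process requested.
    return [func(item) for item in items]
```

With a centrally managed pool, the `processes` argument would come from the pipeline's resource budget rather than being chosen by each stage on its own.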