EDIT June 6, 2017: See the epic planning issue #149 (this issue, #143, has been revised based on discussion in #149 and earthio issue 16).
Goals:
Improve parallelism of operations within what is now elm.pipeline.Pipeline (and will become earthio.pipeline.Pipeline), not just between Pipeline instances of an ensemble. Currently, elm parallelizes by sending a Pipeline instance to a worker, and that worker runs all preprocessors of the Pipeline's steps. It may be better to express the preprocessors in a dask graph, so that the scheduler can choose how to break up the work.
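
To make the idea concrete, here is a minimal sketch of expressing each preprocessing step as its own node in a dask graph using `dask.delayed`. The functions `load`, `scale`, `shift`, and `fit` are hypothetical stand-ins for a Pipeline's steps, not part of elm or earthio, and the real design may well look different.

```python
import dask
from dask import delayed

# Hypothetical stand-ins for a Pipeline's preprocessors and final estimator.
def load(sample_id):
    return list(range(sample_id, sample_id + 4))

def scale(xs, factor=2):
    return [x * factor for x in xs]

def shift(xs, offset=1):
    return [x + offset for x in xs]

def fit(xs):
    # Placeholder for fitting an estimator: just return the mean.
    return sum(xs) / len(xs)

def build_graph(sample_id):
    # Each step becomes a separate task, so the scheduler sees the
    # individual preprocessors instead of one opaque "run Pipeline" call.
    data = delayed(load)(sample_id)
    scaled = delayed(scale)(data)
    shifted = delayed(shift)(scaled)
    return delayed(fit)(shifted)

# One graph per ensemble member; nothing runs until compute is called.
ensemble = [build_graph(i) for i in range(3)]
results = dask.compute(*ensemble)
```

Because every step is a task in the graph, the scheduler is free to interleave steps from different ensemble members rather than pinning a whole Pipeline to one worker.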