ssec-jhu / dplutils

Distributed (Data) Pipeline Utilities
BSD 3-Clause "New" or "Revised" License

Handle batching and multiple source tasks in stream graph executor #60

Closed amitschang closed 3 weeks ago

amitschang commented 5 months ago

The current behavior of the stream graph executor is to feed source batches one by one to whichever source tasks are ready. This means sources behave differently from a forked branch: a fork sends each batch to both of its outputs, whereas source tasks do not all receive the same inputs.

For input batches that have no particular meaning this is OK, but with the support of data generators (https://github.com/ssec-jhu/dplutils/pull/59) this should be sorted out. This will have implications for batching, particularly in the case where the batch sizes of the source tasks differ.
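
To make the difference concrete, here is a minimal sketch in plain Python. It does not use the dplutils API: `feed_round_robin`, `feed_broadcast`, `rebatch`, and the source names are hypothetical, chosen only to illustrate the two feeding strategies and why differing batch sizes matter.

```python
from itertools import cycle


def feed_round_robin(batches, sources):
    """Hand each batch to exactly one source, cycling through the sources.

    Hypothetical stand-in for the behavior described above, where sources
    do not all see the same inputs.
    """
    received = {name: [] for name in sources}
    for batch, name in zip(batches, cycle(sources)):
        received[name].append(batch)
    return received


def feed_broadcast(batches, sources):
    """Hand every batch to every source, as a forked branch would."""
    received = {name: [] for name in sources}
    for batch in batches:
        for name in sources:
            received[name].append(batch)
    return received


def rebatch(rows, batch_size):
    """Split a flat list of rows into batches of at most batch_size rows."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


batches = [[0, 1], [2, 3], [4, 5]]
sources = ["source_a", "source_b"]  # hypothetical source task names

round_robin = feed_round_robin(batches, sources)
broadcast = feed_broadcast(batches, sources)

# If sources were to share inputs but declare different batch sizes, the same
# rows would have to be regrouped per source (an assumption for illustration).
rows = [row for batch in batches for row in batch]
per_source_batches = {"source_a": rebatch(rows, 4), "source_b": rebatch(rows, 3)}
```

With round-robin feeding, `source_a` and `source_b` each see only part of the input; with broadcast feeding they see identical batches, which then raises the question of how to regroup rows when their batch sizes differ.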