Add a data struct specifically designed to handle a stream of data
A data stream consists of one or more files and/or directories that are grouped together and treated as a single unit
Most typically, a data stream is the result of a specific pod execution (and thus corresponds to a specific PodJob/PodRun)
However, a data stream can also originate from an InputSource
A data stream can also be split, or combined with other data streams (e.g. via a join operation), to yield a different set of data streams
Consequently, a data stream does not necessarily correspond one-to-one with a PodOutput, so a distinct struct is needed to track it
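To make the requirement concrete, here is a minimal Python sketch of what such a struct might look like. It is not an existing API: the names DataStream, Origin, origin_id, join and split are all hypothetical, and the join and split semantics shown (concatenating paths, one stream per path) are only one possible interpretation.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import PurePosixPath
from typing import List, Optional, Tuple


class Origin(Enum):
    """Where a stream came from (hypothetical classification)."""
    POD_JOB = "pod_job"            # result of a single PodJob/PodRun
    INPUT_SOURCE = "input_source"  # read from an InputSource
    DERIVED = "derived"            # produced by splitting or joining streams


@dataclass(frozen=True)
class DataStream:
    """A group of files and/or directories treated as a single unit."""
    paths: Tuple[PurePosixPath, ...]
    origin: Origin
    # Hypothetical identifier of the PodJob or InputSource, if there is one.
    origin_id: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.paths:
            raise ValueError("a data stream needs at least one file or directory")


def join(*streams: DataStream) -> DataStream:
    """Combine several streams into one (assumed semantics: concatenate paths)."""
    paths = tuple(p for s in streams for p in s.paths)
    # The result no longer maps to a single pod output, hence DERIVED.
    return DataStream(paths=paths, origin=Origin.DERIVED)


def split(stream: DataStream) -> List[DataStream]:
    """Split a stream into one stream per path (assumed semantics)."""
    return [DataStream(paths=(p,), origin=Origin.DERIVED) for p in stream.paths]
```

Keeping the origin as a separate field, rather than embedding the stream inside a pod output record, is what lets derived streams exist without a corresponding PodOutput.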
Note that a single PodJob operates on only one data stream. If a Pod node in the pipeline graph receives multiple data streams, one instance of the pod is scheduled (as a PodJob) for each data stream, and these instances run in parallel
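The one-job-per-stream rule could be sketched as follows. This is an illustration only: run_pod_on_streams is a hypothetical helper, a pod is modelled as a plain callable, a stream as a list of paths, and a thread pool stands in for whatever scheduler the pipeline actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # stand-in for a data stream
R = TypeVar("R")  # stand-in for a job result


def run_pod_on_streams(
    pod: Callable[[S], R],
    streams: Sequence[S],
    max_workers: int = 4,
) -> List[R]:
    """Schedule one job per stream; each job sees exactly one stream."""
    if not streams:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() submits one call per stream and returns results in input order.
        return list(pool.map(pod, streams))


def count_entries(stream: List[str]) -> int:
    """Toy pod: report how many paths the stream contains."""
    return len(stream)


results = run_pod_on_streams(count_entries, [["a.csv", "b.csv"], ["images/"]])
```

Because each call receives a single stream, the pod itself never needs to know how many streams reached its node.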