When `ExperimentDataPipe` iterates through cells, the obs_joinids of the cells it yields are not recorded anywhere. This doesn't matter for training, but it is necessary when the same datapipe is used for a forward pass (e.g. when generating embeddings), since each output has to be matched back to its cell. The current `_obs_joinids` field can be used, but:
- It requires shuffling to be off.
- It doesn't work with multiple workers, since they don't process cells in order.
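
To illustrate the kind of behavior being asked for, here is a minimal sketch in plain Python. It is not the `ExperimentDataPipe` API: `iter_cells_with_joinids` and `embed` are hypothetical stand-ins, and the data is made up. The idea is that if each yielded item carries its own obs_joinid, the outputs of a forward pass can be matched back to cells regardless of shuffling or worker ordering.

```python
import random
from typing import Dict, Iterator, List, Tuple


def iter_cells_with_joinids(
    obs_joinids: List[int],
    rows: List[List[float]],
    shuffle: bool = False,
    seed: int = 0,
) -> Iterator[Tuple[int, List[float]]]:
    """Hypothetical stand-in for a datapipe: yields (obs_joinid, row) pairs."""
    order = list(range(len(obs_joinids)))
    if shuffle:
        # Shuffling changes the iteration order, but each row still
        # travels together with its own joinid.
        random.Random(seed).shuffle(order)
    for i in order:
        yield obs_joinids[i], rows[i]


def embed(row: List[float]) -> float:
    """Placeholder for a model forward pass (here: just a sum)."""
    return sum(row)


# Made-up example data: four cells with two features each.
obs_joinids = [10, 11, 12, 13]
rows = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [4.0, 0.5]]

# Key each output by the joinid that arrived with its row, so the
# result does not depend on the order in which cells were processed.
embeddings_by_joinid: Dict[int, float] = {
    joinid: embed(row)
    for joinid, row in iter_cells_with_joinids(obs_joinids, rows, shuffle=True)
}
```

Because the joinid is part of every yielded item, the same mapping comes out whether or not shuffling is enabled, which is the property the `_obs_joinids` field alone cannot provide.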