facebookresearch / spdl

Scalable and Performant Data Loading
BSD 2-Clause "Simplified" License

Support generator in main process incrementally #229

Closed mthrok closed 1 month ago

mthrok commented 1 month ago

Before this change, when a generator function was passed to the pipeline, the pipeline wrapped the function so that no output was propagated until the generator was exhausted.

This was to support running the generator in a subprocess, where neither ProcessPoolExecutor nor asyncio supports returning yielded items one by one.

This restriction is not applicable if running the generator in the main process.

This commit changes the behavior for the case where the generator is executed in the main process, so that yielded items are passed down immediately.
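To illustrate the difference between the two behaviors, here is a minimal sketch in plain Python. It is not SPDL's implementation: `tracked_numbers`, `buffered`, `run_buffered`, `run_incremental`, and `demo` are hypothetical names made up for this example, and the "pipeline" is reduced to a callback that receives each item.

```python
from collections.abc import Callable, Iterator


def tracked_numbers(n: int, log: list[str]) -> Iterator[int]:
    """Toy generator (hypothetical) that records when each item is produced."""
    for i in range(n):
        log.append(f"yield {i}")
        yield i


def buffered(gen_fn: Callable[..., Iterator[int]]) -> Callable[..., list[int]]:
    """Wrap a generator function so that it returns all items at once.

    This mimics the old behavior: a plain function returning a list is
    something an executor can hand back as a single result.
    """
    def wrapper(*args):
        return list(gen_fn(*args))  # exhausts the generator before returning
    return wrapper


def run_buffered(gen_fn, args, sink: Callable[[int], None]) -> None:
    """Old behavior: nothing reaches the sink until the generator is done."""
    for item in buffered(gen_fn)(*args):
        sink(item)


def run_incremental(gen_fn, args, sink: Callable[[int], None]) -> None:
    """New behavior (main process only): each item is passed down as yielded."""
    for item in gen_fn(*args):
        sink(item)


def demo(runner) -> list[str]:
    """Run one of the runners and return the interleaved event log."""
    log: list[str] = []
    runner(tracked_numbers, (2, log), lambda item: log.append(f"recv {item}"))
    return log
```

With the buffered runner, `demo(run_buffered)` records both `yield` events before any `recv` event. With `demo(run_incremental)`, each `recv` follows its `yield` directly, which is the behavior this change enables for generators running in the main process.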

facebook-github-bot commented 1 month ago

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.