ml6team / fondant

Production-ready data processing made easy and shareable
https://fondant.ai/en/stable/
Apache License 2.0

Split dataframe to write different subsets #51

Closed PhilippeMoussalli closed 1 year ago

PhilippeMoussalli commented 1 year ago

Currently, the user returns a single Fondant dataframe when loading or transforming data. This dataframe then needs to be split into multiple subsets, as defined in the component specification, with each subset written to a separate location.

This task covers implementing that split so that it is efficient and correct, with each subset written to its designated location.
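A minimal sketch of the column selection behind such a split, in plain Python and not Fondant's actual implementation. It assumes, purely for illustration, that every column is named `<subset>_<field>` and that the component specification lists the fields of each subset; `SPEC` and `split_columns` are hypothetical names.

```python
from typing import Dict, List

# Hypothetical stand-in for a component specification: subset name -> field names.
SPEC: Dict[str, List[str]] = {
    "images": ["data", "width"],
    "captions": ["text"],
}


def split_columns(columns: List[str], spec: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each subset to the dataframe columns that belong to it."""
    subsets: Dict[str, List[str]] = {}
    for subset, fields in spec.items():
        # Assumed naming convention: "<subset>_<field>".
        expected = [f"{subset}_{field}" for field in fields]
        missing = [column for column in expected if column not in columns]
        if missing:
            raise ValueError(f"Columns missing for subset '{subset}': {missing}")
        subsets[subset] = expected
    return subsets


# Columns of the single dataframe returned by the user.
columns = ["id", "images_data", "images_width", "captions_text"]
subset_columns = split_columns(columns, SPEC)
```

With a pandas or Dask dataframe `df`, selecting `df[subset_columns["images"]]` would then give the part to write to the location of the `images` subset.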

PhilippeMoussalli commented 1 year ago

Note for the implementation: include many computations in each compute call, as the Dask best practices recommend:

https://docs.dask.org/en/stable/delayed-best-practices.html#:~:text=Compute%20on%20lots%20of%20computation%20at%20once%C2%B6
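The Dask recommendation linked above can be sketched as follows: build one lazy write task per subset, then pass all of them to a single `dask.compute` call instead of calling `.compute()` on each task in a loop. This is a sketch under assumptions, not Fondant's code: `write_subset`, the sample `subsets`, and the CSV output are hypothetical placeholders for the real subsets and storage format.

```python
import csv
import tempfile
from pathlib import Path

import dask
from dask import delayed


@delayed
def write_subset(rows, path):
    """Write one subset to its own CSV file and return the path."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return str(path)


# Hypothetical subsets; in practice these would be parts of the dataframe.
subsets = {
    "images": [["id", "width"], [0, 640], [1, 800]],
    "captions": [["id", "text"], [0, "a cat"], [1, "a dog"]],
}

out_dir = Path(tempfile.mkdtemp())

# Build all write tasks lazily; nothing is executed yet.
tasks = [write_subset(rows, out_dir / f"{name}.csv") for name, rows in subsets.items()]

# One compute call for all tasks, so Dask can schedule them together.
written = dask.compute(*tasks)
```

`dask.compute` returns a tuple with one result per task, here the paths of the written files.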
