project-codeflare / codeflare

Simplifying the definition and execution, scaling and deployment of pipelines on the cloud.
https://codeflare.dev
Apache License 2.0

Data splitter #33

Open raghukiran1224 opened 3 years ago

raghukiran1224 commented 3 years ago

Overview

As a CFP user, I would like to split a dataset (e.g., a NumPy array or pandas DataFrame) into smaller objects that can then be fed into other nodes or pipelines. This is especially useful when we have compute-intensive tasks and would like to parallelize them easily.
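
To make the request concrete, here is a minimal sketch of the kind of row-wise split described above. It is not the CodeFlare utility; the function name `split_dataset` and its `num_splits` parameter are hypothetical, and it relies only on `numpy.array_split` and `DataFrame.iloc`.

```python
import numpy as np
import pandas as pd


def split_dataset(data, num_splits):
    """Split a NumPy array or pandas DataFrame row-wise into num_splits parts.

    Hypothetical helper for illustration only; not part of CodeFlare.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")

    if isinstance(data, pd.DataFrame):
        # Split the row positions into near-equal groups, then slice by position
        # so the original index labels and dtypes are preserved.
        position_groups = np.array_split(np.arange(len(data)), num_splits)
        return [data.iloc[positions] for positions in position_groups]

    # np.array_split tolerates lengths that are not divisible by num_splits;
    # the leading chunks receive one extra row each.
    return np.array_split(np.asarray(data), num_splits)


if __name__ == "__main__":
    array_parts = split_dataset(np.arange(10), 3)
    print([len(part) for part in array_parts])

    frame = pd.DataFrame({"a": range(5), "b": list("vwxyz")})
    frame_parts = split_dataset(frame, 2)
    print([len(part) for part in frame_parts])
```

Each returned part could then be handed to a separate downstream node or task. How the parts are scheduled in parallel is left open here, since that depends on how the splitter is wired in as a node.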

Acceptance Criteria

Questions

Assumptions

Reference

raghukiran1224 commented 2 years ago

The basic utility has been added; exposing it as an actual node needs more work.