flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.42k stars, 581 forks

[Core feature] Skyplane flytekit plugin #3005

Open pingsutw opened 1 year ago

pingsutw commented 1 year ago

Motivation: Why do you think this is important?

Skyplane makes transferring data much faster and cheaper. Currently, we use awscli to upload and download data from S3, and awscli does not perform well on I/O. We can probably reduce I/O overhead if we replace awscli with Skyplane.

We could first add a Skyplane flytekit plugin that extends DataPersistence to implement a new persistence plugin. Then we should benchmark it to see how much time we can save on I/O.
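To make the plugin idea concrete, here is a rough sketch of the get/put shape such a persistence plugin might take. It is not a working flytekit plugin: `SkyplanePersistenceSketch`, `build_command`, and the injectable `runner` are hypothetical names, the class deliberately does not subclass flytekit's `DataPersistence`, and it assumes the Skyplane CLI is invoked as `skyplane cp [-r] SRC DST`. Whether Skyplane handles local-to-S3 paths efficiently is something the benchmark would need to confirm.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical runner type: receives an argv list and executes it.
Runner = Callable[[Sequence[str]], None]


def _run(argv: Sequence[str]) -> None:
    # Raises subprocess.CalledProcessError if the CLI exits non-zero.
    subprocess.run(list(argv), check=True)


class SkyplanePersistenceSketch:
    """Illustrative stand-in for a persistence plugin.

    It does not subclass flytekit's DataPersistence; the method names only
    mirror the upload/download roles described in this issue.
    """

    def __init__(self, runner: Runner = _run) -> None:
        # The runner is injectable so the command can be inspected in tests
        # without Skyplane being installed.
        self._runner = runner

    @staticmethod
    def build_command(src: str, dst: str, recursive: bool = False) -> List[str]:
        # Assumption: the CLI form is `skyplane cp [-r] SRC DST`.
        cmd = ["skyplane", "cp"]
        if recursive:
            cmd.append("-r")
        cmd += [src, dst]
        return cmd

    def get(self, from_path: str, to_path: str, recursive: bool = False) -> None:
        # Download: remote object store path -> local path.
        self._runner(self.build_command(from_path, to_path, recursive))

    def put(self, from_path: str, to_path: str, recursive: bool = False) -> None:
        # Upload: local path -> remote object store path.
        self._runner(self.build_command(from_path, to_path, recursive))
```

A real plugin would additionally need to be registered with flytekit for the `s3://` protocol so that it replaces the awscli-based default; that wiring is omitted here because it depends on the flytekit version.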

Goal: What should the final outcome look like, ideally?

Use Skyplane to upload and download Flyte literals by default when the Skyplane plugin is installed.

Describe alternatives you've considered

Use awscli, which is what we already have now.

Propose: Link/Inline OR Additional context

Skyplane: 110x faster data transfers on any cloud

kumare3 commented 1 year ago

The only catch is that Skyplane seems to need to launch a machine.

wild-endeavor commented 1 year ago

We can explore this more this quarter as we work on the data story, but I think arrowfs should already be a lot better. I'll spend some time playing around with this after that work is done.

github-actions[bot] commented 11 months ago

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 11 months ago

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 2 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏