flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.43k stars 581 forks

Creating new spark_sql_task #1417

Open yunhao-qing opened 3 years ago

yunhao-qing commented 3 years ago

Why would this plugin be helpful to the Flyte community

spark_sql_task will allow users to submit a query that will run on Spark. It will not necessarily run on the native Spark Kubernetes operator. It will be similar in structure to the current hive_task or presto_task.

Type of Plugin

Can you help us with the implementation?

welcome[bot] commented 3 years ago

Thank you for opening your first issue here! 🛠

akhurana001 commented 3 years ago

@kumare3 @EngHabu FYI on this. @yunhao-qing wants to explore adding a new spark_sql_task to Flyte. Do we have any good documentation on adding a new plugin that @yunhao-qing can follow?

Also, let us know if there are any questions.

yunhao-qing commented 3 years ago

@kumare3 @EngHabu Could you please provide some suggestions if you have time? Thank you.

kumare3 commented 3 years ago

@yunhao-qing can you please join the Slack channel so we can discuss? Join Here

yunhao-qing commented 3 years ago

@kumare3 After some internal discussion at Lyft, we still want to implement spark_sql_task in flytekit and a mozartSpark plugin in the Lyft backend, similar to mozartPresto/mozartHive. spark_sql_task will take in query_text, Spark config, and some other args. We prefer this approach over using a use_spark flag for hive_task and creating mozart_task.
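To make the proposed shape concrete, here is a minimal sketch of what such a task definition might look like. It is plain Python and does not use the flytekit API. The names SparkSqlTaskSpec, spark_conf, and to_plugin_payload, and the payload layout, are assumptions made for illustration; only spark_sql_task and query_text come from the proposal above, and the "other args" are left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class SparkSqlTaskSpec:
    """Hypothetical container for the inputs named in the proposal."""

    name: str
    query_text: str
    # Assumed representation of "spark config": plain string key/value pairs.
    spark_conf: Dict[str, str] = field(default_factory=dict)


def spark_sql_task(
    name: str,
    query_text: str,
    spark_conf: Optional[Dict[str, str]] = None,
) -> SparkSqlTaskSpec:
    """Build a task spec from a query and an optional Spark config.

    This is an illustrative stand-in, not the flytekit function.
    """
    if not query_text.strip():
        raise ValueError("query_text must not be empty")
    # Copy the config so later changes by the caller do not leak into the spec.
    return SparkSqlTaskSpec(
        name=name,
        query_text=query_text,
        spark_conf=dict(spark_conf or {}),
    )


def to_plugin_payload(spec: SparkSqlTaskSpec) -> Dict[str, Any]:
    """Serialize the spec into a dict a backend plugin could consume.

    The field names here are assumptions, not a real plugin contract.
    """
    return {
        "task_type": "spark_sql",
        "name": spec.name,
        "query": spec.query_text,
        "spark_conf": dict(spec.spark_conf),
    }


if __name__ == "__main__":
    spec = spark_sql_task(
        name="example_query",
        query_text="SELECT 1",
        spark_conf={"spark.executor.memory": "2g"},
    )
    print(to_plugin_payload(spec))
```

Keeping the query and the Spark config together in one spec mirrors the structure described for hive_task and presto_task, where the task carries the query and the backend plugin decides where it runs.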

We want to go with this approach because:

Let me know if you have any concerns.

cc @akhurana001

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

kumare3 commented 9 months ago

There is a base PR in flytekit for this - https://github.com/flyteorg/flytekit/pull/1097. It needs more work. cc @pingsutw please feel free to hand over to anyone

github-actions[bot] commented 4 days ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏