Open yunhao-qing opened 3 years ago
@kumare3 @EngHabu FYI on this. @yunhao-qing wants to explore adding a new spark_sql_task to Flyte. Do we have any good documentation on adding a new plugin that @yunhao-qing can follow?
Also, let us know if there are any questions.
@kumare3 @EngHabu Could you please provide some suggestions if you have time? Thank you.
@yunhao-qing can you please join the Slack channel so we can discuss? Join here.
@kumare3 After some internal discussion at Lyft, we still want to implement spark_sql_task in flytekit and a mozartSpark plugin in the Lyft backend, similar to mozartPresto / mozartHive. spark_sql_task will take in query_text, a Spark config, and some other args. We prefer this approach over using a use_spark flag for hive_task and creating mozart_task.
We want to go with this approach because:
Let me know if you have any concerns.
cc @akhurana001
There is a base PR in flytekit for this - https://github.com/flyteorg/flytekit/pull/1097. It needs more work. cc @pingsutw please feel free to hand over to anyone
Why would this plugin be helpful to the Flyte community
spark_sql_task will allow users to submit a query that will run on Spark. It will not necessarily run on the native Spark Kubernetes operator. It will have a similar structure to the current hive_task or presto_task.
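As a rough illustration of the shape proposed in this thread (a task that takes a query text plus a Spark config), here is a minimal, self-contained Python sketch. It does not use flytekit and is not the actual implementation: the SparkSqlTaskSpec class, the to_custom method, the "spark_sql" type string, and the dictionary layout are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class SparkSqlTaskSpec:
    """Hypothetical container for a Spark SQL task (not a flytekit API)."""

    name: str
    query_text: str
    spark_conf: Dict[str, str] = field(default_factory=dict)

    def to_custom(self) -> Dict[str, object]:
        """Serialize into a plain dict that a backend plugin could consume.

        The key names and the "spark_sql" type string are assumptions.
        """
        if not self.query_text.strip():
            raise ValueError("query_text must not be empty")
        return {
            "task_type": "spark_sql",
            "query": self.query_text,
            # Copy so callers cannot mutate the spec through the result.
            "spark_conf": dict(self.spark_conf),
        }


def spark_sql_task(
    name: str,
    query_text: str,
    spark_conf: Optional[Dict[str, str]] = None,
) -> SparkSqlTaskSpec:
    """Hypothetical factory mirroring the proposed spark_sql_task arguments."""
    return SparkSqlTaskSpec(
        name=name,
        query_text=query_text,
        spark_conf=dict(spark_conf or {}),
    )


if __name__ == "__main__":
    spec = spark_sql_task(
        name="daily_rides",
        query_text="SELECT COUNT(*) FROM rides",
        spark_conf={"spark.executor.memory": "2g"},
    )
    print(spec.to_custom())
```

Keeping the query and Spark config in a plain serializable structure would leave the choice of execution backend (native Spark Kubernetes operator or something else) to the plugin, which matches the intent stated above.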
Type of Plugin
Can you help us with the implementation?