astronomer / astro-provider-databricks

Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows
Apache License 2.0

Support for pyspark job submit for Databricks Jobs using astro provider #54

Open neerajvash8 opened 12 months ago

neerajvash8 commented 12 months ago

Currently, the DatabricksWorkflowTaskGroup only supports creating notebook tasks via the DatabricksNotebookOperator. While going through Orchestrate Databricks jobs with Apache Airflow, I came across the DatabricksSubmitRunOperator, which can submit PySpark jobs (spark_python_task) directly. Supporting this inside the task group would be really nice functionality, as it would let users take full advantage of the DatabricksWorkflowTaskGroup from Astro and ease the development of clean Airflow DAGs. A rough comparison is sketched below.
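
For context, here is a rough sketch of what works today (notebook tasks inside the workflow task group) versus the PySpark submission that currently has to live outside it. Import paths, parameter names, the cluster spec, and the `dbfs:/` file path are assumptions for illustration, not verified against a specific provider release:

```python
# Hypothetical sketch; import paths and parameters are assumptions.
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator
from astro_databricks import DatabricksNotebookOperator, DatabricksWorkflowTaskGroup

job_cluster_spec = [
    {
        "job_cluster_key": "astro_cluster",
        "new_cluster": {
            "spark_version": "12.2.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
    }
]


@dag(start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def databricks_pyspark_example():
    # What works today: notebook tasks grouped into a single Databricks Workflow.
    with DatabricksWorkflowTaskGroup(
        group_id="workflow",
        databricks_conn_id="databricks_default",
        job_clusters=job_cluster_spec,
    ):
        DatabricksNotebookOperator(
            task_id="notebook_task",
            databricks_conn_id="databricks_default",
            notebook_path="/Shared/my_notebook",
            source="WORKSPACE",
            job_cluster_key="astro_cluster",
        )

    # The ask: a PySpark job like the one below, which today must be submitted
    # with DatabricksSubmitRunOperator outside the task group and so misses the
    # single-workflow benefits of DatabricksWorkflowTaskGroup.
    DatabricksSubmitRunOperator(
        task_id="pyspark_task",
        databricks_conn_id="databricks_default",
        new_cluster={
            "spark_version": "12.2.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        spark_python_task={"python_file": "dbfs:/path/to/job.py"},
    )


databricks_pyspark_example()
```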

There are other ways to implement the above, for example using Databricks Connect V2 (see the sketch below), but they do not integrate with the workflow task group.
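
A minimal sketch of the Databricks Connect V2 route, assuming `databricks-connect>=13` is installed on the Airflow worker and that the host, token, and cluster are configured via the standard `DATABRICKS_*` environment variables; the task and table name below are hypothetical illustrations, not part of this provider:

```python
from airflow.decorators import task


@task
def run_pyspark_remotely():
    # Databricks Connect V2 builds a Spark session whose queries execute on the
    # remote Databricks cluster rather than on the Airflow worker.
    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()
    df = spark.read.table("samples.nyctaxi.trips")  # assumed sample table
    print(df.count())
```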

The ask is related to the discussion on How to Orchestrate Databricks Jobs Using Airflow, where Daniel Imberman (@dimberman) mentioned that this functionality is on the roadmap for this project.