astronomer / astro-provider-databricks

Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows
Apache License 2.0

New Feature Request: Handling retry at the notebook level #65

Open RafaelCartenet opened 4 months ago

RafaelCartenet commented 4 months ago

Hey, I've been using the library for a while now and love it, thanks for the good work. I'm trying to implement retries at the notebook level. Databricks has parameters for this that you can change in the UI, and they appear like this in the JSON:

{
  "task_key": "databricks_lol__champion_builds__champion_builds_gold_0",
  ...
  "max_retries": 2,
  "min_retry_interval_millis": 60000,
  "retry_on_timeout": true,
  "timeout_seconds": 1200
}

I'm unable to set those parameters with astro_databricks.

Thanks in advance

tatiana commented 2 months ago

Hi, @RafaelCartenet, this is a handy feature. We could also have this associated with the Airflow task retry, so Airflow could automatically retry tasks on failure. Would you be interested in contributing to the project?
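To make the idea of tying the two retry mechanisms together concrete, here is a minimal sketch of how Airflow-style retry settings might be translated into the Databricks task fields quoted above. The helper name `databricks_retry_settings` and the `example_notebook_task` key are hypothetical and not part of astro_databricks or Airflow; only the four Databricks field names come from the JSON in this issue, and the argument names simply mirror Airflow's `retries`, `retry_delay`, and `execution_timeout` task arguments.

```python
from datetime import timedelta
from typing import Any, Dict, Optional


def databricks_retry_settings(
    retries: int,
    retry_delay: timedelta,
    execution_timeout: Optional[timedelta] = None,
    retry_on_timeout: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: map Airflow-style retry settings to Databricks task fields."""
    settings: Dict[str, Any] = {
        "max_retries": retries,
        # Databricks expects the retry interval in milliseconds.
        "min_retry_interval_millis": int(retry_delay.total_seconds() * 1000),
        "retry_on_timeout": retry_on_timeout,
    }
    # Only emit a timeout when one was actually configured.
    if execution_timeout is not None:
        settings["timeout_seconds"] = int(execution_timeout.total_seconds())
    return settings


# Merge the retry fields into an (assumed) task definition dictionary.
task_json: Dict[str, Any] = {"task_key": "example_notebook_task"}
task_json.update(
    databricks_retry_settings(
        retries=2,
        retry_delay=timedelta(minutes=1),
        execution_timeout=timedelta(minutes=20),
    )
)
```

With these inputs, the resulting dictionary carries the same retry values as the JSON in the original request. Whether the provider should merge such fields into the workflow task definition, or leave retries to Airflow alone, is a design decision this sketch does not settle.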

RafaelCartenet commented 1 month ago

Hey Tatiana, I could eventually. Could you please let me know what it takes to contribute? Also, I'm curious about the current status of this project. It's really promising, though it doesn't look like it's being updated very often. Is Astronomer still planning to maintain it long term?

Best

Rafael


tatiana commented 1 month ago

Hi @RafaelCartenet !

Those are great questions. Astronomer is contributing this provider to the Apache Airflow Databricks provider so it can be part of a larger community - both users and contributors.

These are the steps:

  1. Contribute DatabricksNotebookOperator to Airflow Databricks provider ✅ (https://github.com/apache/airflow/pull/39178)
  2. Contribute DatabricksWorkflowTaskGroup to Airflow Databricks provider 🚧
  3. Contribute DatabricksTaskOperator to Airflow Databricks provider
  4. Contribute the Astro Provider Databricks plugin to Airflow Databricks provider
  5. Add deferrable support for donated Databricks operators 🚧 (https://github.com/apache/airflow/pull/39295)
  6. Mark deprecation for astro-provider-databricks repo

I expect the feature you requested to be developed and contributed after the migration.