What happened
With the retry_from_failure=True flag set, each run only executes the models that failed in the previous run, which is fine. However, if there is an error in one model that can't be resolved (e.g., due to a data source issue), the flag prevents the other models from being refreshed, even in subsequent scheduled runs.
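The effect can be illustrated with a small, self-contained toy model. It does not call dbt Cloud or Airflow. The model names and the `simulate_runs` helper are hypothetical, and the rerun rule is a simplification of the behavior described above (a run that follows a failed run only re-executes the models that failed).

```python
def simulate_runs(models, broken, n_runs, retry_from_failure):
    """Return, for each scheduled run, the list of models that were executed.

    Toy rule (an assumption, not the provider's code): when the flag is set
    and the previous run had failures, only those failed models are executed.
    """
    history = []
    previously_failed = []
    for _ in range(n_runs):
        if retry_from_failure and previously_failed:
            to_execute = list(previously_failed)
        else:
            to_execute = list(models)
        history.append(to_execute)
        # Models in `broken` fail every time they are executed.
        previously_failed = [m for m in to_execute if m in broken]
    return history


models = ["orders", "customers", "payments"]  # hypothetical model names
broken = {"payments"}  # permanently failing, e.g. a data source issue

with_flag = simulate_runs(models, broken, n_runs=3, retry_from_failure=True)
without_flag = simulate_runs(models, broken, n_runs=3, retry_from_failure=False)

print(with_flag)
print(without_flag)
```

In this toy model, once `payments` fails, every later scheduled run with the flag set executes only `payments`, so `orders` and `customers` are never refreshed again.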
What you think should happen instead
I think retry_from_failure should apply only to task reruns. Two possible improvements:
1. Add a condition so that {account_id}/jobs/{job_id}/rerun/ is triggered only during task reruns, e.g. by changing line 463 in providers/dbt/cloud/hooks/dbt.py from
   if retry_from_failure:
   to something like
   if retry_from_failure and context['task_instance'].try_number != 1:
2. A more general solution: replace the flag with a parameter that accepts one of several values, e.g.
   retry_from_failure = ["Never", "Rerun", "Always"]
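Both proposals can be sketched together as a small decision function. This is only an illustration: the `RetryFromFailure` enum and the `should_use_rerun_endpoint` helper are hypothetical names, not part of the dbt Cloud provider, and the sketch assumes that `try_number` is 1 on a task's first attempt, as the suggested condition implies.

```python
from enum import Enum


class RetryFromFailure(Enum):
    """Hypothetical three-valued replacement for the boolean flag."""

    NEVER = "Never"    # always trigger a normal job run
    RERUN = "Rerun"    # use the rerun endpoint only when the task is retried
    ALWAYS = "Always"  # the behavior reported for retry_from_failure=True


def should_use_rerun_endpoint(mode: RetryFromFailure, try_number: int) -> bool:
    """Decide whether the rerun endpoint should be used for this attempt.

    Assumes try_number == 1 means the first attempt of the task instance.
    """
    if mode is RetryFromFailure.ALWAYS:
        return True
    if mode is RetryFromFailure.RERUN:
        return try_number > 1
    return False


# First attempt of a scheduled run: only "Always" uses the rerun endpoint.
print(should_use_rerun_endpoint(RetryFromFailure.RERUN, try_number=1))
# Second attempt (a task retry): "Rerun" now uses it as well.
print(should_use_rerun_endpoint(RetryFromFailure.RERUN, try_number=2))
```

With "Rerun" as the mode, a fresh scheduled run would always refresh every model, while a retry of a failed task would still resume from the point of failure.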
How to reproduce
No specific steps are needed to reproduce the issue. The flag takes effect on every run, but it should likely only affect reruns.
Apache Airflow Provider(s)
dbt-cloud
Versions of Apache Airflow Providers
Astronomer Runtime 12.1.0 based on Airflow 2.10.1+astro.1 Git Version: .release:7a1ffe6438b5ea8fcf75c4e5a356a6c23ab18404
Apache Airflow version
Airflow 2.10.1+astro.1
Operating System
Debian GNU/Linux 12 (bookworm)
Deployment
Astronomer
Deployment details
pure:
https://github.com/dbt-labs/airflow-dbt-cloud
Anything else
No response