apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Getting error when `retries` is set to None at task level #42273

Open Locustv2 opened 1 week ago

Locustv2 commented 1 week ago

Apache Airflow version

2.10.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

I have a DAG configured with retries = 10. One of my custom components (a task) should normally use the DAG's retries setting, but in some cases the value needs to be different, so the component's retries parameter defaults to None.

When the default value is used and I try to clear the task from the UI, I get an error:

webserver  | [2024-09-14T18:34:43.338+0000] {app.py:1744} ERROR - Exception on /clear [POST]
webserver  | Traceback (most recent call last):
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", line 2529, in wsgi_app
webserver  | response = self.full_dispatch_request()
webserver  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", line 1825, in full_dispatch_request
webserver  | rv = self.handle_user_exception(e)
webserver  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", line 1823, in full_dispatch_request
webserver  | rv = self.dispatch_request()
webserver  | ^^^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/flask/app.py", line 1799, in dispatch_request
webserver  | return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
webserver  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/www/auth.py", line 250, in decorated
webserver  | return _has_access(
webserver  | ^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/www/auth.py", line 163, in _has_access
webserver  | return func(*args, **kwargs)
webserver  | ^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/www/decorators.py", line 159, in wrapper
webserver  | return f(*args, **kwargs)
webserver  | ^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 97, in wrapper
webserver  | return func(*args, session=session, **kwargs)
webserver  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/www/views.py", line 2467, in clear
webserver  | response = self._clear_dag_tis(
webserver  | ^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/www/views.py", line 2343, in _clear_dag_tis
webserver  | count = dag.clear(
webserver  | ^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 94, in wrapper
webserver  | return func(*args, **kwargs)
webserver  | ^^^^^^^^^^^^^^^^^^^^^
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/dag.py", line 2496, in clear
webserver  | clear_task_instances(
webserver  | File "/home/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 472, in clear_task_instances
webserver  | ti.max_tries = ti.try_number + task.retries
webserver  | ~~~~~~~~~~~~~~^~~~~~~~~~~~~~
webserver  | TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

What you think should happen instead?

Ideally, if None is used at the task level, the DAG's default configuration should be used; when a value is provided, it should override the DAG config.
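The fallback described above could be sketched in plain Python as follows. This only illustrates the requested behaviour and is not Airflow code: `resolve_retries`, its parameters, and the `fallback` value are hypothetical names chosen for the example.

```python
def resolve_retries(task_retries, dag_default_args, fallback=0):
    """Hypothetical helper: pick the retries value a task should end up with.

    An explicit task-level value wins; None means "use the DAG-level default".
    """
    if task_retries is not None:
        return task_retries
    dag_value = dag_default_args.get("retries")
    # If the DAG has no usable default either, fall back to a plain integer
    # so that later arithmetic on the value cannot hit None.
    return dag_value if dag_value is not None else fallback


dag_default_args = {"retries": 10}  # stand-in for the DAG-level configuration

inherited = resolve_retries(None, dag_default_args)  # task left at None
overridden = resolve_retries(3, dag_default_args)    # task sets its own value
```

Here `inherited` takes the DAG value and `overridden` keeps the task value, which is the behaviour asked for in this section.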

How to reproduce

Operating System

MacOS Sonoma

Versions of Apache Airflow Providers

No response

Deployment

Other

Deployment details

No response

Anything else?

No response

jscheffl commented 1 week ago

Reference to Slack discussion https://apache-airflow.slack.com/archives/CCR6P6JRL/p1726478413720159 Thanks for posting.

Just to fully understand this: Does this apply to Mapped Tasks or "Normal" Tasks in a DAG? Does "default DAG configuration" refer to with DAG(... default_args={"retries": None}...): or which defaults are you referring to? Can you maybe post a snippet of your DAG to ensure I understand this right?

Note for fixing: I assume we need to strengthen the DAG validation during parsing to ensure retries=None is not parseable. Seems this is not properly validated.
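For illustration, a stricter check of this kind might look like the sketch below. It is plain Python, not the actual Airflow validation and not the code from any PR; `validate_retries` and its error messages are hypothetical.

```python
def validate_retries(retries):
    """Hypothetical parse-time check that rejects retries=None early."""
    if retries is None:
        raise ValueError(
            "retries must be an integer, got None; use retries=0 to disable retries"
        )
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(retries, bool) or not isinstance(retries, int):
        raise ValueError(f"retries must be an integer, got {type(retries).__name__}")
    if retries < 0:
        raise ValueError(f"retries must be >= 0, got {retries}")
    return retries


checked = validate_retries(10)  # a valid value passes through unchanged
```

Failing at parse time with a clear message would surface the problem when the DAG is loaded rather than later, when a task is cleared from the UI.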

PApostol commented 1 week ago

I was facing the same issue, and the solution was to use retries=0 instead of retries=None. Because when Airflow tries to execute ti.max_tries = ti.try_number + task.retries, the latter is None, hence the error. Now, whether that is by design or by accident, I'm not sure!
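A minimal reproduction of that arithmetic, using stand-in variables instead of real Airflow objects (the names `try_number` and `retries` mirror the attributes in the traceback, but the values are made up):

```python
try_number = 1   # stand-in for ti.try_number
retries = None   # stand-in for task.retries when the task passes None

try:
    max_tries = try_number + retries
except TypeError as exc:
    # Same failure as the last frame of the traceback in this issue.
    error_message = str(exc)

# Workaround described above: use 0 instead of None.
retries = 0
max_tries = try_number + retries
```

With `retries = 0` the addition succeeds, which is why the workaround avoids the error on clear.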

sonu4578 commented 1 week ago

@jscheffl Hi Jens, could you please assign the issue to me? I will implement the required changes to improve the DAG validation during parsing and ensure that retries=None is not parseable.

sonu4578 commented 1 day ago

Created a draft PR: https://github.com/apache/airflow/pull/42532

I will verify the new validation checks and then send the PR for review.