liangriyu closed this issue 12 months ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
How to reproduce

When multiple DAG tasks fail simultaneously
I've been unable to reproduce this from that description with a simple DAG:
import pendulum

from airflow.decorators import task
from airflow.models.dag import DAG
from airflow.operators.empty import EmptyOperator

for ix in range(1, 4):
    with DAG(
        f"issue_35144_dag_{ix}",
        start_date=pendulum.datetime(2023, 6, 1, tz="UTC"),
        schedule="@daily",
        catchup=True,
        max_active_runs=16,
        tags=["issue", "35144", f"no: {ix}"],
    ):

        @task
        def div(x):
            # Fail every mapped task instance on purpose (division by zero).
            return x / 0

        div.expand(x=list(range(3))) >> EmptyOperator(task_id="empty", trigger_rule="all_done")
In my case (main branch, MySQL 8.0, ARM) it works without any deadlocks.
Could you provide more details about when it happens? A reproducible DAG example would also be nice. Also, which MySQL version do you use?
I had exactly the same problem: once the number of task instances increases, the scheduler shuts down.
once the task instance increase
Do you mean an increased number of simultaneous Task Instances? If so, what number of TIs are we talking about: 10, 100, 1000?
Any chance of getting a reproducible case? Without one it will be difficult to understand what exactly causes these deadlocks.
Sorry, I couldn't resist:
X: Mom, can we have a solution for those deadlock issues in Airflow on MySQL? Mom: We have a solution at home.
once the task instance increase

You mean increased number of simultaneous Task Instances? If so, what numbers of TI are we talking about? 10-100-1000?

Less than 100.

Any chance to get reproducible cases? Without it will be difficult to understand what exactly is the reason to that deadlocks.

I just installed airflow==2.6.3 and, with fewer than 100 tasks started, the scheduler shut down with "Deadlock found when trying to get lock; try restarting transaction...."
The lock in the DB is as follows (the statement is truncated in the log):

DELETE FROM rendered_task_instance_fields
WHERE rendered_task_instance_fields.dag_id = 'ods_orderdbtotal_hourly'
  AND rendered_task_instance_fields.task_id = 'orderdbtotal_order_detail_promotion_tmp'
  AND ((rendered_task_instance_fields.dag_id, rendered_task_instance_fields.task_id, rendered_task_instance_fields.run_id) NOT IN (
    SELECT anon_1.dag_id, anon_1.task_id, anon_1.run_id
    FROM (
      SELECT DISTINCT rendered_task_instance_fields.dag_id AS dag_id,
                      rendered_task_instance_fields.task_id AS task_id,
                      rendered_task_instance_fields.run_id AS run_id,
                      dag_run.execution_date AS execution_date
      FROM rendered_task_instance_fields
      INNER JOIN dag_run
        ON rendered_task_instance_fields.dag_id = dag_run.dag_id
        AND rendered_task_instance_fields.run_id = dag_run.run_id
      WHERE rendered_task_instance_fields.dag_id = 'ods_orderdbtotal_hourly'
        AND rendered_task_instance_fields.task_id = 'orderdbtotal_order_detail_promot
UPDATE dag_run SET last_scheduling_decision='2023-10-26 10:22:41.409798', updated_at='2023-10-26 10:22:41.599107' WHERE dag_run.id = 253
@langfu54 In your case you could try setting [core] max_num_rendered_ti_fields_per_task to 0. Setting this value to 0 means that the worker/scheduler won't try to clean up the rendered_task_instance_fields table with this query.
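For reference, a minimal sketch of how that option could be set; the airflow.cfg path, and whether you configure it via the file or an environment variable, depend on your deployment:

# airflow.cfg
[core]
# 0 disables the cleanup DELETE on rendered_task_instance_fields quoted above
max_num_rendered_ti_fields_per_task = 0

# Or, equivalently, as an environment variable for the scheduler/worker processes:
# AIRFLOW__CORE__MAX_NUM_RENDERED_TI_FIELDS_PER_TASK=0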
This PR could also potentially make things better: https://github.com/apache/airflow/pull/33527 (should be part of Airflow 2.8), but no guarantee.
Appreciate your help. It looks like Airflow works correctly now.
This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.
This issue has been closed because it has not received a response from the issue author.
Apache Airflow version
Other Airflow 2 version (please specify below)
What happened
What you think should happen instead
No response
How to reproduce
When multiple DAG tasks fail simultaneously
Operating System
centos7
Versions of Apache Airflow Providers
apache-airflow 2.6.3
apache-airflow-providers-celery 3.2.1
apache-airflow-providers-common-sql 1.5.2
apache-airflow-providers-datadog 3.3.1
apache-airflow-providers-ftp 3.4.2
apache-airflow-providers-http 4.4.2
apache-airflow-providers-imap 3.2.2
apache-airflow-providers-mysql 5.1.1
apache-airflow-providers-redis 3.2.1
apache-airflow-providers-sqlite 3.4.2
Deployment
Virtualenv installation
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct