Open jiananyim opened 3 weeks ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
Have you tried separating the DAG parsing process from the schedulers? If you run 5 scheduler instances, it could indeed be that multiple schedulers are parsing the same DAG files in parallel. See https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#standalone-dag-processor
This also reduces load on the schedulers, so running one standalone DAG processor might also let you reduce the number of parallel schedulers. We have not seen more scheduling throughput in our setup when more than 3 schedulers are running.
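For reference, a minimal sketch of how that option could be switched on through an environment variable. It relies on Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention for configuration overrides; the helper `airflow_env_var` is a hypothetical name used only for illustration. It assumes a self-managed deployment where you can start the separate `airflow dag-processor` process yourself, which may not be possible on a managed service such as MWAA.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Override for the [scheduler] standalone_dag_processor option.
# With this set, the schedulers are expected to stop parsing DAG files
# themselves, and a separate `airflow dag-processor` process has to be
# started to do the parsing instead.
overrides = {
    airflow_env_var("scheduler", "standalone_dag_processor"): "True",
}

for name, value in overrides.items():
    print(f"{name}={value}")
```

The same setting can go into `airflow.cfg` under the `[scheduler]` section instead of the environment.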
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.8.1
What happened?
Hello,
We recently encountered the following error, and we can confirm that the DAG viewing_station_grid_data_analysis only appears once in our setup.
We have 5 schedulers running. For DAG generation, we have a dynamically generated DAG file that can produce over a hundred DAGs with fixed names.
It appears that these 5 schedulers are writing to the database simultaneously, causing contention. Therefore, I would like to understand the locking mechanism in Airflow and the writing mechanism for dynamic DAGs.
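To make the kind of contention I suspect concrete, here is a toy sketch. It is not Airflow's code and makes no claim about how Airflow actually writes DAG rows: it uses an in-memory SQLite table named `dag` (a stand-in chosen for illustration) and two hypothetical writers that both check for a row before either one inserts it. The second insert then fails on the unique key, while an idempotent insert does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table with a unique key on dag_id (an assumption for this sketch).
conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY)")

dag_id = "viewing_station_grid_data_analysis"


def is_missing(dag_id: str) -> bool:
    """Return True if no row with this dag_id exists yet."""
    row = conn.execute("SELECT 1 FROM dag WHERE dag_id = ?", (dag_id,)).fetchone()
    return row is None


# Both hypothetical writers run their check before either has inserted,
# so both conclude that the row is missing.
writer_a_sees_missing = is_missing(dag_id)
writer_b_sees_missing = is_missing(dag_id)

errors = []
for writer, sees_missing in (("a", writer_a_sees_missing), ("b", writer_b_sees_missing)):
    if sees_missing:
        try:
            conn.execute("INSERT INTO dag (dag_id) VALUES (?)", (dag_id,))
        except sqlite3.IntegrityError as exc:
            # The second writer hits the unique constraint.
            errors.append((writer, str(exc)))

# An idempotent insert tolerates the row already being there.
conn.execute("INSERT OR IGNORE INTO dag (dag_id) VALUES (?)", (dag_id,))
row_count = conn.execute("SELECT COUNT(*) FROM dag").fetchone()[0]

print(errors)
print(row_count)
```

If something similar happens between schedulers, the DAG would still exist only once, yet one of the writers would report a duplicate key.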
Thank you very much!
What you think should happen instead?
Airflow should process the DAG only once, without raising a duplicate key error.
How to reproduce
Use dynamic DAG generation with a large number of DAGs.
Operating System
Amazon MWAA
Versions of Apache Airflow Providers
No response
Deployment
Amazon (AWS) MWAA
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct