PowerDataHub / terraform-aws-airflow

Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker with CeleryExecutor
Apache License 2.0
84 stars 40 forks source link

Scheduler number of runs #35

Closed jozsi closed 4 years ago

jozsi commented 4 years ago

Hi,

I noticed that the scheduler has number of runs set to 10.

What was the reason for this? To pick up new DAG files? Cause for that, as far as I understood, dag_dir_list_interval does that automatically every 300 seconds by default.

edbizarro commented 4 years ago

This flag makes the scheduler process restart after n runs to avoid weird behaviors

jozsi commented 4 years ago

What are those weird behaviors? Do they still happen with the latest version of Airflow?

edbizarro commented 4 years ago

@jozsi scheduler hanging and not scheduling new tasks, but I don't know If this still happen with the current version

edbizarro commented 4 years ago

Ref: https://issues.apache.org/jira/plugins/servlet/mobile#issue/AIRFLOW-401

jozsi commented 4 years ago

Ouch, I see a recent comment reporting that the bug still exists in v1.10.4 (dated august 2019; latest is v1.10.7), 3+ years since it has been reported. Hence the workaround is still valid, the only minor drawback could be that it fills up the logs.