apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Scheduler gets permission error when running os.stat on dag file #24687

Closed rcwoolston closed 2 years ago

rcwoolston commented 2 years ago

Apache Airflow version

2.3.2 (latest released)

What happened

The scheduler gets a PermissionError when running the DAG file processor, with the following stack trace:

Process DagFileProcessor41434-Process:
Traceback (most recent call last):
  File "/opt/conda_envs/airflow/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/conda_envs/airflow/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/dag_processing/processor.py", line 155, in _run_file_processor
    result: Tuple[int, int] = dag_file_processor.process_file(
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/dag_processing/processor.py", line 660, in process_file
    dagbag.sync_to_db()
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dagbag.py", line 615, in sync_to_db
    for attempt in run_with_db_retries(logger=self.log):
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __iter__
    do = self.iter(retry_state=retry_state)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/tenacity/__init__.py", line 349, in iter
    return fut.result()
  File "/opt/conda_envs/airflow/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/opt/conda_envs/airflow/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dagbag.py", line 629, in sync_to_db
    DAG.bulk_write_to_db(self.dags.values(), session=session)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dag.py", line 2470, in bulk_write_to_db
    DagCode.bulk_sync_to_db(filelocs, session=session)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dagcode.py", line 114, in bulk_sync_to_db
    os.path.getmtime(correct_maybe_zipped(fileloc)), tz=timezone.utc
  File "/opt/conda_envs/airflow/lib/python3.8/genericpath.py", line 55, in getmtime
    return os.stat(filename).st_mtime
PermissionError: [Errno 13] Permission denied: '/opt/airflow/dags/airflow_dags//.py'

{manager.py:924} ERROR - Processor for /opt/airflow/dags/airflow_dags//.py exited with return code 1.
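The failing call in the trace is os.stat, reached through os.path.getmtime. On POSIX systems os.stat does not need read permission on the file itself, but it does need search (execute) permission on every directory leading to it, so a permission change on any parent directory can produce this error. A minimal diagnostic sketch, assuming it is run as the same user as the scheduler: the helper name explain_path_access is hypothetical and is not part of Airflow.

```python
import os
import stat
from pathlib import Path


def explain_path_access(path):
    """Report access for every ancestor directory of `path` and for `path` itself.

    Returns a list of (component, mode_or_error, accessible) tuples, ordered
    from the filesystem root down to the target. The walk stops at the first
    component that cannot be stat'ed, since nothing below it is reachable.
    """
    target = Path(path).absolute()
    components = list(reversed(target.parents)) + [target]
    report = []
    for component in components:
        try:
            st = os.stat(component)
        except OSError as exc:
            # Record the exception type (e.g. PermissionError) and stop.
            report.append((str(component), type(exc).__name__, False))
            break
        if stat.S_ISDIR(st.st_mode):
            # Directories must be searchable (x bit) to reach their children.
            accessible = os.access(component, os.X_OK)
        else:
            # For the file itself, check that it can be read.
            accessible = os.access(component, os.R_OK)
        report.append((str(component), stat.filemode(st.st_mode), accessible))
    return report


if __name__ == "__main__":
    import sys

    for component, mode, accessible in explain_path_access(sys.argv[1]):
        print(f"{mode:<20} {'ok' if accessible else 'DENIED':<7} {component}")
```

Running this against the DAG file named in the error shortly after a failure may show whether a directory in the path temporarily lost its search bit, for example while a deployment or sync job rewrites the DAG folder.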

I performed a couple courses of action:

The error occurs intermittently: not on every scheduler loop, and not at any predictable interval. I suspect a race condition within the scheduler may be causing it. It started after we upgraded from 2.2.3 to 2.3.2.
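One way to test the race-condition hypothesis is to check whether the same os.path.getmtime call succeeds when repeated a moment later. The sketch below is only an illustration of that idea and is not how Airflow handles the call. The function name getmtime_with_retry and its attempts and delay parameters are hypothetical, and the getmtime and sleep arguments are injectable so the behavior can be exercised without touching the filesystem.

```python
import os
import time


def getmtime_with_retry(path, attempts=3, delay=0.5,
                        getmtime=os.path.getmtime, sleep=time.sleep):
    """Return the mtime of `path`, retrying when PermissionError is raised.

    If the error is transient (for example, permissions on the DAG folder
    are being changed while the file is read), a later attempt succeeds.
    If every attempt fails, the last PermissionError is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return getmtime(path)
        except PermissionError:
            if attempt == attempts:
                raise
            # Wait before trying again; only PermissionError is retried.
            sleep(delay)
```

If a retry consistently succeeds where the first attempt failed, that points toward something outside the scheduler briefly changing permissions under the DAG folder rather than a static misconfiguration.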

What you think should happen instead

The scheduler should process the DAG file without raising an error.

How to reproduce

Unable to consistently reproduce it.

Operating System

RHEL

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==2.4.0
apache-airflow-providers-apache-hdfs==2.2.0
apache-airflow-providers-apache-hive==2.1.0
apache-airflow-providers-apache-spark==2.0.2
apache-airflow-providers-apache-sqoop==2.0.2
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-ftp @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-ftp_1631176991628/work
apache-airflow-providers-http @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-http_1630909395407/work
apache-airflow-providers-imap @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-imap_1631176968327/work
apache-airflow-providers-jenkins==2.0.3
apache-airflow-providers-jira==2.0.1
apache-airflow-providers-microsoft-azure==3.4.0
apache-airflow-providers-microsoft-mssql==2.0.1
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-oracle==2.0.1
apache-airflow-providers-papermill==2.1.0
apache-airflow-providers-postgres==2.4.0
apache-airflow-providers-samba==3.0.1
apache-airflow-providers-sqlite @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-sqlite_1631202652057/work
apache-airflow-providers-ssh==2.3.0
apache-airflow-providers-tableau==2.1.2

Deployment

Virtualenv installation

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 2 years ago

Thanks for opening your first issue here! Be sure to follow the issue template!