Scheduler is getting a PermissionError when running the file processor, with the following stack trace:
Process DagFileProcessor41434-Process:
Traceback (most recent call last):
  File "/opt/conda_envs/airflow/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/conda_envs/airflow/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/dag_processing/processor.py", line 155, in _run_file_processor
    result: Tuple[int, int] = dag_file_processor.process_file(
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/dag_processing/processor.py", line 660, in process_file
    dagbag.sync_to_db()
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
    return func(*args, session=session, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dagbag.py", line 615, in sync_to_db
    for attempt in run_with_db_retries(logger=self.log):
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/tenacity/__init__.py", line 382, in __iter__
    do = self.iter(retry_state=retry_state)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/tenacity/__init__.py", line 349, in iter
    return fut.result()
  File "/opt/conda_envs/airflow/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/opt/conda_envs/airflow/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dagbag.py", line 629, in sync_to_db
    DAG.bulk_write_to_db(self.dags.values(), session=session)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dag.py", line 2470, in bulk_write_to_db
    DagCode.bulk_sync_to_db(filelocs, session=session)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda_envs/airflow/lib/python3.8/site-packages/airflow/models/dagcode.py", line 114, in bulk_sync_to_db
    os.path.getmtime(correct_maybe_zipped(fileloc)), tz=timezone.utc
  File "/opt/conda_envs/airflow/lib/python3.8/genericpath.py", line 55, in getmtime
    return os.stat(filename).st_mtime
PermissionError: [Errno 13] Permission denied: '/opt/airflow/dags/airflow_dags//.py'
{manager.py:924} ERROR - Processor for /opt/airflow/dags/airflow_dags//.py exited with return code 1.
I performed a couple of courses of action:
- Confirmed that no deployments occurred that might have clobbered the permissions.
- Set up a cron job to force the permissions, in case I had missed something.
- Changed file_parsing_sort_mode to random_seeded_by_host and then alphabetical, in an attempt to see whether the error could be bypassed by not sorting on the modified date.
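For reference, this is the scheduler setting I was changing (the three values shown are the options documented for Airflow 2.3.x; the config path may differ per install):

```ini
# airflow.cfg (or env var AIRFLOW__SCHEDULER__FILE_PARSING_SORT_MODE)
[scheduler]
# Documented options: modified_time (default), random_seeded_by_host, alphabetical
file_parsing_sort_mode = random_seeded_by_host
```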
It occurs randomly, not on every scheduler loop or even on a predictable loop. I almost wonder if a race condition within the scheduler is causing the issue. This started after our upgrade from 2.2.3 to 2.3.2.
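To illustrate one way a race could produce this exact errno (a hypothetical reproduction sketch, not a confirmed diagnosis): os.stat() raises EACCES when a parent directory of the DAG file momentarily loses its execute/search bit, which a deployment or a permission-fixing job could cause for a moment. Run this as a non-root user:

```python
import os
import stat
import tempfile

# Hypothetical repro of the suspected race: os.stat() on a DAG file
# fails with EACCES while a parent directory is briefly unsearchable,
# e.g. mid-deployment or while a cron job re-applies modes.
dag_dir = tempfile.mkdtemp()
dag_file = os.path.join(dag_dir, "example_dag.py")
open(dag_file, "w").close()

os.chmod(dag_dir, 0)  # simulate the directory losing its search bit
error = None
try:
    os.path.getmtime(dag_file)  # the same call DagCode.bulk_sync_to_db makes
except PermissionError as exc:  # [Errno 13], matching the trace above
    error = exc
finally:
    os.chmod(dag_dir, stat.S_IRWXU)  # restore access so cleanup works

print(error)
```

(As root the chmod is not enforced, so the error will not reproduce; the file itself never changes, only the directory mode, which matches a window where the file content is fine but stat() still fails.)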
Apache Airflow version
2.3.2 (latest released)
What happened
Scheduler is intermittently getting a PermissionError in the DAG file processor; see the stack trace and attempted workarounds above.
What you think should happen instead
The scheduler should not error out.
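For example, the processor could tolerate a transiently unreadable file instead of crashing. This is only a sketch of the behavior I would expect, not the actual Airflow code; safe_mtime is a name I made up:

```python
import os
from datetime import datetime, timezone


def safe_mtime(fileloc):
    """Hypothetical guard: return the file's mtime as an aware UTC
    datetime, or None if the file is momentarily unreadable or gone
    (e.g. mid-deployment), instead of letting PermissionError kill
    the DagFileProcessor."""
    try:
        return datetime.fromtimestamp(
            os.path.getmtime(fileloc), tz=timezone.utc
        )
    except (PermissionError, FileNotFoundError):
        return None
```

A caller could then skip syncing that file this loop and pick it up on the next parse, rather than exiting with return code 1.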
How to reproduce
Unable to consistently reproduce it.
Operating System
RHEL
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.4.0
apache-airflow-providers-apache-hdfs==2.2.0
apache-airflow-providers-apache-hive==2.1.0
apache-airflow-providers-apache-spark==2.0.2
apache-airflow-providers-apache-sqoop==2.0.2
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-ftp @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-ftp_1631176991628/work
apache-airflow-providers-http @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-http_1630909395407/work
apache-airflow-providers-imap @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-imap_1631176968327/work
apache-airflow-providers-jenkins==2.0.3
apache-airflow-providers-jira==2.0.1
apache-airflow-providers-microsoft-azure==3.4.0
apache-airflow-providers-microsoft-mssql==2.0.1
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-oracle==2.0.1
apache-airflow-providers-papermill==2.1.0
apache-airflow-providers-postgres==2.4.0
apache-airflow-providers-samba==3.0.1
apache-airflow-providers-sqlite @ file:///home/conda/feedstock_root/build_artifacts/apache-airflow-providers-sqlite_1631202652057/work
apache-airflow-providers-ssh==2.3.0
apache-airflow-providers-tableau==2.1.2
Deployment
Virtualenv installation
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct