Checks
User-Community Airflow Helm Chart
Chart Version
8.7.1
Kubernetes Version
Helm Version
Description
We used to send logs directly to S3. Now, however, we have decided to write them to the standard directory /opt/airflow/logs by mounting our PVC, which in turn points to an S3 bucket, as mentioned here. Since logs from various DAGs are often not written, we hoped that switching to the different logging approach described here would resolve the log loss.
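Concretely, this means switching the chart from remote S3 logging to persistent logs on a PVC. A rough sketch of the kind of values involved is below; the claim name is a placeholder and the value paths assume the chart's logs.persistence section, while our full values are under Custom Helm Values:

logs:
  path: /opt/airflow/logs              # default log directory inside all Airflow pods
  persistence:
    enabled: true                      # mount a PVC at logs.path instead of shipping logs to S3
    existingClaim: airflow-logs-pvc    # placeholder; our real claim is shown in the PVC section below
    accessMode: ReadWriteMany          # the volume is shared by scheduler, workers, and webserver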
After applying these changes, our airflow-db-migrations pod starts, but its check-db container fails with a PermissionError for /opt/airflow/logs/scheduler: as the traceback under Relevant Logs shows, Airflow's FileProcessorHandler cannot create the dated scheduler log directory on the mounted volume.
I'm attaching the complete log below.
If we revert everything back to how it was before and connect as described here (S3 Bucket), everything starts correctly and all pods come up without errors.
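For reference, the previous working setup is Airflow's standard S3 remote logging, configured through the chart roughly like this; the bucket path and connection ID are placeholders, and this assumes the chart's airflow.config map of AIRFLOW__* environment variables:

airflow:
  config:
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"                                    # write task logs to remote storage
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "s3://<our-bucket>/airflow/logs"  # placeholder bucket/prefix
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "aws_default"                         # placeholder Airflow connection ID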
I'm providing all our configurations for PVC/PV and Airflow below.
Could you please advise where the error might be? Thank you in advance!
PVC
PV
Relevant Logs
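Unable to load the config, contains a configuration error.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/pathlib.py", line 1288, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/opt/airflow/logs/scheduler/2024-02-01'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/logging/config.py", line 563, in configure
    handler = self.configure_handler(handlers[name])
  File "/usr/local/lib/python3.8/logging/config.py", line 744, in configure_handler
    result = factory(**kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/log/file_processor_handler.py", line 49, in __init__
    Path(self._get_log_directory()).mkdir(parents=True, exist_ok=True)
  File "/usr/local/lib/python3.8/pathlib.py", line 1292, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/local/lib/python3.8/pathlib.py", line 1288, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/opt/airflow/logs/scheduler'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/__init__.py", line 64, in <module>
    settings.initialize()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/settings.py", line 570, in initialize
    LOGGING_CLASS_PATH = configure_logging()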
Custom Helm Values