airflow-helm / charts

The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-ready deployments of Airflow on Kubernetes.
https://github.com/airflow-helm/charts/tree/main/charts/airflow
Apache License 2.0

Permission denied to Airflow logs directory #708

Closed by cbuffett 1 year ago

cbuffett commented 1 year ago

Chart Version

8.X.X

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:57:26Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-25T19:35:11Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

Helm Version

version.BuildInfo{Version:"v3.11.1", GitCommit:"293b50c65d4d56187cd4e2f390f0ada46b4c4737", GitTreeState:"clean", GoVersion:"go1.18.10"}

Description

I'm attempting to enable persistent logging on my local kind cluster, but my containers fail to start due to the error log below. I've manually configured my PV and PVC and reference the claim in my Helm config. I've attempted the workarounds detailed at https://stackoverflow.com/questions/63510335/airflow-on-kubernetes-errno-13-permission-denied-opt-airflow-logs-schedule, but from what I can tell, the check-db init container is executed prior to the extraInitContainer, resulting in the startup failure.

Logs PV

apiVersion: v1
kind: PersistentVolume
metadata:
  name: airflow-logs
  labels:
    app: airflow
spec:
  storageClassName: standard
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: airflow-logs
    namespace: airflow
  hostPath:
    path: "/mnt/airflow/logs"

kubectl describe pv airflow-logs
Name:            airflow-logs
Labels:          app=airflow
Annotations:     <none>
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    standard
Status:          Bound
Claim:           airflow/airflow-logs
Reclaim Policy:  Retain
Access Modes:    RWX
VolumeMode:      Filesystem
Capacity:        5Gi
Node Affinity:   <none>
Message:
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /mnt/airflow/logs
    HostPathType:
Events:            <none>

Logs PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: airflow-logs
  namespace: airflow
  labels:
    app: airflow
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

kubectl describe pvc airflow-logs -n airflow
Name:          airflow-logs
Namespace:     airflow
StorageClass:  standard
Status:        Bound
Volume:        airflow-logs
Labels:        app=airflow
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Used By:       airflow-db-migrations-6496798bbb-6hp7d
               airflow-scheduler-5f968686d4-g7p9z
               airflow-scheduler-7776c678c9-r2frt
               airflow-sync-connections-7759755cdc-rfl8t
               airflow-sync-pools-d9fbbff8b-p5kqd
               airflow-sync-variables-6548568bf5-w7vjz
               airflow-web-6dd9d8cf84-7qblq
Events:        <none>

Relevant Logs

kubectl logs airflow-scheduler-5f968686d4-g7p9z -n airflow -c check-db
Unable to load the config, contains a configuration error.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/logging/config.py", line 563, in configure
    handler = self.configure_handler(handlers[name])
  File "/usr/local/lib/python3.7/logging/config.py", line 736, in configure_handler
    result = factory(**kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/utils/log/file_processor_handler.py", line 46, in __init__
    Path(self._get_log_directory()).mkdir(parents=True, exist_ok=True)
  File "/usr/local/lib/python3.7/pathlib.py", line 1273, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/opt/airflow/logs/scheduler/2023-03-16'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/__init__.py", line 46, in <module>
    settings.initialize()
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/settings.py", line 446, in initialize
    LOGGING_CLASS_PATH = configure_logging()
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/logging_config.py", line 73, in configure_logging
    raise e
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/logging_config.py", line 68, in configure_logging
    dictConfig(logging_config)
  File "/usr/local/lib/python3.7/logging/config.py", line 800, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/local/lib/python3.7/logging/config.py", line 571, in configure
    '%r' % name) from e
ValueError: Unable to configure handler 'processor'

Custom Helm Values

########################################
## CONFIG | Airflow Configs
########################################
airflow:
  ## if we use legacy 1.10 airflow commands
  legacyCommands: false

  ## configs for the airflow container image
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/configuration/airflow-version.md
  image:
    imagePullPolicy: IfNotPresent
    repository: <custom_airflow_2.1.0_image>

  ## the airflow executor type to use
  executor: KubernetesExecutor

  ## the fernet encryption key (sets `AIRFLOW__CORE__FERNET_KEY`)
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/security/set-fernet-key.md
  ## [WARNING] change from default value to ensure security
  fernetKey: ""

  ## the secret_key for flask (sets `AIRFLOW__WEBSERVER__SECRET_KEY`)
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/security/set-webserver-secret-key.md
  ## [WARNING] change from default value to ensure security
  webserverSecretKey: ""

  ## environment variables for airflow configs
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/configuration/airflow-configs.md
  config:
    AIRFLOW__WEBSERVER__BASE_URL: "http://localhost"
    AIRFLOW__LOGGING__LOGGING_LEVEL: "DEBUG"

###################################
## COMPONENT | Airflow Scheduler
###################################
scheduler:
  ## the number of scheduler Pods to run
  replicas: 1

  ## resource requests/limits for the scheduler Pods
  ## [SPEC] https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#resourcerequirements-v1-core
  resources: {}

  ## configs for the log-cleanup sidecar of the scheduler
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/monitoring/log-cleanup.md
  logCleanup:
    enabled: false
    retentionMinutes: 21600

  ## configs for the scheduler Pods' liveness probe
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/monitoring/scheduler-liveness-probe.md
  livenessProbe:
    enabled: true

    ## configs for an additional check that ensures tasks are being created by the scheduler
    ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/monitoring/scheduler-liveness-probe.md
    taskCreationCheck:
      enabled: false
      thresholdSeconds: 300
      schedulerAgeBeforeCheck: 180

  extraInitContainers:
  - name: fix-volume-logs-permissions
    image: busybox
    command: [ "sh", "-c", "chown -R 50000:0 /opt/airflow/logs/" ]
    # securityContext:
    #   runAsUser: 0
    volumeMounts:
      - mountPath: /opt/airflow/logs/
        name: logs-data

###################################
## CONFIG | Airflow Logs
###################################
logs:

  ## the airflow logs folder
  path: /opt/airflow/logs

  ## configs for the logs PVC
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/monitoring/log-persistence.md
  persistence:
    enabled: true
    existingClaim: "airflow-logs"
    storageClassName: "standard"
    size: 5Gi

cbuffett commented 1 year ago

After more debugging and testing, it looks like the issue was caused by the permissions on the logs directory on my Ubuntu machine (its parent folder is mounted into the kind cluster under /mnt/airflow). Recreating the logs directory did not fix the permission errors on its own, but granting write permission to everyone with chmod a+w did. I noticed that the logs directory on my Ubuntu machine had been created when I configured my PV/PVC and was owned by root:root, so I first changed its ownership to my user account and then changed the write permissions.
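
For anyone hitting the same error, the fix described above might look roughly like the sketch below. The function name fix_logs_dir is hypothetical, and the directory to pass in is an assumption: the thread does not state the host-side path, only that its parent folder is mounted into the kind node under /mnt/airflow. If the directory is still owned by root:root, the chown and chmod steps would need to be run with sudo.

```shell
# Hypothetical helper (not part of the chart): make a host-side logs
# directory writable so the Airflow containers can create log folders in it.
fix_logs_dir() {
  logs_dir="$1"

  # Create the directory if it does not exist yet.
  mkdir -p "$logs_dir"

  # Hand ownership to the current user instead of root:root.
  # Run with sudo if the directory is currently owned by root.
  chown "$(id -u):$(id -g)" "$logs_dir"

  # Grant write permission to everyone, as described in the comment above.
  chmod a+w "$logs_dir"
}
```

It would be called with the host directory that backs the hostPath volume, for example fix_logs_dir /path/on/host/airflow/logs (a placeholder path). Note that chmod a+w is a broad permission that is reasonable on a local kind cluster; on a shared machine, restricting ownership to the container's UID would be the more conservative choice.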

thesuperzapper commented 1 year ago

@cbuffett it sounds like you have resolved your issue. Please reopen if you have further problems.