airflow-helm / charts

The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm. Originally created in 2017, it has since helped thousands of companies create production-ready deployments of Airflow on Kubernetes.
https://github.com/airflow-helm/charts/tree/main/charts/airflow
Apache License 2.0

Job failed when mount logs volume - PermissionError: [Errno 1] Operation not permitted #713

Closed mcabrito closed 1 year ago

mcabrito commented 1 year ago


Chart Version

8.6.1

Kubernetes Version

Client Version: v1.25.4
Kustomize Version: v4.5.7
Server Version: v1.22.15

Helm Version

version.BuildInfo{Version:"v3.10.1", GitCommit:"9f88ccb6aee40b9a0535fcc7efea6055e1ef72c9", GitTreeState:"clean", GoVersion:"go1.18.7"}

Description

Hello guys. We have Airflow running on AKS, mounting a ReadWriteMany volume for the logs. However, jobs are failing, specifically when writing their logs.

Relevant Logs

/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/cli_parser.py:905 DeprecationWarning: The namespace option in [kubernetes] has been moved to the namespace option in [kubernetes_executor] - the old setting has been used, but please update your config.
/home/airflow/.local/lib/python3.10/site-packages/airflow/models/base.py:49 MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
[2023-04-05 18:42:42,192] {dagbag.py:532} INFO - Filling up the DagBag from /opt/***/dags/repo/aplication.py
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/__main__.py", line 48, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/cli_parser.py", line 52, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/cli.py", line 108, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/cli/commands/task_command.py", line 384, in task_run
    ti.init_run_context(raw=args.raw)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 2421, in init_run_context
    self._set_context(self)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/logging_mixin.py", line 77, in _set_context
    set_context(self.log, context)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/logging_mixin.py", line 213, in set_context
    flag = cast(FileTaskHandler, handler).set_context(value)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/file_task_handler.py", line 71, in set_context
    local_loc = self._init_file(ti)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/file_task_handler.py", line 382, in _init_file
    self._prepare_log_folder(Path(full_path).parent)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/log/file_task_handler.py", line 358, in _prepare_log_folder
    directory.chmod(mode)
  File "/usr/local/lib/python3.10/pathlib.py", line 1191, in chmod
    self._accessor.chmod(self, mode, follow_symlinks=follow_symlinks)
PermissionError: [Errno 1] Operation not permitted: '/opt/airflow/logs/dag_id=aplication/run_id=scheduled__2023-04-04T18:00:00+00:00/task_id=aplication_core'

Custom Helm Values

airflow:
  image:
    tag: 2.5.2-python3.10
  executor: KubernetesExecutor
  config:
    AIRFLOW__WEBSERVER__BASE_URL: "https://airflow.com"
    # email configs
    AIRFLOW__EMAIL__EMAIL_BACKEND: "airflow.utils.email.send_email_smtp"
    AIRFLOW__SMTP__SMTP_HOST: "smtp.gmail.com"
    AIRFLOW__SMTP__SMTP_MAIL_FROM: "semail@email"
    AIRFLOW__SMTP__SMTP_USER: "semail@email"
    AIRFLOW__SMTP__SMTP_PASSWORD: "passSmtpAir"
    AIRFLOW__SMTP__SMTP_PORT: "587"
    AIRFLOW__SMTP__SMTP_SSL: "False"
    AIRFLOW__SMTP__SMTP_STARTTLS: "True"
  defaultSecurityContext:
    fsGroup: 0
workers:
  enabled: false
flower:
  enabled: false
scheduler:
  logCleanup:
    enabled: false
logs:
  path: /opt/airflow/logs
  persistence:
    enabled: true
    storageClass: "namestorageclass-filesystem"
    accessMode: ReadWriteMany
    size: 50Gi
dags:
  path: /opt/airflow/dags
  gitSync:
    enabled: true
    image:
      repository: k8s.gcr.io/git-sync/git-sync
      tag: v3.2.2
      pullPolicy: IfNotPresent
      uid: 65533
      gid: 65533
    repo: "https://github.com/teste/airflow-dags"
    branch: "main"
    revision: HEAD
    depth: 1
    syncWait: 60
    syncTimeout: 120
    httpSecret: "airflow-http-git-secret"
    httpSecretUsernameKey: username
    httpSecretPasswordKey: password
ingress:
  enabled: true
  apiVersion: networking.k8s.io/v1
  web:
    annotations:
      kubernetes.io/ingress.class: "azure/application-gateway"
      appgw.ingress.kubernetes.io/use-private-ip: "true"
      appgw.ingress.kubernetes.io/ssl-redirect: "true"
    labels: {}
    path: ""
    host: "airflow.com"
    tls:
      enabled: true
      secretName: "cert"
    precedingPaths: []
    succeedingPaths: []
redis:
  enabled: false
postgresql:
  persistence:
    storageClass: "namestorageclass-disk"
thesuperzapper commented 1 year ago

@mcabrito this failure is caused by a bug in Airflow itself that was introduced in version 2.5.1 (and is still present in 2.5.2 and 2.5.3); see https://github.com/apache/airflow/issues/29112. The problem is that Azure Files does not allow chmod, and Airflow mistakenly tries to run chmod on log directories whose permissions are not already 777.

For now, you might want to do one of the following:
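For example, since the bug was introduced in 2.5.1, pinning the chart to a pre-2.5.1 image avoids it entirely (a sketch of the relevant values override; the exact tag is an assumption):

```yaml
airflow:
  image:
    # Any image from before 2.5.1 avoids the chmod bug; 2.5.0 is shown
    # here as an example (the specific tag is an assumption).
    tag: 2.5.0-python3.10
```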

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in 60 days. It will be closed in 7 days if no further activity occurs.

Thank you for your contributions.


Issues never become stale if any of the following is true:

  1. they are added to a Project
  2. they are added to a Milestone
  3. they have the lifecycle/frozen label
thesuperzapper commented 1 year ago

Closing because this is an upstream issue; your best bet is to NOT use Airflow 2.5.1, 2.5.2, or 2.5.3 (which all have this bug).