Closed: thispejo closed this issue 9 months ago
Hi @thispejo
I think the problem is related to the Airflow container and comes from this PR: https://github.com/bitnami/containers/pull/49544
Could you test setting web.image.tag=2.7.1-debian-11-r13 to confirm that I'm looking in the right direction?
In the meantime I'll try to gather more information about that change.
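For reference, the override suggested above can also be written as a values fragment. This is only a minimal sketch: it assumes the rest of the values file stays unchanged and that the fragment is merged into the existing values or passed to Helm with `-f`. Only the `web.image.tag` key and the tag itself come from this thread.

```yaml
# Sketch of the suggested test override (assumption: merged into the
# values file already used for this release).
web:
  image:
    # Tag proposed for the test; it pins the web container to an older revision.
    tag: 2.7.1-debian-11-r13
```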
Hello, I did the test, but the error remained.
DESCRIBE
SaaS.Airflow> k describe pod/airflow-homo-web-7b56d95d5f-4wfmm
Name: airflow-homo-web-7b56d95d5f-4wfmm
Namespace: airflow
Priority: 0
Service Account: default
Node: 10.0.10.11/10.0.10.11
Start Time: Tue, 21 Nov 2023 11:22:13 -0300
Labels: app.kubernetes.io/component=web
app.kubernetes.io/instance=airflow-homo
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=airflow
app.kubernetes.io/version=2.7.1
helm.sh/chart=airflow-16.1.3
pod-template-hash=7b56d95d5f
Annotations: checksum/configmap: 19ba7505341d87f103d369d4b7fe80ebdb3930fac9f3a88aae13410d3a843f6e
Status: Running
IP: 10.244.1.151
IPs:
IP: 10.244.1.151
Controlled By: ReplicaSet/airflow-homo-web-7b56d95d5f
Containers:
airflow-web:
Container ID: cri-o://0bdecadf4b9df9cd89dfa67432d4e439278be7d95978e0d4b92a42f450cb88f0
Image: docker.io/bitnami/airflow:2.7.1-debian-11-r13
Image ID: docker.io/bitnami/airflow@sha256:d3c84e594f5c758d9861d278e41fb27a6b3904326b6081d732f6402865391c2f
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 21 Nov 2023 11:26:23 -0300
Finished: Tue, 21 Nov 2023 11:27:12 -0300
Ready: False
Restart Count: 3
Limits:
cpu: 1
memory: 2Gi
Requests:
cpu: 500m
memory: 500Mi
Liveness: tcp-socket :http delay=360s timeout=5s period=20s #success=1 #failure=6
Readiness: tcp-socket :http delay=360s timeout=5s period=10s #success=1 #failure=6
Environment:
AIRFLOW_FERNET_KEY: <set to the key 'airflow-fernet-key' in secret 'airflow-homo'> Optional: false
AIRFLOW_SECRET_KEY: <set to the key 'airflow-secret-key' in secret 'airflow-homo'> Optional: false
AIRFLOW_LOAD_EXAMPLES: no
BASH_DEBUG: 1
BITNAMI_DEBUG: true
AIRFLOW_DATABASE_NAME: bitnami_airflow
AIRFLOW_DATABASE_USERNAME: bn_airflow
AIRFLOW_DATABASE_PASSWORD: <set to the key 'password' in secret 'airflow-homo-postgresql'> Optional: false
AIRFLOW_DATABASE_HOST: airflow-homo-postgresql
AIRFLOW_DATABASE_PORT_NUMBER: 5432
REDIS_HOST: airflow-homo-redis-master
REDIS_PORT_NUMBER: 6379
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'airflow-homo-redis'> Optional: false
AIRFLOW__KUBERNETES__NAMESPACE: airflow
AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: docker.io/bitnami/airflow-worker
AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: 2.7.3-debian-11-r0
AIRFLOW__KUBERNETES__IMAGE_PULL_POLICY: IfNotPresent
AIRFLOW__KUBERNETES__DAGS_IN_IMAGE: True
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS: True
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS_ON_FAILURE: False
AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME: default
AIRFLOW__KUBERNETES__POD_TEMPLATE_FILE: /opt/bitnami/airflow/pod_template.yaml
AIRFLOW_EXECUTOR: CeleryKubernetesExecutor
AIRFLOW_WEBSERVER_HOST: 0.0.0.0
AIRFLOW_WEBSERVER_PORT_NUMBER: 8080
AIRFLOW_USERNAME: *******admin
AIRFLOW_PASSWORD: <set to the key 'airflow-password' in secret 'airflow-homo'> Optional: false
AIRFLOW_BASE_URL: http://airflowhomo.local:8080
AIRFLOW_LDAP_ENABLE: no
AIRFLOW__SMTP__SMTP_HOST: smtp.gmail.com
AIRFLOW__SMTP__SMTP_STARTTLS: True
AIRFLOW__SMTP__SMTP_SSL: False
AIRFLOW__SMTP__SMTP_USER: noreply@*******.com.br
AIRFLOW__SMTP__SMTP_PORT: 587
AIRFLOW__SMTP__SMTP_PASSWORD: fadfaef*****
AIRFLOW__SMTP__SMTP_MAIL_FROM: noreply@*******.com.br
PYTHONPATH: /opt/bitnami/airflow/dags/git_devops/airflowdags/
AIRFLOW__KUBERNETES_EXECUTOR__LOGS_TASK_METADATA: True
AIRFLOW__KUBERNETES_EXECUTOR__DELETE_WORKER_PODS_ON_FAILURE: True
AIRFLOW__LOGGING__CELERY_LOGGING_LEVEL: INFO
AIRFLOW_KUBERNETES_LOGGING_ENABLED: True
Mounts:
/bitnami/python/requirements.txt from requirements (rw,path="requirements.txt")
/opt/bitnami/airflow/webserver_config.py from custom-webserver-configuration-file (rw,path="webserver_config.py")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-789vz (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
custom-webserver-configuration-file:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: airflow-web-config-16
Optional: false
requirements:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: requirements
Optional: false
kube-api-access-789vz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: name=Homologacao
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m10s default-scheduler Successfully assigned airflow/airflow-homo-web-7b56d95d5f-4wfmm to 10.0.10.11
Normal Pulling 4m9s kubelet Pulling image "docker.io/bitnami/airflow:2.7.1-debian-11-r13"
Normal Pulled 3m17s kubelet Successfully pulled image "docker.io/bitnami/airflow:2.7.1-debian-11-r13" in 51.354294023s (51.3543139s including waiting)
Normal Created 0s (x4 over 3m3s) kubelet Created container airflow-web
Normal Started 0s (x4 over 3m3s) kubelet Started container airflow-web
Normal Pulled 0s (x3 over 2m26s) kubelet Container image "docker.io/bitnami/airflow:2.7.1-debian-11-r13" already present on machine
Warning BackOff <invalid> (x7 over 119s) kubelet Back-off restarting failed container airflow-web in pod airflow-homo-web-7b56d95d5f-4wfmm_airflow(5fa13818-6a04-409c-ba07-cf7919227724)
===============================================================
LOGS:
Successfully installed apache-airflow-providers-apache-spark-4.4.0 authlib-1.2.1 discord-webhook-1.3.0 py4j-0.10.9.7 pyspark-3.5.0
[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
airflow 14:25:51.52 INFO ==> ** Starting Airflow setup **
airflow 14:25:51.57 INFO ==> Validating settings in POSTGRESQL_CLIENT_* env vars
airflow 14:25:51.60 INFO ==> Initializing Airflow ...
airflow 14:25:51.60 INFO ==> No injected configuration file found. Creating default config file
cp: cannot create regular file '/opt/bitnami/airflow/webserver_config.py': Read-only file system
Sorry @thispejo, you are completely right.
The error was introduced in revision 2.7.1-debian-11-r5 (https://github.com/bitnami/containers/pull/48293). If you set web.image.tag=2.7.1-debian-11-r2, you shouldn't face the problem. I am gathering more information to fix it properly.
Sure thing, I'll be waiting for the fix. I'll let you know how it goes once I've given it a shot. 🤞👍
Hi @thispejo
New releases are in the oven to fix the problem in the container and to use the new image tag in the chart. I hope they will be ready in a few hours.
Last Friday, version 16.1.6 was released using the airflow container tag 2.7.3-debian-11-r2, which solves the problem reported in this issue.
Thanks!
The problem now is that when I try to provide a requirements.txt, it gives this error:
Downloading Authlib-1.3.0-py2.py3-none-any.whl (223 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 223.7/223.7 kB 14.3 MB/s eta 0:00:00
Installing collected packages: Authlib
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/opt/bitnami/airflow/venv/lib/python3.11/site-packages/authlib'
yaml:
# Alias for the requirements volume and mount configuration
requirements-volume-config: &requirements-volume-config
  extraVolumeMounts:
    - name: requirements
      mountPath: /bitnami/python/requirements.txt
      subPath: requirements.txt
  extraVolumes:
    - name: requirements
      configMap:
        name: airflow-requirements
web:
  automountServiceAccountToken: true
  autoscaling:
    enable: true
    minReplicas: 1
    maxReplicas: 3
  #existingConfigmap: airflow-web-config
  <<: *requirements-volume-config
scheduler:
  replicaCount: 1
  autoscaling:
    enable: true
    minReplicas: 1
    maxReplicas: 5
    targetCPU: 80
    targetMemory: 80
  nodeSelector:
    name: AirflowHomo
  automountServiceAccountToken: true
  <<: *requirements-volume-config
worker:
  replicaCount: 1
  nodeSelector:
    name: AirflowHomo
  autoscaling:
    enable: true
    minReplicas: 1
    maxReplicas: 5
    targetCPU: 80
    targetMemory: 80
  readinessProbe:
    enabled: true
    initialDelaySeconds: 120
  livenessProbe:
    enabled: true
    initialDelaySeconds: 120
  automountServiceAccountToken: true
  <<: *requirements-volume-config
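The values above mount a ConfigMap named airflow-requirements at /bitnami/python/requirements.txt through the `requirements.txt` subPath. A minimal sketch of what that ConfigMap might look like follows. The manifest itself was not posted in this thread: the `airflow` namespace is taken from the pod description, and the package list is the one given under "Additional information".

```yaml
# Hypothetical manifest: only the ConfigMap name, the key name, and the
# package list are taken from this thread.
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-requirements   # must match extraVolumes[].configMap.name
  namespace: airflow           # assumption: same namespace as the release
data:
  # The key must match the subPath used in extraVolumeMounts.
  requirements.txt: |
    authlib
    discord-webhook
    apache-airflow-providers-apache-spark
    pytz
```

The pod description earlier in the thread shows the volume backed by a ConfigMap called `requirements`, so the name in the cluster may differ from the one in these values.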
Name and Version
bitnami/airflow/16.1.2
What architecture are you using?
None
What steps will reproduce the bug?
Apparently, starting with some version of the chart, the deployment began failing with the error below:
In my production environment it works as expected; there I am using chart version 14.4.0 (--version 14.4.0).
In the approval (staging) environment I tested chart versions 16.1.2, 16.1.1, 16.1.0, and 16.0.7; they all produced the same error.
Are you using any custom parameters or values?
What is the expected behavior?
No response
What do you see instead?
Additional information
requirements.txt:
authlib
discord-webhook
apache-airflow-providers-apache-spark
pytz