bitnami / charts

Bitnami Helm Charts
https://bitnami.com

[bitnami/airflow] Time to load each page / css / js is EXTREMELY slow #23421

Closed fzhan closed 6 months ago

fzhan commented 7 months ago

Name and Version

bitnami/airflow-16.5.3

What architecture are you using?

None

What steps will reproduce the bug?

  1. Launch the chart with the custom values below.
  2. Visit the link.
  3. Each page / CSS / JS request takes more than 5s to load:

[image]

This is not the case when running from Docker:

[image]

Are you using any custom parameters or values?

Using a LoadBalancer service (MetalLB) and ingress. Every other service / app behind the same load balancer loads fine, including over SSL on port 443.

values.yaml:

# Copyright VMware, Inc.
# SPDX-License-Identifier: APACHE-2.0
global:
  imageRegistry: ""
  imagePullSecrets: []
  storageClass: ""
kubeVersion: ""
nameOverride: ""
fullnameOverride: ""
clusterDomain: cluster.local
extraDeploy: 
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-requirements
    data:
      requirements.txt: |
        apache-airflow[cncf.kubernetes]
        connexion[swagger-ui]
        apache-airflow[amazon]
        airflow-code-editor
        black
        fs-s3fs
        fs-gcsfs
commonLabels: {}
commonAnnotations: {}
diagnosticMode:
  enabled: false
  command:
    - sleep
  args:
    - infinity
auth:
  username: "airflow"
  password: "airflow"
  fernetKey: "airflow="
  secretKey: "airflow="
  existingSecret: ""
executor: CeleryExecutor
loadExamples: false
existingConfigmap: ""
dags:
  existingConfigmap: ""
  image:
    registry: docker.io
    repository: bitnami/os-shell
    tag: 11-debian-11-r96
    digest: ""
    pullPolicy: IfNotPresent
    pullSecrets: []
extraEnvVars: 
  - name: "AIRFLOW__WEBSERVER__WORKER_CLASS"
    value: "gevent"
  - name: "AIRFLOW__CODE_EDITOR__ENABLED"
    value: "True"
  - name: "AIRFLOW__CODE_EDITOR__ROOT_DIRECTORY"
    value: "/opt/bitnami/airflow/dags"
  - name: "AIRFLOW__CODE_EDITOR__STRING_NORMALIZATION"
    value: "True"
  - name: "AIRFLOW__CODE_EDITOR__MOUNT"
    value: "name=logs,path=/opt/bitnami/airflow/logs"
  - name: "_AIRFLOW_PATCH_GEVENT"
    value: "1"
extraEnvVarsCM: ""
extraEnvVarsSecret: ""
extraEnvVarsSecrets: []
sidecars: []
initContainers: []
extraVolumeMounts: 
  - name: airflow-dag
    mountPath: /opt/bitnami/airflow/dags
  - name: requirements
    mountPath: /bitnami/python/requirements.txt
    subPath: requirements.txt
extraVolumes: 
  - name: airflow-dag
    persistentVolumeClaim:
      claimName: airflow-dags
  - name: requirements
    configMap:
      name: airflow-requirements
web:
  image:
    registry: docker.io
    repository: bitnami/airflow
    tag: 2.8.1-debian-11-r3
    digest: ""
    pullPolicy: IfNotPresent
    pullSecrets: []
    debug: true
  baseUrl: "airflow.local"
  existingConfigmap: ""
  command: []
  args: []
  extraEnvVars: []
  extraEnvVarsCM: ""
  extraEnvVarsSecret: ""
  extraEnvVarsSecrets: []
  containerPorts:
    http: 8080
  replicaCount: 1
  livenessProbe:
    enabled: true
    initialDelaySeconds: 180
    periodSeconds: 20
    timeoutSeconds: 5
    failureThreshold: 6
    successThreshold: 1
  readinessProbe:
    enabled: true
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 6
    successThreshold: 1
  startupProbe:
    enabled: false
    initialDelaySeconds: 60
    periodSeconds: 10
    timeoutSeconds: 1
    failureThreshold: 15
    successThreshold: 1
  customLivenessProbe: {}
  customReadinessProbe: {}
  customStartupProbe: {}
  resources:
    limits: {}
    requests: {}
  podSecurityContext:
    enabled: true
    fsGroupChangePolicy: Always
    sysctls: []
    supplementalGroups: []
    fsGroup: 1001
  containerSecurityContext:
    enabled: true
    seLinuxOptions: null
    runAsUser: 1001
    runAsNonRoot: true
    privileged: false
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
    seccompProfile:
      type: "RuntimeDefault"
  lifecycleHooks: {}
  automountServiceAccountToken: false
  hostAliases: []
  podLabels: {}
  podAnnotations: {}
  affinity: {}
  nodeAffinityPreset:
    key: ""
    type: ""
    values: []
  nodeSelector: {}
  podAffinityPreset: ""
  podAntiAffinityPreset: soft
  tolerations: []
  topologySpreadConstraints: []
  priorityClassName: ""
  schedulerName: ""
  terminationGracePeriodSeconds: ""
  updateStrategy:
    type: RollingUpdate
    rollingUpdate: {}
  sidecars: []
  initContainers: []
  extraVolumeMounts: []
  extraVolumes: []
  pdb:
    create: false
    minAvailable: 1
    maxUnavailable: ""
  networkPolicy:
    enabled: true
    allowExternal: true
    extraIngress: []
    extraEgress: 
      - ports:
          - port: 80
          - port: 443
          - port: 5432
          - port: 6379
          - port: 8080
          - port: 8793
          - port: 3307
          - port: 3306
    ingressNSMatchLabels: {}
    ingressNSPodMatchLabels: {}
scheduler:
  image:
    registry: docker.io
    repository: bitnami/airflow-scheduler
    tag: 2.8.1-debian-11-r3
    digest: ""
    pullPolicy: IfNotPresent
    pullSecrets: []
    debug: true
  replicaCount: 3
  command: []
  args: []
  extraEnvVars: []
  extraEnvVarsCM: ""
  extraEnvVarsSecret: ""
  extraEnvVarsSecrets: []
  livenessProbe:
    enabled: true
    initialDelaySeconds: 180
    periodSeconds: 20
    timeoutSeconds: 5
    failureThreshold: 6
    successThreshold: 1
  readinessProbe:
    enabled: true
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 6
    successThreshold: 1
  customLivenessProbe: {}
  customReadinessProbe: {}
  customStartupProbe: {}
  resources:
    limits: {}
    requests: {}
  podSecurityContext:
    enabled: true
    fsGroupChangePolicy: Always
    sysctls: []
    supplementalGroups: []
    fsGroup: 1001
  containerSecurityContext:
    enabled: true
    seLinuxOptions: null
    runAsUser: 1001
    runAsNonRoot: true
    privileged: false
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
    seccompProfile:
      type: "RuntimeDefault"
  lifecycleHooks: {}
  automountServiceAccountToken: false
  hostAliases: []
  podLabels: {}
  podAnnotations: {}
  affinity: {}
  nodeAffinityPreset:
    key: ""
    type: ""
    values: []
  nodeSelector: {}
  podAffinityPreset: ""
  podAntiAffinityPreset: soft
  tolerations: []
  topologySpreadConstraints: []
  priorityClassName: ""
  schedulerName: ""
  terminationGracePeriodSeconds: ""
  updateStrategy:
    type: RollingUpdate
    rollingUpdate: {}
  sidecars: []
  initContainers: []
  extraVolumeMounts: []
  extraVolumes: []
  pdb:
    create: false
    minAvailable: 1
    maxUnavailable: ""
  networkPolicy:
    enabled: true
    allowExternal: true
    extraIngress: []
    extraEgress: 
      - ports:
          - port: 80
          - port: 443
          - port: 5432
          - port: 6379
          - port: 8080
          - port: 8793
          - port: 3307
          - port: 3306
    ingressNSMatchLabels: {}
    ingressNSPodMatchLabels: {}
worker:
  image:
    registry: docker.io
    repository: bitnami/airflow-worker
    tag: 2.8.1-debian-11-r3
    digest: ""
    pullPolicy: IfNotPresent
    pullSecrets: []
    debug: true
  command: []
  args: []
  extraEnvVars: []
  extraEnvVarsCM: ""
  extraEnvVarsSecret: ""
  extraEnvVarsSecrets: []
  containerPorts:
    http: 8793
  replicaCount: 3
  livenessProbe:
    enabled: true
    initialDelaySeconds: 180
    periodSeconds: 20
    timeoutSeconds: 5
    failureThreshold: 6
    successThreshold: 1
  readinessProbe:
    enabled: true
    initialDelaySeconds: 30
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 6
    successThreshold: 1
  startupProbe:
    enabled: false
    initialDelaySeconds: 60
    periodSeconds: 10
    timeoutSeconds: 1
    failureThreshold: 15
    successThreshold: 1
  customLivenessProbe: {}
  customReadinessProbe: {}
  customStartupProbe: {}
  resources:
    limits: {}
    requests: {}
  podSecurityContext:
    enabled: true
    fsGroupChangePolicy: Always
    sysctls: []
    supplementalGroups: []
    fsGroup: 1001
  containerSecurityContext:
    enabled: true
    seLinuxOptions: null
    runAsUser: 1001
    runAsNonRoot: true
    privileged: false
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
    seccompProfile:
      type: "RuntimeDefault"
  lifecycleHooks: {}
  automountServiceAccountToken: false
  hostAliases: []
  podLabels: {}
  podAnnotations: {}
  affinity: {}
  nodeAffinityPreset:
    key: ""
    type: ""
    values: []
  nodeSelector: {}
  podAffinityPreset: ""
  podAntiAffinityPreset: soft
  tolerations: []
  topologySpreadConstraints: []
  priorityClassName: ""
  schedulerName: ""
  terminationGracePeriodSeconds: ""
  updateStrategy:
    type: RollingUpdate
    rollingUpdate: {}
  sidecars: []
  initContainers: []
  extraVolumeMounts: []
  extraVolumes: []
  extraVolumeClaimTemplates: []
  podTemplate: {}
  pdb:
    create: false
    minAvailable: 1
    maxUnavailable: ""
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
    targetCPU: 80
    targetMemory: 80
  networkPolicy:
    enabled: true
    allowExternal: true
    extraIngress: []
    extraEgress: 
      - ports:
          - port: 80
          - port: 443
          - port: 5432
          - port: 6379
          - port: 8080
          - port: 8793
          - port: 3307
          - port: 3306
    ingressNSMatchLabels: {}
    ingressNSPodMatchLabels: {}
git:
  image:
    registry: docker.io
    repository: bitnami/git
    tag: 2.43.0-debian-11-r9
    digest: ""
    pullPolicy: IfNotPresent
    pullSecrets: []
  dags:
    enabled: false
    repositories:
      - repository: ""
        branch: ""
        name: ""
        path: ""
  plugins:
    enabled: false
    repositories:
      - repository: ""
        branch: ""
        name: ""
        path: ""
  clone:
    command: []
    args: []
    extraVolumeMounts: []
    extraEnvVars: []
    extraEnvVarsCM: ""
    extraEnvVarsSecret: ""
    resources: {}
  sync:
    interval: 60
    command: []
    args: []
    extraVolumeMounts: []
    extraEnvVars: []
    extraEnvVarsCM: ""
    extraEnvVarsSecret: ""
    resources: {}
ldap:
  enabled: false
  uri: "ldap://ldap_server:389"
  basedn: "dc=example,dc=org"
  searchAttribute: "cn"
  binddn: "cn=admin,dc=example,dc=org"
  bindpw: ""
  userRegistration: 'True'
  userRegistrationRole: "Public"
  rolesMapping: '{ "cn=All,ou=Groups,dc=example,dc=org": ["User"], "cn=Admins,ou=Groups,dc=example,dc=org": ["Admin"], }'
  rolesSyncAtLogin: 'True'
  tls:
    enabled: false
    allowSelfSigned: true
    certificatesSecret: ""
    certificatesMountPath: /opt/bitnami/airflow/conf/certs
    CAFilename: ""
service:
  type: LoadBalancer
  ports:
    http: 8080
  nodePorts:
    http: ""
  sessionAffinity: None
  sessionAffinityConfig: {}
  clusterIP: ""
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  externalTrafficPolicy: Cluster
  annotations: {}
  extraPorts: []
ingress:
  enabled: true
  ingressClassName: "public"
  pathType: ImplementationSpecific
  apiVersion: ""
  hostname: airflow.local
  path: /
  annotations: 
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/client_max_body_size : "50m"
    nginx.ingress.kubernetes.io/client-body-buffer-size: "50m" 
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
  tls: true
  selfSigned: false
  extraHosts: []
  extraPaths: []
  extraTls: []
  secrets: []
  extraRules: []
serviceAccount:
  create: true
  name: ""
  automountServiceAccountToken: false
  annotations: {}
rbac:
  create: false
  rules: []
metrics:
  enabled: false
  image:
    registry: docker.io
    repository: bitnami/airflow-exporter
    tag: 0.20220314.0-debian-11-r447
    digest: ""
    pullPolicy: IfNotPresent
    pullSecrets: []
  extraEnvVars: []
  extraEnvVarsCM: ""
  extraEnvVarsSecret: ""
  containerPorts:
    http: 9112
  resources:
    limits: {}
    requests: {}
  podSecurityContext:
    enabled: true
    fsGroupChangePolicy: Always
    sysctls: []
    supplementalGroups: []
    fsGroup: 1001
  containerSecurityContext:
    enabled: true
    seLinuxOptions: null
    runAsUser: 1001
    runAsNonRoot: true
    privileged: false
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
    seccompProfile:
      type: "RuntimeDefault"
  lifecycleHooks: {}
  automountServiceAccountToken: false
  hostAliases: []
  podLabels: {}
  podAnnotations: {}
  podAffinityPreset: ""
  podAntiAffinityPreset: soft
  nodeAffinityPreset:
    type: ""
    key: ""
    values: []
  affinity: {}
  nodeSelector: {}
  tolerations: []
  schedulerName: ""
  service:
    ports:
      http: 9112
    clusterIP: ""
    sessionAffinity: None
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/port: "{{ .Values.metrics.service.ports.http }}"
  serviceMonitor:
    enabled: false
    namespace: ""
    interval: ""
    scrapeTimeout: ""
    labels: {}
    selector: {}
    relabelings: []
    metricRelabelings: []
    honorLabels: false
    jobLabel: ""
  networkPolicy:
    enabled: true
    allowExternal: true
    extraIngress: []
    extraEgress: []
    ingressNSMatchLabels: {}
    ingressNSPodMatchLabels: {}
postgresql:
  enabled: false
  auth:
    enablePostgresUser: true
    username: bn_airflow
    password: ""
    database: bitnami_airflow
    existingSecret: ""
  architecture: standalone
externalDatabase:
  host: airflow
  port: 5432
  user: postgres
  database: airflow
  password: airflow
  existingSecret: ""
  existingSecretPasswordKey: ""
redis:
  enabled: false
  auth:
    enabled: true
    password: ""
    existingSecret: ""
  architecture: standalone
externalRedis:
  host: redis-master.data-platform
  port: 6379
  username: ""
  password: airflow
  existingSecret: ""
  existingSecretPasswordKey: ""

What is the expected behavior?

Each page / style / js shouldn't take more than 10ms to load

What do you see instead?

Each load takes more than 5s

Additional information

No response

javsalgar commented 7 months ago

Hi,

Which platform are you using to deploy Airflow? It may need more resources.

fzhan commented 7 months ago

@javsalgar the k8s cluster has 5 nodes, each 64GB x 2.3 20 core CPU.

I'm running other applications such as Superset on the same platform, all loading fine, and the Docker version of Airflow (as shown above) loads perfectly.

Mauraza commented 7 months ago

Hi @fzhan,

I found this issue in the airflow repo https://github.com/apache/airflow/issues/8907. Could you check it?

fzhan commented 7 months ago

@Mauraza I've gone through it, switched the webserver to the gevent worker class, and changed the service type from LoadBalancer back to ClusterIP. But it almost feels like the ingress is not telling the browser that these JS/CSS files can be cached.

[image]

Below is the Docker image, where everything loads in milliseconds:

[image]

That's really strange.
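
One way to test the caching suspicion is to compare the caching-related response headers of the same static asset served through the ingress and served by the Docker container. The following is a minimal sketch, not something from the thread: the helper names are hypothetical, and which headers matter in practice depends on the browser and on the ingress controller.

```python
from typing import Dict, List, Mapping
from urllib.request import urlopen

# Response headers that influence whether a browser may reuse a cached copy.
CACHE_HEADERS = ("cache-control", "expires", "etag", "last-modified")


def caching_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Keep only the caching-related headers, with lower-cased names."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in CACHE_HEADERS if name in lowered}


def fetch_caching_headers(url: str, timeout: float = 30.0) -> Dict[str, str]:
    """Request `url` and return its caching-related response headers."""
    with urlopen(url, timeout=timeout) as response:
        return caching_headers(dict(response.headers.items()))


def missing_on_ingress(
    ingress: Mapping[str, str], docker: Mapping[str, str]
) -> List[str]:
    """List caching headers present in the Docker response but not via ingress."""
    via_ingress = caching_headers(ingress)
    via_docker = caching_headers(docker)
    return sorted(name for name in via_docker if name not in via_ingress)
```

To use it, call `fetch_caching_headers` once with an asset URL behind the ingress and once with the same asset on the Docker setup, then pass both results to `missing_on_ingress`. An empty list would suggest the headers are not what differs between the two setups.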

fzhan commented 7 months ago

@Mauraza I've added this snippet to the ingress annotations and things started to speed up (assets now load from the browser's memory cache):

  annotations: 
...
    nginx.ingress.kubernetes.io/configuration-snippet: |
      if ($request_uri ~* \.(js|css|gif|jpe?g|png)) {
        expires 5d;
        add_header Cache-Control "public";
      }

Not sure if this is worth looking into, but there is probably some gunicorn configuration that may or may not work for serving page resources.
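
A note for anyone copying the snippet above: the regex in the `if` is unanchored, so it also matches URIs where the extension is only a prefix, such as `.json`. The sketch below approximates the nginx match with Python's `re` module to show which request URIs would receive the cache headers. The sample URIs and the stricter pattern are illustrative assumptions, not part of the original report.

```python
import re

# Pattern copied from the configuration-snippet above. nginx's `~*` operator is
# case-insensitive, which re.IGNORECASE approximates here.
ORIGINAL = re.compile(r"\.(js|css|gif|jpe?g|png)", re.IGNORECASE)

# Hypothetical stricter variant (not from the thread): the extension must be
# followed by the end of the URI or by the start of a query string.
STRICTER = re.compile(r"\.(js|css|gif|jpe?g|png)(\?|$)", re.IGNORECASE)


def gets_cache_headers(pattern: re.Pattern, request_uri: str) -> bool:
    """Return True if the `if` block would apply to this request URI."""
    # nginx regex matching is unanchored, which corresponds to re.search.
    return pattern.search(request_uri) is not None


if __name__ == "__main__":
    # Made-up URIs for illustration only.
    samples = [
        "/static/dist/main.js",
        "/static/css/bootstrap.css?v=1",
        "/api/v1/config.json",
        "/home",
    ]
    for uri in samples:
        print(uri, gets_cache_headers(ORIGINAL, uri), gets_cache_headers(STRICTER, uri))
```

With the original pattern, a JSON API response would also be cached for 5 days, which may not be intended. The stricter variant keeps the query-string case working because `$request_uri` includes the query string.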

Mauraza commented 7 months ago

Hi @fzhan,

It's strange... where are you running the chart?

fzhan commented 7 months ago

On a local MicroK8s cluster with 5 nodes.

None of the other packages behind the ingress have had an issue like this, including those that use gunicorn.

Mauraza commented 6 months ago

Hi @fzhan,

Do any strange events appear in the kubectl describe output?

fzhan commented 6 months ago

Name:                   airflow-web
Namespace:              airflow
CreationTimestamp:      Wed, 14 Feb 2024 00:28:29 +1100
Labels:                 app.kubernetes.io/component=web
                        app.kubernetes.io/instance=airflow
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=airflow
                        app.kubernetes.io/version=2.0.0
                        helm.sh/chart=airflow-16.8.2
Annotations:            deployment.kubernetes.io/revision: 3
                        meta.helm.sh/release-name: airflow
                        meta.helm.sh/release-namespace: airflow
Selector:               app.kubernetes.io/component=web,app.kubernetes.io/instance=airflow,app.kubernetes.io/name=airflow
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/component=web
                    app.kubernetes.io/instance=airflow
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=airflow
                    app.kubernetes.io/version=2.0.0
                    helm.sh/chart=airflow-16.8.2
  Annotations:      checksum/configmap: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
  Service Account:  airflow
  Containers:
   airflow-web:
    Image:           docker.io/bitnami/airflow:2-debian-11
    Port:            8080/TCP
    Host Port:       0/TCP
    SeccompProfile:  RuntimeDefault
    Liveness:        tcp-socket :http delay=180s timeout=5s period=20s #success=1 #failure=6
    Readiness:       tcp-socket :http delay=30s timeout=5s period=10s #success=1 #failure=6
    Environment:
      AIRFLOW_FERNET_KEY:                          <set to the key 'airflow-fernet-key' in secret 'airflow'>  Optional: false
      AIRFLOW_SECRET_KEY:                          <set to the key 'airflow-secret-key' in secret 'airflow'>  Optional: false
      AIRFLOW_LOAD_EXAMPLES:                       no
      BASH_DEBUG:                                  1
      BITNAMI_DEBUG:                               true
      AIRFLOW_DATABASE_NAME:                       x
      AIRFLOW_DATABASE_USERNAME:                   x
      AIRFLOW_DATABASE_PASSWORD:                   <set to the key 'password' in secret 'airflow-externaldb'>  Optional: false
      AIRFLOW_DATABASE_HOST:                       postgresql-primary.postgres
      AIRFLOW_DATABASE_PORT_NUMBER:                5432
      REDIS_HOST:                                  redis-master.x
      REDIS_PORT_NUMBER:                           6379
      REDIS_PASSWORD:                              <set to the key 'redis-password' in secret 'airflow-externalredis'>  Optional: false
      AIRFLOW_EXECUTOR:                            CeleryExecutor
      AIRFLOW_WEBSERVER_HOST:                      0.0.0.0
      AIRFLOW_WEBSERVER_PORT_NUMBER:               8080
      AIRFLOW_USERNAME:                            x
      AIRFLOW_PASSWORD:                            <set to the key 'airflow-password' in secret 'airflow'>  Optional: false
      AIRFLOW_BASE_URL:                            http://x:8080
      AIRFLOW_LDAP_ENABLE:                         no
      AIRFLOW__WEBSERVER__WORKER_CLASS:            gevent
      AIRFLOW__CODE_EDITOR__ENABLED:               True
      AIRFLOW__CODE_EDITOR__ROOT_DIRECTORY:        /opt/bitnami/airflow/dags
      AIRFLOW__CODE_EDITOR__STRING_NORMALIZATION:  True
      AIRFLOW__CODE_EDITOR__MOUNT:                 name=logs,path=/opt/bitnami/airflow/logs
      _AIRFLOW_PATCH_GEVENT:                       1
      AIRFLOW__CORE__DEFAULT_TIMEZONE:             Australia/Melbourne
      AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE:     Australia/Melbourne
    Mounts:
      /bitnami/python/requirements.txt from requirements (rw,path="requirements.txt")
      /opt/bitnami/airflow/dags from airflow-dag (rw)
  Volumes:
   airflow-dag:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  airflow-dags
    ReadOnly:   false
   requirements:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      airflow-requirements
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   airflow-web-69c54b9bf8 (1/1 replicas created)
Events:          <none>

Not really anything strange in the deployment @Mauraza

Mauraza commented 6 months ago

Did you check whether it could be related to a DNS problem? (documentation)
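
A quick way to rule DNS in or out is to time name resolution from inside the web pod. The sketch below is only an illustration under assumptions: the function name is hypothetical, and it presumes a Python interpreter is available where it runs.

```python
import socket
import time
from typing import List


def time_dns_lookup(hostname: str, port: int = 80, attempts: int = 3) -> List[float]:
    """Return the wall-clock seconds taken by each getaddrinfo() call."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Resolves the name through the system resolver, as most clients do.
        socket.getaddrinfo(hostname, port)
        durations.append(time.perf_counter() - start)
    return durations
```

Calling it with the hostnames that appear in this thread, for example `time_dns_lookup("postgresql-primary.postgres", 5432)` or `time_dns_lookup("redis-master.data-platform", 6379)`, would show whether each lookup takes milliseconds or seconds. Lookups that consistently take several seconds would point toward the cluster DNS rather than the ingress.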

github-actions[bot] commented 6 months ago

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] commented 6 months ago

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.