raffOps closed this issue 2 years ago.
@rjribeiro what version of airflow itself are you using?
NOTE: multiple schedulers are only supported in airflow 2.0+.
2.1.3
@rjribeiro Is something specifically not working?
I believe "could not obtain lock"
is NOT actually an error, but part of the system deciding which scheduler is currently "active" for some tasks which cannot be shared.
The PR https://github.com/apache/airflow/pull/19842 has removed this error to help users not get confused (its not in any released versions of airflow yet, however).
Also see a similar question from another user: https://giters.com/apache/airflow/issues/19811
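For anyone who wants to see where the message itself comes from: Postgres writes that ERROR line to its own log whenever a `SELECT ... FOR UPDATE NOWAIT` asks for a row lock another session already holds, which is the kind of row-level locking the HA scheduler relies on. The sketch below is NOT Airflow's actual code, just a minimal two-session reproduction; it assumes a reachable Postgres with a `slot_pool` table (containing at least one row) and uses `psycopg2`:

```python
# Minimal sketch (not Airflow's code): reproduce the benign
# 'ERROR: could not obtain lock on row in relation "slot_pool"'
# message that Postgres logs when a second session requests an
# already-held row lock with NOWAIT.
import psycopg2

DSN = "dbname=airflow user=postgres"  # hypothetical connection string

conn_a = psycopg2.connect(DSN)
conn_b = psycopg2.connect(DSN)

with conn_a.cursor() as cur_a:
    # Session A (think: scheduler 1) takes the row locks and keeps its
    # transaction open while it works.
    cur_a.execute("SELECT * FROM slot_pool FOR UPDATE NOWAIT")

    with conn_b.cursor() as cur_b:
        try:
            # Session B (think: scheduler 2) asks for the same rows without
            # waiting; Postgres raises LockNotAvailable and logs the ERROR line.
            cur_b.execute("SELECT * FROM slot_pool FOR UPDATE NOWAIT")
        except psycopg2.errors.LockNotAvailable:
            # The losing session just rolls back and tries again later --
            # nothing is broken, the lock simply went to the other scheduler.
            conn_b.rollback()

conn_a.rollback()
conn_a.close()
conn_b.close()
```

The only visible side effect is the ERROR line in the database log, which is why both schedulers keep working normally despite it.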
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
PSA: this is fixed in airflow 2.3 https://github.com/apache/airflow/pull/19842/
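For chart users: the airflow version is controlled by the image you point the chart at, so picking up the fix means moving that image to 2.3+ (and making sure the chart version you run supports Airflow 2.3, per the chart's release notes). A minimal sketch of the relevant values, assuming a custom image rebuilt on an `apache/airflow:2.3.x` base and using a hypothetical tag name:

```yaml
airflow:
  image:
    ## hypothetical example -- point this at an image built from Airflow >= 2.3.0,
    ## which includes https://github.com/apache/airflow/pull/19842
    repository: gcr.io/xxxxxx/airflow
    tag: 2.3.0-myenv
    pullPolicy: Always
```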
What is the bug?
Yesterday I configured the web, worker and scheduler workloads to have 2 replicas each, with a podDisruptionBudget of minAvailable: "1". Since then, the following error has been appearing in the database logs:
db=hml,user=postgres ERROR: could not obtain lock on row in relation "slot_pool"
What version of the chart are you using?
I am using version 8.2.0 of this chart.
What version of Kubernetes are you using?
What version of Helm are you using?
What are your custom helm values?
```yaml
## enable this value if you pass `--wait` to your `helm install`
## helmWait: false

###################################
# Airflow - Common Configs
###################################
airflow:
  ## if we use legacy 1.10 airflow commands
  legacyCommands: false

  ## configs for the airflow container image
  image:
    repository: gcr.io/xxxxxx/airflow
    tag: {{ ENV }}
    ## values: Always or IfNotPresent
    pullPolicy: Always
    pullSecret: ""
    uid: 50000
    gid: 50000

  executor: CeleryExecutor

  ## environment variables for airflow configs
  ##
  ## NOTE:
  ## - config docs: https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html
  ## - airflow configs env-vars are structured: "AIRFLOW__{config_section}__{config_name}"
  ##
  ## EXAMPLE:
  ##   config:
  ##     ## dags
  ##     AIRFLOW__CORE__LOAD_EXAMPLES: "False"
  ##     AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "30"
  ##
  ##     ## email
  ##     ## domain used in airflow emails
  ##     AIRFLOW__WEBSERVER__BASE_URL: "http://airflow.example.com"
  ##
  ##     ## ether environment variables
  ##     HTTP_PROXY: "http://proxy.example.com:8080"
  ##
  config:
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: "False"
    AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: "30"
    PYTHONPATH: $PYTHONPATH:/opt/airflow/dags:/opt/airflow/dags/repo
    AIRFLOW__WEBSERVER__BASE_URL: "http://{{ IP_AIRFLOW }}/{{ ENV }}"
    AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT: 'google-cloud-platform://'
    AIRFLOW__CORE__LOAD_EXAMPLES: "False"

    ## remote log storage
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "gs://xxxxxxxx-airflow-log/{{ ENV }}"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "google_cloud_default"

    ## smtps
    AIRFLOW__EMAIL__EMAIL_BACKEND: "airflow.utils.email.send_email_smtp"
    AIRFLOW__SMTP__SMTP_HOST: "smtp.gmail.com"
    AIRFLOW__SMTP__SMTP_STARTTLS: "True"
    AIRFLOW__SMTP__SMTP_SSL: "False"
    AIRFLOW__SMTP__SMTP_PORT: "587"

  ## a list of initial users to create
  users:
    - username: admin
      password: xxxxxxxxxxxxxx
      role: Admin
      email: xxxxxxxx
      firstName: Dados
      lastName: xxxxxxxxxxxxxxxx

  ## if we update users or just create them the first time (lookup by `username`)
  ## NOTE:
  ## - if enabled, the chart will revert any changes made in the web-ui to users defined
  ##   in `users` (including passwords)
  usersUpdate: True

  variables:
    - key: "env"
      value: {{ ENV }}
    - key: "localizacao_cluster"
      value: {{ LOCATION }}

  ## if we update variables or just create them the first time (lookup by `key`)
  ## NOTE:
  ## - if enabled, the chart will revert any changes made in the web-ui to variables
  ##   defined in `variables`
  variablesUpdate: false

  ## extra environment variables for the web/scheduler/worker/flower Pods
  ## SPEC - EnvVar: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#envvar-v1-core
  extraEnv:
    - name: PYTHONPATH
      value: "/opt/airflow/dags:/opt/airflow/dags/repo"
    - name: AIRFLOW__WEBSERVER__SECRET_KEY
      valueFrom:
        secretKeyRef:
          name: airflow
          key: airflow-webserver-key
    - name: AIRFLOW__CORE__FERNET_KEY
      valueFrom:
        secretKeyRef:
          name: airflow
          key: airflow-fernet-key
    - name: AIRFLOW__SMTP__SMTP_USER
      valueFrom:
        secretKeyRef:
          name: airflow
          key: stmp-mailfrom
    - name: AIRFLOW__SMTP__SMTP_MAIL_FROM
      valueFrom:
        secretKeyRef:
          name: airflow
          key: stmp-mailfrom
    - name: AIRFLOW__SMTP__SMTP_PASSWORD
      valueFrom:
        secretKeyRef:
          name: airflow
          key: smtp-password
    - name: AIRFLOW__VAR_PROXY-TOKEN
      valueFrom:
        secretKeyRef:
          name: airflow
          key: proxy-token

###################################
# Airflow - Scheduler Configs
###################################
scheduler:
  ## the number of scheduler Pods to run
  ## NOTE:
  ## - if you set this >1 we recommend defining a `scheduler.podDisruptionBudget`
  replicas: 2

  ## resource requests/limits for the scheduler Pod
  ## SPEC - ResourceRequirements: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#resourcerequirements-v1-core
  resources:
    limits:
      cpu: 1
      memory: 0.5G
    requests:
      cpu: 1
      memory: 0.5G

  ## the nodeSelector configs for the scheduler Pods
  ## DOCS: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector: {}

  ## the affinity configs for the scheduler Pods
  ## SPEC - Affinity: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#affinity-v1-core
  affinity: {}

  ## the toleration configs for the scheduler Pods
  ## SPEC - Toleration: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#toleration-v1-core
  tolerations: []

  ## the security context for the scheduler Pods
  ## SPEC - SecurityContext: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#securitycontext-v1-core
  securityContext: {}

  ## labels for the scheduler Deployment
  labels: {}

  ## Pod labels for the scheduler Deployment
  podLabels: {}

  ## annotations for the scheduler Deployment
  annotations: {}

  ## Pod annotations for the scheduler Deployment
  podAnnotations: {}

  ## if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true"
  safeToEvict: true

  ## configs for the PodDisruptionBudget of the scheduler
  podDisruptionBudget:
    ## if a PodDisruptionBudget resource is created for the scheduler
    enabled: true
    ## the maximum unavailable pods/percentage for the scheduler
    maxUnavailable: ""
    ## the minimum available pods/percentage for the scheduler
    minAvailable: "1"

  ## sets `airflow --num_runs` parameter used to run the airflow scheduler
  numRuns: -1

  ## configs for the scheduler Pods' liveness probe
  ## NOTE:
  ## - `periodSeconds` x `failureThreshold` = max seconds a scheduler can be unhealthy
  livenessProbe:
    enabled: true
    initialDelaySeconds: 10
    periodSeconds: 30
    timeoutSeconds: 10
    failureThreshold: 5

  ## extra pip packages to install in the scheduler Pods
  ## EXAMPLE:
  ##   extraPipPackages:
  ##     - "SomeProject==1.0.0"
  extraPipPackages: []

  ## extra VolumeMounts for the scheduler Pods
  ## SPEC - VolumeMount: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volumemount-v1-core
  extraVolumeMounts: []

  ## extra Volumes for the scheduler Pods
  ## SPEC - Volume: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volume-v1-core
  extraVolumes: []

  ## extra init containers to run in the scheduler Pods
  ## SPEC - Container: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#container-v1-core
  extraInitContainers: []

###################################
# Airflow - WebUI Configs
###################################
web:
  ## configs to generate webserver_config.py
  webserverConfig:
    ## the full text value to mount as the webserver_config.py file
    ## NOTE:
    ## - if set, will override all values except `webserverConfig.existingSecret`
    ## EXAMPLE:
    ##   stringOverride: |-
    ##     from airflow import configuration as conf
    ##     from flask_appbuilder.security.manager import AUTH_DB
    ##
    ##     # the SQLAlchemy connection string
    ##     SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')
    ##
    ##     # use embedded DB for auth
    ##     AUTH_TYPE = AUTH_DB
    stringOverride: ""
    ## the name of a pre-created secret containing a `webserver_config.py` file as a key
    existingSecret: ""

  ## the number of web Pods to run
  ## NOTE:
  ## - if you set this >1 we recommend defining a `web.podDisruptionBudget`
  replicas: 2

  ## resource requests/limits for the web Pod
  ## SPEC - ResourceRequirements: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#resourcerequirements-v1-core
  resources:
    limits:
      cpu: 0.3
      memory: 2G
    requests:
      cpu: 0.3
      memory: 2G

  ## the nodeSelector configs for the web Pods
  ## DOCS: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector: {}

  ## the affinity configs for the web Pods
  ## SPEC - Affinity: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#affinity-v1-core
  affinity: {}

  ## the toleration configs for the web Pods
  ## SPEC - Toleration: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#toleration-v1-core
  tolerations: []

  ## the security context for the web Pods
  ## SPEC - SecurityContext: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#securitycontext-v1-core
  securityContext: {}

  ## labels for the web Deployment
  labels: {}

  ## Pod labels for the web Deployment
  podLabels: {}

  ## annotations for the web Deployment
  annotations: {}

  ## Pod annotations for the web Deployment
  podAnnotations: {}

  ## if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true"
  safeToEvict: true

  ## configs for the PodDisruptionBudget of the web Deployment
  podDisruptionBudget:
    ## if a PodDisruptionBudget resource is created for the web Deployment
    enabled: true
    ## the maximum unavailable pods/percentage for the web Deployment
    maxUnavailable: ""
    ## the minimum available pods/percentage for the web Deployment
    minAvailable: "1"

  ## configs for the Service of the web Pods
  service:
    annotations:
      networking.gke.io/load-balancer-type: "Internal"
    sessionAffinity: "None"
    sessionAffinityConfig: {}
    type: LoadBalancer
    externalPort: 8080
    loadBalancerIP: {{ IP_AIRFLOW }}
    loadBalancerSourceRanges: []
    nodePort:
      http: 32080

  ## configs for the web Pods' readiness probe
  readinessProbe:
    enabled: true
    initialDelaySeconds:
    periodSeconds: 60
    timeoutSeconds: 5
    failureThreshold: 6

  ## configs for the web Pods' liveness probe
  livenessProbe:
    enabled: true
    initialDelaySeconds: 60
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 6

  ## extra pip packages to install in the web Pods
  ## EXAMPLE:
  ##   extraPipPackages:
  ##     - "SomeProject==1.0.0"
  extraPipPackages: []

  ## extra VolumeMounts for the web Pods
  ## SPEC - VolumeMount: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volumemount-v1-core
  extraVolumeMounts: []

  ## extra Volumes for the web Pods
  ## SPEC - Volume: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volume-v1-core
  extraVolumes: []

###################################
# Airflow - Celery Worker Configs
###################################
workers:
  ## if the airflow workers StatefulSet should be deployed
  enabled: true

  ## the number of worker Pods to run
  ## NOTE:
  ## - if you set this >1 we recommend defining a `workers.podDisruptionBudget`
  ## - this is the minimum when `workers.autoscaling.enabled` is true
  replicas: 2

  ## resource requests/limits for the worker Pod
  ## SPEC - ResourceRequirements: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#resourcerequirements-v1-core
  resources:
    limits:
      cpu: 0.3
      memory: 1.5G
    requests:
      cpu: 0.3
      memory: 1.5G

  ## the nodeSelector configs for the worker Pods
  ## DOCS: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector: {}

  ## the affinity configs for the worker Pods
  ## SPEC - Affinity: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#affinity-v1-core
  affinity: {}

  ## the toleration configs for the worker Pods
  ## SPEC - Toleration: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#toleration-v1-core
  tolerations: []

  ## the security context for the worker Pods
  ## SPEC - SecurityContext: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#securitycontext-v1-core
  securityContext: {}

  ## labels for the worker StatefulSet
  labels: {}

  ## Pod labels for the worker StatefulSet
  podLabels: {}

  ## annotations for the worker StatefulSet
  annotations: {}

  ## Pod annotations for the worker StatefulSet
  podAnnotations: {}

  ## if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true"
  safeToEvict: true

  ## configs for the PodDisruptionBudget of the worker StatefulSet
  podDisruptionBudget:
    ## if a PodDisruptionBudget resource is created for the worker StatefulSet
    enabled: true
    ## the maximum unavailable pods/percentage for the worker StatefulSet
    maxUnavailable: ""
    ## the minimum available pods/percentage for the worker StatefulSet
    minAvailable: "1"

  ## configs for the HorizontalPodAutoscaler of the worker Pods
  ## NOTE:
  ## - if using git-sync, ensure `dags.gitSync.resources` is set
  ## EXAMPLE:
  ##   autoscaling:
  ##     enabled: true
  ##     maxReplicas: 16
  ##     metrics:
  ##       - type: Resource
  ##         resource:
  ##           name: memory
  ##           target:
  ##             type: Utilization
  ##             averageUtilization: 80
  autoscaling:
    enabled: true
    maxReplicas: 4
    metrics:
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 80

  ## configs for the celery worker Pods
  celery:
    ## if celery worker Pods are gracefully terminated
    ##
    ## graceful termination process:
    ##   1. prevent worker accepting new tasks
    ##   2. wait AT MOST `workers.celery.gracefullTerminationPeriod` for tasks to finish
    ##   3. send SIGTERM to worker
    ##   4. wait AT MOST `workers.terminationPeriod` for kill to finish
    ##   5. send SIGKILL to worker
    ##
    ## NOTE:
    ## - consider defining a `workers.podDisruptionBudget` to prevent there not being
    ##   enough available workers during graceful termination waiting periods
    gracefullTermination: false

    ## how many seconds to wait for tasks to finish before SIGTERM of the celery worker
    gracefullTerminationPeriod: 600

  ## how many seconds to wait after SIGTERM before SIGKILL of the celery worker
  ## WARNING:
  ## - tasks that are still running during SIGKILL will be orphaned, this is important
  ##   to understand with KubernetesPodOperator(), as Pods may continue running
  terminationPeriod: 60

  ## extra pip packages to install in the worker Pod
  ## EXAMPLE:
  ##   extraPipPackages:
  ##     - "SomeProject==1.0.0"
  extraPipPackages: []

  ## extra VolumeMounts for the worker Pods
  ## SPEC - VolumeMount: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volumemount-v1-core
  extraVolumeMounts: []

  ## extra Volumes for the worker Pods
  ## SPEC - Volume: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volume-v1-core
  extraVolumes: []

###################################
# Airflow - Flower Configs
###################################
flower:
  ## if the airflow flower UI should be deployed
  enabled: false

  ## the number of flower Pods to run
  ## NOTE:
  ## - if you set this >1 we recommend defining a `flower.podDisruptionBudget`
  replicas: 1

  ## resource requests/limits for the flower Pod
  ## SPEC - ResourceRequirements: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#resourcerequirements-v1-core
  resources: {}

  ## the nodeSelector configs for the flower Pods
  ## DOCS: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector: {}

  ## the affinity configs for the flower Pods
  ## SPEC - Affinity: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#affinity-v1-core
  affinity: {}

  ## the toleration configs for the flower Pods
  ## SPEC - Toleration: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#toleration-v1-core
  tolerations: []

  ## the security context for the flower Pods
  ## SPEC - SecurityContext: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#securitycontext-v1-core
  securityContext: {}

  ## labels for the flower Deployment
  labels: {}

  ## Pod labels for the flower Deployment
  podLabels: {}

  ## annotations for the flower Deployment
  annotations: {}

  ## Pod annotations for the flower Deployment
  podAnnotations: {}

  ## if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true"
  safeToEvict: true

  ## configs for the PodDisruptionBudget of the flower Deployment
  podDisruptionBudget:
    ## if a PodDisruptionBudget resource is created for the flower Deployment
    enabled: false
    ## the maximum unavailable pods/percentage for the flower Deployment
    maxUnavailable: ""
    ## the minimum available pods/percentage for the flower Deployment
    minAvailable: ""

  ## the value of the flower `--auth` argument
  ## NOTE:
  ## - see flower docs: https://flower.readthedocs.io/en/latest/auth.html#google-oauth-2-0
  oauthDomains: ""

  ## the name of a pre-created secret containing the basic authentication value for flower
  ## NOTE:
  ## - this will override any value of `config.AIRFLOW__CELERY__FLOWER_BASIC_AUTH`
  basicAuthSecret: ""

  ## the key within `flower.basicAuthSecret` containing the basic authentication string
  basicAuthSecretKey: ""

  ## configs for the Service of the flower Pods
  service:
    annotations: {}
    type: ClusterIP
    externalPort: 5555
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    nodePort:
      http:

  ## configs for the flower Pods' readiness probe
  readinessProbe:
    enabled: true
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 6

  ## configs for the flower Pods' liveness probe
  livenessProbe:
    enabled: true
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 5
    failureThreshold: 6

  ## extra pip packages to install in the flower Pod
  ## EXAMPLE:
  ##   extraPipPackages:
  ##     - "SomeProject==1.0.0"
  extraPipPackages: []

  ## extra VolumeMounts for the flower Pods
  ## SPEC - VolumeMount: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volumemount-v1-core
  extraVolumeMounts: []

  ## extra Volumes for the flower Pods
  ## SPEC - Volume: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#volume-v1-core
  extraVolumes: []

###################################
# Airflow - Logs Configs
###################################
logs:
  ## the airflow logs folder
  path: /opt/airflow/logs

  ## configs for the logs PVC
  persistence:
    ## if a persistent volume is mounted at `logs.path`
    enabled: false
    ## the name of an existing PVC to use
    existingClaim: ""
    ## sub-path under `logs.persistence.existingClaim` to use
    subPath: ""
    ## the name of the StorageClass used by the PVC
    ## NOTE:
    ## - if set to "", then `PersistentVolumeClaim/spec.storageClassName` is omitted
    ## - if set to "-", then `PersistentVolumeClaim/spec.storageClassName` is set to ""
    storageClass: ""
    ## the access mode of the PVC
    ## WARNING:
    ## - must be "ReadWriteMany" or airflow pods will fail to start
    ## - different StorageClass types support different access modes:
    ##   https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
    accessMode: ReadWriteMany
    ## the size of PVC to request
    size: 1Gi

###################################
# Airflow - DAGs Configs
###################################
dags:
  ## the airflow dags folder
  path: /opt/airflow/dags

  ## configs for the dags PVC
  persistence:
    ## if a persistent volume is mounted at `dags.path`
    enabled: false
    ## the name of an existing PVC to use
    existingClaim: ""
    ## sub-path under `dags.persistence.existingClaim` to use
    subPath: ""
    ## the name of the StorageClass used by the PVC
    ## NOTE:
    ## - if set to "", then `PersistentVolumeClaim/spec.storageClassName` is omitted
    ## - if set to "-", then `PersistentVolumeClaim/spec.storageClassName` is set to ""
    storageClass: ""
    ## the access mode of the PVC
    ## NOTE:
    ## - must be "ReadOnlyMany" or "ReadWriteMany" or airflow pods will fail to start
    ## - different StorageClass types support different access modes:
    ##   https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
    accessMode: ReadOnlyMany
    ## the size of PVC to request
    size: 1Gi

  ## configs for the git-sync sidecar (https://github.com/kubernetes/git-sync)
  gitSync:
    ## if the git-sync sidecar container is enabled
    enabled: true

    ## the git-sync container image
    image:
      repository: k8s.gcr.io/git-sync/git-sync
      tag: v3.2.2
      ## values: Always or IfNotPresent
      pullPolicy: IfNotPresent
      uid: 65533
      gid: 65533

    ## resource requests/limits for the git-sync container
    resources:
      limits:
        cpu: 1
        memory: 2000Mi
      requests:
        cpu: 512m
        memory: 1000Mi

    ## the url of the git repo
    ## EXAMPLE - HTTPS:
    ##   repo: "https://github.com/USERNAME/REPOSITORY.git"
    ## EXAMPLE - SSH:
    ##   repo: "git@github.com:USERNAME/REPOSITORY.git"
    repo: "git@gitlab.com:xxxxxxxxxxxxxxxxxxxxx/dados/dags.git"

    ## the sub-path (within your repo) where dags are located
    ## NOTE:
    ## - only dags under this path (within your repo) will be seen by airflow,
    ##   but the full repo will be cloned
    repoSubPath: "dags"

    ## the git branch to check out
    branch: {{ ENV }}

    ## the git revision (tag or hash) to check out
    revision: HEAD

    ## shallow clone with a history truncated to the specified number of commits
    depth: 1

    ## the number of seconds between syncs
    syncWait: 15

    ## the max number of seconds allowed for a complete sync
    syncTimeout: 120

    ## the name of a pre-created Secret with git http credentials
    httpSecret: ""

    ## the key in `dags.gitSync.httpSecret` with your git username
    httpSecretUsernameKey: ""

    ## the key in `dags.gitSync.httpSecret` with your git password/token
    httpSecretPasswordKey: ""

    ## the name of a pre-created Secret with git ssh credentials
    sshSecret: "airflow-git-sync"

    ## the key in `dags.gitSync.sshSecret` with your ssh-key file
    sshSecretKey: id_rsa

    ## the string value of a "known_hosts" file (for SSH only)
    ## WARNING:
    ## - known_hosts verification will be disabled if left empty, making you more
    ##   vulnerable to repo spoofing attacks
    ## EXAMPLE:
    ##   sshKnownHosts: |-
    ##
```