patsevanton opened this issue 3 years ago
kubectl logs -n xxxxx airflow-webserver-86857b5969-sqkv6
Error from server (BadRequest): container "webserver" in pod "airflow-webserver-86857b5969-sqkv6" is waiting to start: PodInitializing
kubectl logs -n xxxxx airflow-postgresql-0
postgresql 05:56:01.18
postgresql 05:56:01.18 Welcome to the Bitnami postgresql container
postgresql 05:56:01.18 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql
postgresql 05:56:01.18 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql/issues
postgresql 05:56:01.18 Send us your feedback at containers@bitnami.com
postgresql 05:56:01.19
postgresql 05:56:01.20 INFO ==> ** Starting PostgreSQL setup **
postgresql 05:56:01.23 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql 05:56:01.24 INFO ==> Loading custom pre-init scripts...
postgresql 05:56:01.24 INFO ==> Initializing PostgreSQL database...
postgresql 05:56:01.25 INFO ==> postgresql.conf file not detected. Generating it...
postgresql 05:56:01.25 INFO ==> pg_hba.conf file not detected. Generating it...
postgresql 05:56:02.32 INFO ==> Starting PostgreSQL in background...
postgresql 05:56:02.44 INFO ==> Changing password of postgres
postgresql 05:56:02.45 INFO ==> Configuring replication parameters
postgresql 05:56:02.47 INFO ==> Configuring fsync
postgresql 05:56:02.47 INFO ==> Loading custom scripts...
postgresql 05:56:02.48 INFO ==> Enabling remote connections
postgresql 05:56:02.48 INFO ==> Stopping PostgreSQL...
postgresql 05:56:03.49 INFO ==> ** PostgreSQL setup finished! **
postgresql 05:56:03.52 INFO ==> ** Starting PostgreSQL **
2021-04-13 05:56:03.537 GMT [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2021-04-13 05:56:03.537 GMT [1] LOG: listening on IPv6 address "::", port 5432
2021-04-13 05:56:03.556 GMT [1] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-04-13 05:56:03.586 GMT [178] LOG: database system was shut down at 2021-04-13 05:56:02 GMT
2021-04-13 05:56:03.596 GMT [1] LOG: database system is ready to accept connections
2021-04-13 05:56:10.476 GMT [193] LOG: incomplete startup packet
2021-04-13 05:56:12.106 GMT [194] LOG: incomplete startup packet
2021-04-13 05:57:20.415 GMT [284] LOG: incomplete startup packet
2021-04-13 05:57:22.399 GMT [286] LOG: incomplete startup packet
2021-04-13 05:58:44.731 GMT [397] LOG: incomplete startup packet
2021-04-13 05:58:45.741 GMT [398] LOG: incomplete startup packet
2021-04-13 06:00:17.733 GMT [533] LOG: incomplete startup packet
2021-04-13 06:00:18.752 GMT [534] LOG: incomplete startup packet
2021-04-13 06:02:18.723 GMT [703] LOG: incomplete startup packet
2021-04-13 06:02:21.740 GMT [714] LOG: incomplete startup packet
2021-04-13 06:04:51.723 GMT [917] LOG: incomplete startup packet
2021-04-13 06:05:01.784 GMT [933] LOG: incomplete startup packet
2021-04-13 06:08:53.728 GMT [1248] LOG: incomplete startup packet
2021-04-13 06:08:56.783 GMT [1256] LOG: incomplete startup packet
2021-04-13 06:15:15.739 GMT [1773] LOG: incomplete startup packet
2021-04-13 06:15:16.759 GMT [1780] LOG: incomplete startup packet
kubectl logs -n xxxxx airflow-scheduler-658d5d4454-r2sgl
error: a container name must be specified for pod airflow-scheduler-658d5d4454-r2sgl, choose one of: [scheduler scheduler-gc] or one of the init containers: [wait-for-airflow-migrations]
kubectl describe -n xxxxx pod airflow-scheduler-658d5d4454-r2sgl
Name: airflow-scheduler-658d5d4454-r2sgl
Namespace: xxxxx
Priority: 0
Node: ubuntu1804/192.168.22.7
Start Time: Tue, 13 Apr 2021 05:54:59 +0000
Labels: component=scheduler
pod-template-hash=658d5d4454
release=airflow
tier=airflow
Annotations: checksum/airflow-config: d84f720b402097e58a879efc896869845ec8bae56455470bf241221b2a016f19
checksum/extra-configmaps: 2e44e493035e2f6a255d08f8104087ff10d30aef6f63176f1b18f75f73295598
checksum/extra-secrets: bb91ef06ddc31c0c5a29973832163d8b0b597812a793ef911d33b622bc9d1655
checksum/metadata-secret: a954626eab69d09b0c9bfd44128c793948c18d943d9e97431903985654b350c5
checksum/pgbouncer-config-secret: da52bd1edfe820f0ddfacdebb20a4cc6407d296ee45bcb500a6407e2261a5ba2
checksum/result-backend-secret: af25d110685219c9219e6a4f9b268566118a4b732de33192387a111d1f241c89
cluster-autoscaler.kubernetes.io/safe-to-evict: true
Status: Pending
IP: 10.1.78.6
IPs:
IP: 10.1.78.6
Controlled By: ReplicaSet/airflow-scheduler-658d5d4454
Init Containers:
wait-for-airflow-migrations:
Container ID: containerd://ac2a25e781647e59aa341e5e308ebbef60408d69b1a2f6b5f2d83df808718ec2
Image: apache/airflow:2.0.0
Image ID: docker.io/apache/airflow@sha256:e973fef20d3be5b6ea328d2707ac87b90f680382790d1eb027bd7766699b2409
Port: <none>
Host Port: <none>
Args:
python
-c
import airflow
import logging
import os
import time
from alembic.config import Config
from alembic.runtime.migration import MigrationContext
from alembic.script import ScriptDirectory
from airflow import settings
package_dir = os.path.abspath(os.path.dirname(airflow.__file__))
directory = os.path.join(package_dir, 'migrations')
config = Config(os.path.join(package_dir, 'alembic.ini'))
config.set_main_option('script_location', directory)
config.set_main_option('sqlalchemy.url', settings.SQL_ALCHEMY_CONN.replace('%', '%%'))
script_ = ScriptDirectory.from_config(config)
timeout = 60
with settings.engine.connect() as connection:
    context = MigrationContext.configure(connection)
    ticker = 0
    while True:
        source_heads = set(script_.get_heads())
        db_heads = set(context.get_current_heads())
        if source_heads == db_heads:
            break
        if ticker >= timeout:
            raise TimeoutError("There are still unapplied migrations after {} seconds.".format(ticker))
        ticker += 1
        time.sleep(1)
        logging.info('Waiting for migrations... %s second(s)', ticker)
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 13 Apr 2021 06:15:15 +0000
Finished: Tue, 13 Apr 2021 06:16:24 +0000
Ready: False
Restart Count: 7
Environment:
AIRFLOW__CORE__FERNET_KEY: <set to the key 'fernet-key' in secret 'airflow-fernet-key'> Optional: false
AIRFLOW__CORE__SQL_ALCHEMY_CONN: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW_CONN_AIRFLOW_DB: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from airflow-scheduler-token-q6zfr (ro)
Containers:
scheduler:
Container ID:
Image: apache/airflow:2.0.0
Image ID:
Port: <none>
Host Port: <none>
Args:
bash
-c
exec airflow scheduler
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Liveness: exec [python -Wignore -c import os
os.environ['AIRFLOW__CORE__LOGGING_LEVEL'] = 'ERROR'
os.environ['AIRFLOW__LOGGING__LOGGING_LEVEL'] = 'ERROR'
from airflow.jobs.scheduler_job import SchedulerJob
from airflow.utils.db import create_session
from airflow.utils.net import get_hostname
import sys
with create_session() as session:
    job = session.query(SchedulerJob).filter_by(hostname=get_hostname()).order_by(
        SchedulerJob.latest_heartbeat.desc()).limit(1).first()
sys.exit(0 if job.is_alive() else 1)
] delay=10s timeout=5s period=30s #success=1 #failure=10
Environment:
AIRFLOW__CORE__FERNET_KEY: <set to the key 'fernet-key' in secret 'airflow-fernet-key'> Optional: false
AIRFLOW__CORE__SQL_ALCHEMY_CONN: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW_CONN_AIRFLOW_DB: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
Mounts:
/opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
/opt/airflow/logs from logs (rw)
/opt/airflow/pod_templates/pod_template_file.yaml from config (ro,path="pod_template_file.yaml")
/var/run/secrets/kubernetes.io/serviceaccount from airflow-scheduler-token-q6zfr (ro)
scheduler-gc:
Container ID:
Image: apache/airflow:2.0.0
Image ID:
Port: <none>
Host Port: <none>
Args:
bash
/clean-logs
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/opt/airflow/logs from logs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from airflow-scheduler-token-q6zfr (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: airflow-airflow-config
Optional: false
logs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
airflow-scheduler-token-q6zfr:
Type: Secret (a volume populated by a Secret)
SecretName: airflow-scheduler-token-q6zfr
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned xxxxx/airflow-scheduler-658d5d4454-r2sgl to ubuntu1804
Normal Pulling 24m kubelet Pulling image "apache/airflow:2.0.0"
Normal Pulled 24m kubelet Successfully pulled image "apache/airflow:2.0.0"
Normal Created 17m (x5 over 24m) kubelet Created container wait-for-airflow-migrations
Normal Started 17m (x5 over 24m) kubelet Started container wait-for-airflow-migrations
Normal Pulled 17m (x4 over 22m) kubelet Container image "apache/airflow:2.0.0" already present on machine
Warning BackOff 4m58s (x50 over 21m) kubelet Back-off restarting failed container
The same issue occurs with the namespace airflow:
git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace airflow
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.7.xip.io --namespace airflow airflow ./
@mik-laj @kaxil Please take a look at this issue. Big thanks!
How do I enable debug output in wait-for-airflow-migrations? Thanks!
Check the logs of the run-airflow-migrations container in {{ .Release.Name }}-run-airflow-migrations. wait-for-airflow-migrations just waits for the migrations to be run.
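For reference, the gist of that init container is a simple polling loop. The sketch below is a simplified stand-in, not the chart's exact code: `get_source_heads` and `get_db_heads` are hypothetical stubs for `ScriptDirectory.get_heads()` and `MigrationContext.get_current_heads()` from alembic.

```python
import time


def wait_for_migrations(get_source_heads, get_db_heads, timeout=60, sleep=time.sleep):
    """Poll until the DB's alembic heads match the code's heads, or time out.

    get_source_heads / get_db_heads are hypothetical stand-ins for the
    alembic calls the real init container makes.
    """
    ticker = 0
    while True:
        if set(get_source_heads()) == set(get_db_heads()):
            return ticker  # migrations have been applied
        if ticker >= timeout:
            raise TimeoutError(
                "There are still unapplied migrations after {} seconds.".format(ticker)
            )
        ticker += 1
        sleep(1)


# If the DB already matches the code, the loop exits immediately.
assert wait_for_migrations(lambda: ["abc"], lambda: ["abc"], sleep=lambda s: None) == 0
```

The key point: this container never runs migrations itself; if the migration Job never executes, the heads never match and the loop is guaranteed to hit the `TimeoutError` seen in the logs below.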
kubectl logs -n airflow airflow-scheduler-78d9ffb5ff-5lw8f wait-for-airflow-migrations
DB_BACKEND=postgresql
DB_HOST=airflow-postgresql.airflow.svc.cluster.local
DB_PORT=5432
[2021-04-14 04:07:12,861] {migration.py:155} INFO - Context impl PostgresqlImpl.
[2021-04-14 04:07:12,862] {migration.py:162} INFO - Will assume transactional DDL.
[2021-04-14 04:07:18,838] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
[2021-04-14 04:07:20,086] {<string>:35} INFO - Waiting for migrations... 1 second(s)
[2021-04-14 04:07:21,090] {<string>:35} INFO - Waiting for migrations... 2 second(s)
[... identical "Waiting for migrations..." lines, logged once per second for 3 through 58 second(s), elided ...]
[2021-04-14 04:08:18,222] {<string>:35} INFO - Waiting for migrations... 59 second(s)
[2021-04-14 04:08:19,224] {<string>:35} INFO - Waiting for migrations... 60 second(s)
Traceback (most recent call last):
File "<string>", line 32, in <module>
TimeoutError: There are still unapplied migrations after 60 seconds.
I faced this problem some days ago. However, today I tried installing Airflow using the script shown below, and it seems to be working.
#!/bin/bash -x
rm -rf airflow
git clone https://github.com/apache/airflow.git
cd airflow/chart
helm dependency update
helm install airflow . -n airflow
results
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
airflow-postgresql-0 1/1 Running 0 88s
airflow-scheduler-db9d85f4d-5nx6j 2/2 Running 0 88s
airflow-statsd-5556dc96bc-c24f5 1/1 Running 0 88s
airflow-webserver-f4f4cb77f-xwgcn 1/1 Running 0 88s
Logs of the wait-for-airflow-migrations container
$ kubectl logs -f airflow-scheduler-db9d85f4d-5nx6j wait-for-airflow-migrations
DB_BACKEND=postgresql
DB_HOST=airflow-postgresql.airflow.svc.cluster.local
DB_PORT=5432
........
[2021-04-14 05:16:38,136] {migration.py:155} INFO - Context impl PostgresqlImpl.
[2021-04-14 05:16:38,137] {migration.py:162} INFO - Will assume transactional DDL.
[2021-04-14 05:16:42,390] {opentelemetry_tracing.py:29} INFO - This service is instrumented using OpenTelemetry. OpenTelemetry could not be imported; please add opentelemetry-api and opentelemetry-instrumentation packages in order to get BigQuery Tracing data.
[2021-04-14 05:16:44,341] {<string>:35} INFO - Waiting for migrations... 1 second(s)
[2021-04-14 05:16:45,348] {<string>:35} INFO - Waiting for migrations... 2 second(s)
[2021-04-14 05:16:46,351] {<string>:35} INFO - Waiting for migrations... 3 second(s)
[2021-04-14 05:16:47,355] {<string>:35} INFO - Waiting for migrations... 4 second(s)
[2021-04-14 05:16:48,367] {<string>:35} INFO - Waiting for migrations... 5 second(s)
[2021-04-14 05:16:49,371] {<string>:35} INFO - Waiting for migrations... 6 second(s)
[2021-04-14 05:16:50,375] {<string>:35} INFO - Waiting for migrations... 7 second(s)
[2021-04-14 05:16:51,380] {<string>:35} INFO - Waiting for migrations... 8 second(s)
My environment:
Apache Airflow version: master (git)
@patsevanton I asked for logs from a different container in https://github.com/apache/airflow/issues/15340#issuecomment-818861093 😄 (the names are a bit confusing)
Check the logs of the run-airflow-migrations (not wait-for-airflow-migrations) container in {{ .Release.Name }}-run-airflow-migrations.
I cannot reproduce it right now. I will try again later.
@kaxil
kubectl logs -n xxxxx airflow-scheduler-0 run-airflow-migrations
error: container run-airflow-migrations is not valid for pod airflow-scheduler-0
kubectl logs -n xxxxx run-airflow-migrations
Error from server (NotFound): pods "run-airflow-migrations" not found
@kaxil
git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace apatsev
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.8.sslip.io --namespace apatsev airflow ./
The pod is not found:
kubectl logs -n apatsev airflow-run-airflow-migrations
Error from server (NotFound): pods "airflow-run-airflow-migrations" not found
kubectl logs -n apatsev airflow-run-airflow-migrations run-airflow-migrations
Error from server (NotFound): pods "airflow-run-airflow-migrations" not found
https://github.com/apache/airflow/blob/6e31465a30dfd17e2e1409a81600b2e83c910036/chart/templates/migrate-database-job.yaml#L27 is a Job, but I don't have that job:
kubectl get all -A | grep Job
kubectl get all -A | grep job
FYI, have you tried setting "wait" to false? I found this works for me: https://forum.astronomer.io/t/run-airflow-migration-and-wait-for-airflow-migrations/1189/10
@LiboShen How do I add wait=false to the install? I install Airflow like this:
git clone https://github.com/apache/airflow.git
cd airflow/chart/
helm dependency update
kubectl create namespace apatsev
werf helm install --wait --set webserver.defaultUser.password=password,ingress.enabled=true,ingress.hosts[0]=airflow.192.168.22.8.sslip.io --namespace apatsev airflow ./
Should I create a file or add an option?
This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
This issue has been closed because it has not received response from the issue author.
I'm facing the same issue. I don't ever get any pod or job containing run-airflow-migrations, and consequently the wait never ends. Is there a solution for this? I'm not running terraform, and neither is patsevanton. He is using werf; I'm using flux.
tldr;
@autarchprinceps I'm having the same trouble here. When deploying the chart to our local machines, everything runs fine, but deploying the chart to our cluster using helm does not create the run-airflow-migrations and create-airflow-user jobs.
You have to set the --wait flag with helm!
I also ran into this issue. In the interest of saving time for anyone else who stumbles upon it, the fix seems to be setting --wait=false on the Helm command, per @LiboShen's advice.
On Rancher you can un-check "Wait" on the final page before deploying. I'm sure OpenShift and other solutions have similar options for exposing the underlying Helm --wait flag.
I can confirm that this worked on Rancher v2.6.1 installing Airflow to a downstream cluster provisioned by RKE running Kubernetes v1.21.5.
I'm using ArgoCD to deploy the Helm chart, tearing out my hair trying every possible variation, but I'm also not seeing the run-airflow-migrations pod; it doesn't run or show up. So my webserver and scheduler wait forever for migrations that never start.
Not sure how to set the --wait=false param using Argo. I tried argocd app set [my-app] --helm-set-string wait=false, but it doesn't seem to do anything.
So I'm stuck.
@yehoshuadimarsky Did you find the way to do this? I am also trying to implement the exact same thing and have been struggling to get the issue fixed.
Yes! I finally got this to work: put this in your values.yaml
override:
  # per https://github.com/apache/airflow/pull/16291
  # and https://github.com/apache/airflow/pull/16331
  createUserJob:
    jobAnnotations:
      "argocd.argoproj.io/hook": Sync
      "argocd.argoproj.io/sync-wave": "0"
      "argocd.argoproj.io/hook-delete-policy": BeforeHookCreation,HookSucceeded
  migrateDatabaseJob:
    jobAnnotations:
      "argocd.argoproj.io/hook": Sync
      "argocd.argoproj.io/sync-wave": "0"
      "argocd.argoproj.io/hook-delete-policy": BeforeHookCreation,HookSucceeded
Thanks! I tried adding this annotation in my values.yaml file, but for some reason it does not seem to work. When I add this annotation, my application fails with a validation error, the values.yaml file does not even get loaded in the ArgoCD UI, and the line number where I added the annotation is shown as the error. Maybe I am missing something, not sure. Is there any specific version of the chart or of argo-cd I am supposed to use to get this working?
I'm seeing the same issue when deploying using helm to k8s.
Using versions:
- chart: 1.3.0
- app: 2.2.1
It seems that these hooks were recently made configurable: https://github.com/apache/airflow/blob/main/chart/values.yaml#L632
I just ran a quick test, and indeed the job is now scheduled/run; after it completes, the scheduler/webserver pods spin up.
I don't fully understand why/how it was built this way...?
Paul
Cool, I didn't know this was added recently, this should solve the problem really nicely
https://github.com/apache/airflow/blob/main/chart/values.yaml#L632
Make sure you put it in the correct parts of the YAML file. I was referring to the jobAnnotations of each of the migration Jobs, such as here and here.
Workaround I've used:
Seeing the same issue. Confirmed this workaround works (I'm using terraform to run the chart):
wait = "false"
set {
  name  = "airflow.dbMigrations.runAsJob"
  value = "true"
}
Anyone using ArgoCD to deploy the Airflow Helm chart who reaches this issue: read this piece of documentation.
When installing the chart using ArgoCD, you MUST set the two following values, or your application will not start as the migrations will not be run:
createUserJob.useHelmHooks: false
migrateDatabaseJob.useHelmHooks: false
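In values.yaml form, as I understand those two keys (double-check against the chart version you actually deploy), this would look like:

```yaml
# Disable Helm hooks on the jobs so ArgoCD renders and syncs them
# as ordinary resources instead of skipping them.
createUserJob:
  useHelmHooks: false
migrateDatabaseJob:
  useHelmHooks: false
```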
In my case, just removing the --wait flag was enough, I didn't have to fiddle with airflow.dbMigrations.runAsJob
I came to this issue upgrading to airflow 2.3.0 via helm chart 1.6.0. The issue turned out to be the migration job not being able to schedule due to a too low CPU request limit on the k8s namespace. Just another thing to check if you end up here like I did.
@dingobar Thank you for the input. I am using the same versions and having (maybe similar) migration issue. May I ask you to elaborate on the solution?
Run kubectl -n <namespace> get events and see if there are events showing that some things are not being scheduled. If not, then also try kubectl -n <namespace> describe replicaset <migration job replicaset> and see if there are events there that can give you a clue. In my case, the scheduler could not start the migration pod due to a limit on how much CPU the namespace could request in total. You can see the limits by describing the namespace: kubectl describe namespace <namespace>. Hope that helps.
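For context, a namespace-scoped ResourceQuota like the hypothetical one below can silently prevent the migration pod from scheduling once the namespace's aggregate requests are exhausted (the names and values here are illustrative, not from this issue):

```yaml
# Hypothetical example of a quota that can starve the migration Job:
# once running pods already request the full budget, new pods stay Pending.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: airflow
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
```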
@dingobar - I'm also facing the same resource-limit issue during my Airflow Helm deployment in a K8s cluster. It looks like the requested resources exceed the resource limit in my namespace. May I ask you for a solution, if you have fixed it?
Apache Airflow version: master (git)
Kubernetes version (if you are using kubernetes) (use kubectl version):
Environment:
What happened: see the logs above.
Full log: https://gist.github.com/patsevanton/0edd5571cf69aa539edcdb803c288061