kestra-io / helm-charts


Error: INSTALLATION FAILED: failed post-install: timed out waiting for the condition #24

Closed by smyja 8 months ago

smyja commented 11 months ago

Expected Behavior

Expected the chart to install successfully and to be publicly accessible.

Actual Behaviour

I get Error: INSTALLATION FAILED: failed post-install: timed out waiting for the condition when I run helm install kestra kestra/kestra. I increased the timeout with helm install kestra kestra/kestra --set startupapicheck.timeout=5m and still got the same error.
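As an aside, --set startupapicheck.timeout=5m only sets a chart value; Helm's own wait on post-install hooks is controlled by the --timeout flag (default 5m0s). A minimal sketch:

# Give the post-install hook more time before Helm gives up
helm install kestra kestra/kestra --timeout 10m

# Optionally also wait for all resources (not just hooks) to become ready
helm install kestra kestra/kestra --timeout 10m --wait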

When I run kubectl get pods -w I get:

NAME                                   READY   STATUS             RESTARTS       AGE
kestra-minio-686d88c7bb-vdkz5          0/1     Pending            0              7m21s
kestra-minio-make-bucket-job-d9d6l     0/1     CrashLoopBackOff   4 (48s ago)    7m21s
kestra-postgresql-0                    1/1     Running            0              7m21s
kestra-standalone-77884789b7-r7sqf     2/2     Running            2 (7m3s ago)   7m21s

Also, there's no load balancer exposing it to the public when I run kubectl get service:

NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
kestra-minio           ClusterIP      10.0.5.205     <none>          9000/TCP                     13m
kestra-minio-console   ClusterIP      10.0.141.35    <none>          9001/TCP                     13m
kestra-postgresql      ClusterIP      10.0.20.52     <none>          5432/TCP                     13m
kestra-postgresql-hl   ClusterIP      None           <none>          5432/TCP                     13m
kestra-service         ClusterIP      10.0.14.78     <none>          8080/TCP                     13m
kubernetes             ClusterIP      10.0.0.1       <none>          443/TCP                      69m
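Since everything is ClusterIP at this point, a quick way to reach the UI without exposing it publicly is a port-forward; a minimal sketch using the service name and port from the output above:

# Forward local port 8080 to the kestra-service ClusterIP service
kubectl port-forward svc/kestra-service 8080:8080
# then browse to http://localhost:8080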

Steps To Reproduce

No response

Environment Information

tchiotludo commented 11 months ago

@smyja can you provide the logs of the crash-looping pod kestra-minio-make-bucket-job-d9d6l, please?

smyja commented 11 months ago

@smyja can you provide the logs of the crash-looping pod kestra-minio-make-bucket-job-d9d6l, please?

maro [ ~ ]$ kubectl describe pod kestra-minio-make-bucket-job-cr2sf 
Name:             kestra-minio-make-bucket-job-cr2sf
Namespace:        default
Priority:         0
Service Account:  minio-sa
Node:             aks-nodepool1-39589112-vmss00000b/10.224.0.11
Start Time:       Sun, 10 Dec 2023 14:58:51 +0000
Labels:           app=minio-job
                  batch.kubernetes.io/controller-uid=87277c96-74d2-4f0e-9eda-cc698f6015e0
                  batch.kubernetes.io/job-name=kestra-minio-make-bucket-job
                  controller-uid=87277c96-74d2-4f0e-9eda-cc698f6015e0
                  job-name=kestra-minio-make-bucket-job
                  release=kestra
Annotations:      <none>
Status:           Running
IP:               10.244.5.3
IPs:
  IP:           10.244.5.3
Controlled By:  Job/kestra-minio-make-bucket-job
Containers:
  minio-mc:
    Container ID:  containerd://8d44faa8e0201da3251659d7eaacce6ef55b93c6a82d969f78090dfcb666831b
    Image:         quay.io/minio/mc:RELEASE.2022-10-20T23-26-33Z
    Image ID:      quay.io/minio/mc@sha256:50ee58bc9770131288a438a7e3b0ddce4f572a7e2438c7110646ed0817e65b1f
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      /config/initialize
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 10 Dec 2023 15:04:48 +0000
      Finished:     Sun, 10 Dec 2023 15:05:47 +0000
    Ready:          False
    Restart Count:  4
    Requests:
      memory:  128Mi
    Environment:
      MINIO_ENDPOINT:  kestra-minio
      MINIO_PORT:      9000
    Mounts:
      /config from minio-configuration (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fjxgn (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  minio-configuration:
    Type:                Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:       kestra-minio
    ConfigMapOptional:   <nil>
    SecretName:          kestra-minio
    SecretOptionalName:  <nil>
  kube-api-access-fjxgn:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  7m55s                 default-scheduler  Successfully assigned default/kestra-minio-make-bucket-job-cr2sf to aks-nodepool1-39589112-vmss00000b
  Normal   Pulling    7m55s                 kubelet            Pulling image "quay.io/minio/mc:RELEASE.2022-10-20T23-26-33Z"
  Normal   Pulled     7m50s                 kubelet            Successfully pulled image "quay.io/minio/mc:RELEASE.2022-10-20T23-26-33Z" in 4.828436708s (4.828443008s including waiting)
  Normal   Created    118s (x5 over 7m50s)  kubelet            Created container minio-mc
  Normal   Started    118s (x5 over 7m50s)  kubelet            Started container minio-mc
  Normal   Pulled     118s (x4 over 6m30s)  kubelet            Container image "quay.io/minio/mc:RELEASE.2022-10-20T23-26-33Z" already present on machine
  Warning  BackOff    32s (x10 over 5m30s)  kubelet            Back-off restarting failed container minio-mc in pod kestra-minio-make-bucket-job-cr2sf_default(dcd75b89-634b-4817-938e-cc35d174562d)
maro [ ~ ]$ 
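Note that kubectl describe shows events but not the container's own output; a minimal sketch of pulling the logs that were actually asked for, using the pod and job names from the output above:

# Logs from the previous, failed run of the crash-looping container
kubectl logs kestra-minio-make-bucket-job-cr2sf --previous

# Or address the job directly, regardless of the pod name suffix
kubectl logs job/kestra-minio-make-bucket-job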

Also, here's the describe output for the minio pod:

^Cmaro [ ~ ]$ kubectl describe pod kestra-minio-686d88c7bb-wd79m 
Name:             kestra-minio-686d88c7bb-wd79m
Namespace:        default
Priority:         0
Service Account:  minio-sa
Node:             <none>
Labels:           app=minio
                  pod-template-hash=686d88c7bb
                  release=kestra
Annotations:      checksum/config: e649c2b2e83899705ea0d800d2b37606ff26572eb1fcd020c6fd57a3d79f6fdf
                  checksum/secrets: b65218564c458a2298c1ba50df8e5b468c0c9d1bb8ebb18f33ca4001030a12d5
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    ReplicaSet/kestra-minio-686d88c7bb
Containers:
  minio:
    Image:       quay.io/minio/minio:RELEASE.2022-10-24T18-35-07Z
    Ports:       9000/TCP, 9001/TCP
    Host Ports:  0/TCP, 0/TCP
    Command:
      /bin/sh
      -ce
      /usr/bin/docker-entrypoint.sh minio server /export -S /etc/minio/certs/ --address :9000 --console-address :9001
    Requests:
      memory:  16Gi
    Environment:
      MINIO_ROOT_USER:             <set to the key 'rootUser' in secret 'kestra-minio'>      Optional: false
      MINIO_ROOT_PASSWORD:         <set to the key 'rootPassword' in secret 'kestra-minio'>  Optional: false
      MINIO_PROMETHEUS_AUTH_TYPE:  public
    Mounts:
      /export from export (rw)
      /tmp/credentials from minio-user (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-htg8h (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  export:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kestra-minio
    ReadOnly:   false
  minio-user:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kestra-minio
    Optional:    false
  kube-api-access-htg8h:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  16m                default-scheduler  0/10 nodes are available: 10 Insufficient memory. preemption: 0/10 nodes are available: 10 No preemption victims found for incoming pod..
  Warning  FailedScheduling  15m (x2 over 15m)  default-scheduler  0/10 nodes are available: 10 Insufficient memory. preemption: 0/10 nodes are available: 10 No preemption victims found for incoming pod..

I currently have 10 target nodes and am wondering how many I need. Node pool capacity: VM size Standard DS2 v2 (2 vCPUs, 7 GiB memory), 20 vCPUs and 70 GiB memory in total.
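For context, the FailedScheduling message above is about per-node capacity, not the total: each Standard DS2 v2 node has 7 GiB of memory (less after system reservations), while the minio container requests 16Gi, so no number of DS2 v2 nodes can satisfy that request. The per-node allocatable memory can be checked with, for example:

kubectl describe node aks-nodepool1-39589112-vmss00000b | grep -A 8 Allocatable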

tchiotludo commented 11 months ago

can you add this to your values please:

minio:
  resources: 
    requests:
      memory:  1Gi

it seems the minio default chart requests 16Gi minimum per pod by default :scream:
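A minimal sketch of applying that override, either inline or from a values file (the file name values.yaml is just an example):

# Inline override of the minio sub-chart's memory request
helm upgrade --install kestra kestra/kestra --set minio.resources.requests.memory=1Gi

# Or with a values file containing the snippet above
helm upgrade --install kestra kestra/kestra -f values.yaml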

smyja commented 11 months ago

can you add this to your values please:

minio:
  resources: 
    requests:
      memory:  1Gi

it seems the minio default chart requests 16Gi minimum per pod by default 😱

Just noticed the chart has been updated and it's running now. However, there's no load balancer or external IP for me to reach the dashboard, only ClusterIPs.

tchiotludo commented 11 months ago

yes, you need to enable it yourself; we don't want to expose the service automatically

smyja commented 11 months ago

yes, you need to enable it yourself; we don't want to expose the service automatically

[Screenshot attached: 2023-12-12 at 14:13:55]

Working now; I needed to expose kestra-service, not minio.
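A minimal sketch of one way to do that, assuming a cloud provider that provisions load balancers (an Ingress or the chart's own service settings are the more permanent options):

# Switch kestra-service to a LoadBalancer and wait for an external IP
kubectl patch svc kestra-service -p '{"spec": {"type": "LoadBalancer"}}'
kubectl get svc kestra-service -w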

loicmathieu commented 8 months ago

@koorikla can you confirm that the chart is working correctly and that nothing needs to be done on our side to improve it? If so, I'll close this issue.

koorikla commented 8 months ago

@loicmathieu yes, the chart is working on our side. Although it's worth mentioning that the bitnami minio sub-chart didn't seem to have an option for assigning resources to the minio-make-bucket-job, which can be an issue if the cluster has an admission-controller policy that does not allow deploying objects without resource settings. So minio spins up but is unable to automatically create a bucket. In our case we just used our own central MinIO for S3.

As mentioned in your docs, for a production-grade setup users are supposed to use their own hardened databases anyway.
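For anyone taking the same route of pointing Kestra at an external MinIO/S3 instead of the bundled sub-chart, a rough sketch of the Kestra storage configuration; the key names follow Kestra's MinIO storage docs as best I recall, the endpoint, bucket and credentials are placeholders, and where this YAML is injected depends on the chart version, so check its values.yaml:

kestra:
  storage:
    type: minio
    minio:
      endpoint: my-central-minio.example.com   # placeholder
      port: 9000
      accessKey: <access-key>                  # placeholder
      secretKey: <secret-key>                  # placeholder
      bucket: kestra                           # placeholder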

loicmathieu commented 8 months ago

This should be reported on the Bitnami side.

I'll close this issue, thanks for your feedback.