loft-sh / loft

Namespace & Virtual Cluster Manager for Kubernetes - Lightweight Virtual Clusters, Self-Service Provisioning for Engineers and 70% Cost Savings with Sleep Mode
https://loft.sh/docs/introduction

Helm: Loft deadlocks install if transient failure #169

Closed withinboredom closed 2 years ago

withinboredom commented 2 years ago

During a loft installation via helm, there was a transient failure:

    Error: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://nginx-ingress-nginx-controller-admission.ingress.svc:443/networking/v1/ingresses?timeout=10s": context deadline exceeded

Rerunning the helm command just ends up with this error:

    Error: failed pre-install: warning: Hook pre-install loft/templates/admin/user.yaml failed: object is being deleted: users.storage.loft.sh "admin" already exists

This leaves a single loft-agent pod running and no other loft pods.

Here are the values for the helm chart, FWIW:

    admin:
      create: true
      username: admin
      password: PASSWORD
    ingress:
      enabled: true
      name: loft-ingress
      host: DOMAIN
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt
      tls:
        enabled: true
        secret: tls-loft

loft start was able to resolve the issue.

FabianKramm commented 2 years ago

@withinboredom thanks for creating this issue! The problem is probably the finalizer on the user object, which is needed to clean up user resources. We might think about adding a cleanup hook that removes leftovers like this when you install with plain helm.
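For anyone hitting the same state, here is a rough sketch of how the stuck object could be detected and unblocked by hand. It is not what Loft or loft start does internally. The "object is being deleted" error suggests the admin user has a deletionTimestamp set while a finalizer is still present, so Kubernetes cannot finish deleting it. The helper names `is_stuck_terminating` and `clear_finalizers_cmd` are made up for illustration, and the sketch assumes the users.storage.loft.sh resource is cluster-scoped. It only builds the kubectl command and does not run anything.

```python
import json

# Resource and object name are taken from the error message in this issue.
RESOURCE = "users.storage.loft.sh"
NAME = "admin"


def is_stuck_terminating(obj: dict) -> bool:
    """Return True if the object is marked for deletion but finalizers still block it."""
    meta = obj.get("metadata", {})
    return bool(meta.get("deletionTimestamp")) and bool(meta.get("finalizers"))


def clear_finalizers_cmd(resource: str = RESOURCE, name: str = NAME) -> list:
    """Build (but do not execute) a kubectl merge patch that empties metadata.finalizers.

    Assumption: the resource is cluster-scoped, so no namespace flag is passed.
    """
    patch = json.dumps({"metadata": {"finalizers": None}})
    return ["kubectl", "patch", resource, name, "--type=merge", "-p", patch]


if __name__ == "__main__":
    # Hypothetical object, shaped like the output of `kubectl get ... -o json`.
    # The finalizer name is a placeholder, not Loft's real finalizer.
    stuck_user = {
        "metadata": {
            "name": NAME,
            "deletionTimestamp": "2021-01-01T00:00:00Z",
            "finalizers": ["example.invalid/cleanup"],
        }
    }
    if is_stuck_terminating(stuck_user):
        print(" ".join(clear_finalizers_cmd()))
```

Clearing finalizers by hand skips whatever cleanup the finalizer was there to perform, so this is a last resort for a broken install rather than something to run routinely.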

withinboredom commented 2 years ago

NP, you guys have built a fantastic product!

FabianKramm commented 2 years ago

@withinboredom thanks a lot for the kind words, glad you like it!

carlmontanari commented 2 years ago

I think loft start sorted this out because it appears to delete the admin user before redeploying (via helm), which aligns with the finalizer comment above. I think/hope this is OK now (and the issue is quite old), so I'll close this out, but please re-open if this is still a problem!