fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

Removing Helm release (the basic one) manually leads to HelmRelease stuck in "upgrade error" instead of re-installing #891

Open joaocc opened 6 months ago

joaocc commented 6 months ago

Describe the bug

After installing a HelmRelease (flux2), if we remove the corresponding "Helm" release (plain helm), flux2 seems to get stuck trying to upgrade a non-existing release.

Message
Helm upgrade failed for release my-ns/my-helmrel with chart my-chart@2.0.0-beta.61+264f3c0e57f2: "my-helmrel" has no deployed releases
Last Helm logs:
2024-01-27T15:32:03.716328505Z: preparing upgrade for my-helmrel

Steps to reproduce

Deploy this HelmRelease

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-helmrel
  namespace: my-ns
spec:
  chart:
    spec:
      chart: ./lib/helm/my-chart
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: GitRepository
        name: my-git-repo
        namespace: flux-system
      version: '*'
  install:
    remediation:
      retries: 3
  interval: 5m
  rollback:
    cleanupOnFail: true
  upgrade:
    disableWait: true
    remediation:
      remediateLastFailure: false
      retries: 3
      strategy: rollback
  values:
    (...)

Wait for flux to finish installing the corresponding Helm release (plain)

Remove the "my-helmrel" Helm release (the base one, not the HelmRelease)

Expected behavior

The helm-controller should detect that the Helm release is no longer installed and install it again (instead of continuing to attempt an upgrade).

Screenshots and recordings

No response

OS / Distro

N/A

Flux version

N/A

Flux check

flux check
► checking prerequisites
✔ Kubernetes 1.28.2-eks-28c5e82 >=1.26.0-0
► checking version in cluster
✔ distribution: flux-2.2.2
✔ bootstrapped: false
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.37.2
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.37.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.31.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.2.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.2.3
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.2.3
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta3
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta2
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta3
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

N/A

Container Registry provider

N/A

Additional context

{"level":"info","ts":"2024-01-27T16:02:04.197Z","msg":"HelmChart/flux-system/my-ns-my-helmrel with SourceRef 'GitRepository/flux-system/my-chart-resource' is in-sync","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"my-helmrel","namespace":"my-ns"},"namespace":"my-ns","name":"my-helmrel","reconcileID":"4e841bcf-5caf-40df-bef5-4ba060a45b6f"}
{"level":"info","ts":"2024-01-27T16:02:04.376Z","msg":"release not managed by controller: found existing release in storage","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"my-helmrel","namespace":"my-ns"},"namespace":"my-ns","name":"my-helmrel","reconcileID":"4e841bcf-5caf-40df-bef5-4ba060a45b6f"}
{"level":"info","ts":"2024-01-27T16:02:04.418Z","msg":"running 'upgrade' action with timeout of 5m0s","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"my-helmrel","namespace":"my-ns"},"namespace":"my-ns","name":"my-helmrel","reconcileID":"4e841bcf-5caf-40df-bef5-4ba060a45b6f"}
{"level":"error","ts":"2024-01-27T16:02:04.541Z","msg":"Reconciler error","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"my-helmrel","namespace":"my-ns"},"namespace":"my-ns","name":"my-helmrel","reconcileID":"4e841bcf-5caf-40df-bef5-4ba060a45b6f","error":"\"my-helmrel\" has no deployed releases"}

gruberdev commented 6 months ago

Correct me if I am wrong, but from helm.toolkit.fluxcd.io/v2beta2 onwards you have to enable spec.driftDetection to monitor the state of the underlying resources of a HelmRelease.

Based on what you're describing, if it is not a HelmRelease and driftDetection is not enabled, the helm-controller won't be able to track and report its status after deployment, even if the resource is a Helm chart/release.
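For reference, a minimal sketch of what enabling drift detection on the HelmRelease from the report might look like. The fields other than driftDetection are assumed to stay as posted above. Whether this setting changes the behavior described in this issue is not established in the thread.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: my-helmrel
  namespace: my-ns
spec:
  # chart, interval, install, rollback and upgrade are assumed to be
  # unchanged from the manifest in the report.
  driftDetection:
    # "enabled" detects and corrects drift of the released resources;
    # "warn" only reports it; "disabled" turns detection off.
    mode: enabled
```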

There's also the possibility of using a Kustomization to deploy your base Helm chart, with healthChecks to monitor the health status of the release. As far as I understand, you can use both.
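One possible reading of that suggestion, as a rough sketch: a Kustomization that applies the manifests and lists the HelmRelease under healthChecks. The Kustomization name and the path are hypothetical, and the source is assumed to be the my-git-repo GitRepository from the report.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-apps              # hypothetical name
  namespace: flux-system
spec:
  interval: 5m
  path: ./apps               # hypothetical path inside the repository
  prune: true
  sourceRef:
    kind: GitRepository
    name: my-git-repo        # assumed: the repository from the report
  healthChecks:
    # The Kustomization is only reported ready once this object is ready.
    - apiVersion: helm.toolkit.fluxcd.io/v2beta2
      kind: HelmRelease
      name: my-helmrel
      namespace: my-ns
```

This only surfaces the HelmRelease status on the Kustomization; it does not by itself re-install a Helm release that was removed manually.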

souleb commented 3 months ago

hello @joaocc, do you have this behavior on v2.2.3?

my test gives me:

{"level":"info","ts":"2024-04-17T12:26:30.618Z","msg":"release not managed by controller: release not observed to be made for object","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"podinfo","namespace":"default"},"namespace":"default","name":"podinfo","reconcileID":"ea659b95-6061-4849-9bfb-644f3ff360a7"}
{"level":"info","ts":"2024-04-17T12:26:30.639Z","msg":"running 'upgrade' action with timeout of 5m0s","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"podinfo","namespace":"default"},"namespace":"default","name":"podinfo","reconcileID":"ea659b95-6061-4849-9bfb-644f3ff360a7"}
{"level":"info","ts":"2024-04-17T12:26:31.116Z","msg":"release has not been tested","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"podinfo","namespace":"default"},"namespace":"default","name":"podinfo","reconcileID":"ea659b95-6061-4849-9bfb-644f3ff360a7"}
{"level":"info","ts":"2024-04-17T12:26:31.136Z","msg":"running 'test' action with timeout of 5m0s","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"podinfo","namespace":"default"},"namespace":"default","name":"podinfo","reconcileID":"ea659b95-6061-4849-9bfb-644f3ff360a7"}
{"level":"info","ts":"2024-04-17T12:27:04.698Z","msg":"release in-sync with desired state","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"podinfo","namespace":"default"},"namespace":"default","name":"podinfo","reconcileID":"ea659b95-6061-4849-9bfb-644f3ff360a7"}

joaocc commented 3 months ago

We haven't upgraded to 2.2.3 yet (we are using the pre-release patch that includes Helm 3.14).

souleb commented 3 months ago

So it should work.

You should either have an upgrade if there is a previous release or a new install action. I tested with helm-controller v0.37.4.

If you still have the issue, please provide a reproducible example of your deletion steps.