fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0

HelmRelease does not reconcile from a ready=false state if the content of the ConfigMap in the valuesFrom field is changed. #1031

Closed · piatroumaxim closed this 1 month ago

piatroumaxim commented 1 month ago

Starting from version 2.2.3 (and also on version 2.3.0), a HelmRelease does not reconcile from a ready=false state when the content of the ConfigMap referenced in the valuesFrom field changes. Example manifest:

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: test-helmrelease
  namespace: default
spec:
  releaseName: test-helmrelease
  chart:
    spec:
      chart: chartname
      sourceRef:
        kind: HelmRepository
        name: helmreponame
        namespace: flux-system
      version: "1.2.*"
  interval: 1m0s
  targetNamespace: default
  timeout: 5m0s
  valuesFrom:
    - kind: ConfigMap
      name: test-configmap
      valuesKey: key
      targetPath: target.path.key
      optional: false
  values:
    some-values
    -//- 

If the status of this HelmRelease is ready=false (for example, because the 5-minute timeout expired or deployment pods are not in a ready state), changes to test-configmap do not trigger the reconciliation process again. This issue does not occur when the HelmRelease status is ready=true; in that case, reconciliation happens at the specified interval.

k get hr -n default test-helmrelease -o json | jq .status.conditions

[
  {
    "lastTransitionTime": "2024-07-17T09:15:18Z",
    "message": "Failed to upgrade after 1 attempt(s)",
    "observedGeneration": 28,
    "reason": "RetriesExceeded",
    "status": "True",
    "type": "Stalled"
  },
  {
    "lastTransitionTime": "2024-07-17T09:15:18Z",
    "message": "Helm upgrade failed for release default/test-helmrelease with chart chartname@1.2.5: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline",
    "observedGeneration": 28,
    "reason": "UpgradeFailed",
    "status": "False",
    "type": "Ready"
  },
  {
    "lastTransitionTime": "2024-07-17T09:15:18Z",
    "message": "Helm upgrade failed for release default/test-helmrelease with chart chartname@1.2.5: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline",
    "observedGeneration": 28,
    "reason": "UpgradeFailed",
    "status": "False",
    "type": "Released"
  }
]

Any help would be appreciated.

kingdonb commented 1 month ago

I think this is intentional, based on the "Stalled" condition: when a HelmRelease goes into Stalled, that is supposed to indicate an unrecoverable failure, so it will not be reconciled again until there is a new generation. A new generation is produced by a change in the spec (or metadata?) of the resource.

Because you are using a direct reference to the ConfigMap in your valuesFrom, the Helm Controller does not "see" the change, and so it does not reconcile again. You can force a reconcile now (this is a newer feature), but that is an imperative extra step and not ideal; what you are really meant to do is use a configMapGenerator.
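For reference, forcing a one-off reconcile from the CLI might look like this. This is a sketch assuming a recent Flux CLI (roughly v2.3+, where the `--reset` and `--force` flags for HelmReleases were introduced); on older versions these flags may not exist:

```shell
# --reset clears the failure/retry counters that put the release
# into Stalled; --force re-runs the Helm action immediately.
# Release name and namespace are taken from the example manifest above.
flux reconcile helmrelease test-helmrelease -n default --reset --force
```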

This has the effect of updating the ConfigMap's metadata (its name) with a hash suffix that changes every time the content of the ConfigMap changes. It only works if the ConfigMap is colocated with the HelmRelease in Git (or whatever storage you use to pull the HelmRelease YAML into the cluster: OCI, Bucket, what have you). The generator also updates the reference in the HelmRelease.spec, which creates a new generation; the Helm Controller then gets out of the Stalled condition and tries again, triggered by the ConfigMap change.
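The configMapGenerator approach described above might look roughly like this in a kustomization.yaml colocated with the HelmRelease manifest. This is a sketch under assumptions: the file names helmrelease.yaml and values.yaml are hypothetical, and the key name matches the valuesKey (`key`) from the example manifest:

```yaml
# kustomization.yaml, in the same path as the HelmRelease manifest.
# Kustomize appends a content hash to the generated ConfigMap name
# (e.g. test-configmap-abc123) and rewrites the valuesFrom reference,
# so a values change produces a new HelmRelease generation.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - helmrelease.yaml
configMapGenerator:
  - name: test-configmap
    files:
      - key=values.yaml  # assumed values file; "key" matches valuesKey above
```

Note that rewriting the reference inside HelmRelease.spec.valuesFrom relies on Flux's kustomize-controller, which knows about Flux kinds; plain kustomize outside the cluster may not rewrite that field.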

Is your configmap in the same repo as the HelmRelease definition, does that help?

Ref: https://fluxcd.io/flux/guides/helmreleases/#refer-to-values-in-configmaps-generated-with-kustomize (slack Ref: https://cloud-native.slack.com/archives/CLAJ40HV3/p1721308386490299)

piatroumaxim commented 1 month ago

Not a bug; issue resolved in Slack (Ref: https://cloud-native.slack.com/archives/CLAJ40HV3/p1721308386490299). Thanks @kingdonb