fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0
398 stars 157 forks source link

[BUG] Helm drift detection on configmap patching '*** (after)' instead of the actual template from the helm chart #925

Closed soumyabk closed 3 months ago

soumyabk commented 4 months ago

Hi Folks, we're running helmcontroller version v0.37.2 and enabled drift detection. Noticing that if any manual change was added to one of the configmaps of our helm chart, the helm controller detects a drifts and patches with a string and not the actual config map from the helm template.

{"level":"debug","ts":"2024-03-26T00:16:19.027Z","msg":"resource modified","controller":"helmrelease","controllerGroup":"helm.toolkit.fluxcd.io","controllerKind":"HelmRelease","HelmRelease":{"name":"<resource_name>","namespace":"flux-system"},"namespace":"flux-system","name":"<servicename>","reconcileID":"94f4e548-7273-44b3-a316-859516ec5dbe","resource":"ConfigMap/flux-system/<service>-config","patch":[{"value":"*** (after)","op":"replace","path":"/data/config.yaml"}]}

Config map after reconcile:

apiVersion: v1
data:
  config.yaml: '*** (after)'
kind: ConfigMap
metadata:
********

configmap template sample:

apiVersion: v1
data:
  config.yaml: |
    slaPeriod: 12h   # if this is manually updated to 6h, then drift detection runs
metadata:
********

the helmrelease also seems to be stuck with ProgressingWithRetry

  conditions:
  - lastTransitionTime: "2024-03-28T00:37:54Z"
    message: Helm upgrade succeeded for release flux-system/<chart_name>.v53 with chart <chart_name>@0.0.5
    observedGeneration: 21
    reason: ProgressingWithRetry
    status: "True"
    type: Reconciling
soumyabk commented 4 months ago

Another thing we noticed, using the annotation helm.toolkit.fluxcd.io/driftDetection: disabled does not skip the resource from being patched wrongly:

apiVersion: v1
data:
  config.yaml: '*** (after)'
kind: ConfigMap
metadata:
  annotations:
    helm.toolkit.fluxcd.io/driftDetection: disabled
    meta.helm.sh/release-name: <chart_name>
    meta.helm.sh/release-namespace: flux-system
  creationTimestamp: "2024-01-25T09:19:13Z"
souleb commented 4 months ago

There have been many changes in the underlying package that perform all the drift and apply between v0.37.2 and v0.37.4. Can you test this on the last version?

souleb commented 4 months ago

Another thing we noticed, using the annotation helm.toolkit.fluxcd.io/driftDetection: disabled does not skip the resource from being patched wrongly:

Are you setting it up manually or as part of the helmChart?

soumyabk commented 4 months ago

There have been many changes in the underlying package that perform all the drift and apply between v0.37.2 and v0.37.4. Can you test this on the last version?

Seeing the problem even after upgrading to v0.37.4

✗ flux version
flux: v2.2.3
helm-controller: v0.37.4
✗ kubectl -n <namespace> get configmaps <service-name>-config -o=jsonpath='{.data}'
{"config.yaml":"*** (after)"}%
soumyabk commented 4 months ago

Another thing we noticed, using the annotation helm.toolkit.fluxcd.io/driftDetection: disabled does not skip the resource from being patched wrongly:

Are you setting it up manually or as part of the helmChart?

we set the annotation on the resource manually, but we see the annotation even after helm reconciles it with the wrong patch

✗ kubeclt -n  <namespace> get configmaps <chart-name>-config -o yaml

apiVersion: v1
data:
  config.yaml: '*** (after)'
kind: ConfigMap
metadata:
  annotations:
    helm.toolkit.fluxcd.io/driftDetection: disabled
    meta.helm.sh/release-name: <chart-name>
    meta.helm.sh/release-namespace: flux-system
  creationTimestamp: "2024-01-25T09:19:13Z"
  labels:
    app: <chart-name>
    app.kubernetes.io/managed-by: Helm
    helm.toolkit.fluxcd.io/name: <chart-name>
    helm.toolkit.fluxcd.io/namespace: flux-system
  name: <chart-name>-config
  namespace: <chart-name>-system
  resourceVersion: "912413821"
  uid: 496b9d65-919b-4f21-a4e3-58070514ea73

We also tried using the driftDetection.ignore on the helmrelease and that does not seem to skip the data section of the configmap as well:

  driftDetection:
    ignore:
    - paths:
      - data/
      target:
        kind: configmap
        name: <chart-name>-config
    mode: enabled

I tried adding the following postRenderer on the helmrelease and did not see it show up on the configmap object

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  annotations:
    reconcile.fluxcd.io/forceAt: "2024-03-27T15:46:07.153497-07:00"
    reconcile.fluxcd.io/requestedAt: "2024-03-27T15:46:07.153497-07:00"
  creationTimestamp: "2024-01-25T09:19:04Z"
  finalizers:
  - finalizers.fluxcd.io
  generation: 26
  labels:
    kustomize.toolkit.fluxcd.io/name: <kustomization-name>
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: <chart-name>
  namespace: flux-system
  resourceVersion: "914078430"
  uid: 7c7f0fd9-1359-457c-b304-7d604ba1a98a
spec:
  chart:
    spec:
      chart: <chart-name>
      interval: 5m
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: helm-repo
        namespace: flux-system
      version: 0.0.5
  driftDetection:
    mode: enabled
  install:
    crds: CreateReplace
  interval: 5m
  maxHistory: 3
  postRenderers:
  - kustomize:
      patches:
      - patch: "- op: add\n  path: /metadata/annotations/helm.toolkit.fluxcd.io~1driftDetection\n
          \ value: disabled              \n"
        target:
          kind: configmap
          name: <chart-name>-config
          namespace: <chart-name>-system
          version: v1
  releaseName: <chart-name>
  upgrade:
    crds: CreateReplace
  valuesFrom:
  - kind: ConfigMap
    name: zone-info
soumyabk commented 4 months ago

Also flux reconcile helmrelease --force does not seem to be able to fix the helmrelease once its stuck in ProgressingWithRetry status.

soumyabk commented 3 months ago

thanks @souleb. When is the next version of the helm-controller release expected, so we can pick up this fix.

stefanprodan commented 3 months ago

@soumyabk see https://github.com/fluxcd/flux2/issues/4712

soumyabk commented 3 months ago

Also, can you confirm what is the right way to get helm-reconcile to ignore specific resources. I had tried adding this annotation helm.toolkit.fluxcd.io/driftDetection: disabled and also using the driftDetection.ignore field in the helmrelease and in both situations, the drift detection tried to patch the resource

  driftDetection:
    ignore:
    - paths:
      - data/
      target:
        kind: configmap
        name: <chart-name>-config
    mode: enabled
stefanprodan commented 3 months ago

It's kind: ConfigMap

stefanprodan commented 3 months ago

Also json paths must begin with / instead of ending with it. See here the docs: https://fluxcd.io/flux/components/helm/helmreleases/#ignore-rules

Should be:

  driftDetection:
    ignore:
      - paths: ["/data"]
        target:
          kind: ConfigMap
soumyabk commented 3 months ago

thanks that worked. Also, what is the recommended way to use this label helm.toolkit.fluxcd.io/driftDetection: disabled - can it be patched manually on some resource we want to temporarily disable on a cluster to test/debug something or does it need to come as a kustomization patch.

stefanprodan commented 3 months ago

@soumyabk you’ll be able to annotate objects in-cluster after Flux 2.3 release.