fluxcd / helm-controller

The GitOps Toolkit Helm reconciler, for declarative Helming
https://fluxcd.io
Apache License 2.0
404 stars 161 forks source link

[BUG] Drift Detection can not be disabled for specific resources using annotations or labels #922

Closed fbuchmeier-abi closed 5 months ago

fbuchmeier-abi commented 5 months ago

Hi folks!

According to the documentation, drift detection and correction can be disabled for specific resources by setting the following annotation on the resource:

metadata:
  annotations:
    helm.toolkit.fluxcd.io/driftDetection: disabled

In my tests, this has not been working. Resources annotated with this value still get corrected after a drift was added to the resources.

Since Flagger relies on this annotation to stop the helm-controller from reconciling resources owned by Flagger (e.g. services and their selector, this makes driftDetection impossible to use with any Flagger Canary deployments.

Using driftDetection in mode: enabled with Flagger results in both controllers (Helm & Flagger) updating the same object, from the service.spec.selector defined in the Helm Release to the service.spec.selector defined by Flagger canaries:

apiVersion: v1
kind: Service
  selector:
    app.kubernetes.io/instance: myapp-primary
apiVersion: v1
kind: Service
  selector:
    app.kubernetes.io/chart: mychart
    app.kubernetes.io/instance: myapp
    app_name: myapp

As you can imagine this constant change in the pod selector is a big problem when either the canary deployment is scaled to zero (which it normally is) or when there are different versions currently deployed as canary and primary.

I've traced the problem to the use of ListOptions in the Helm Controller: https://github.com/fluxcd/helm-controller/blob/main/internal/action/diff.go#L102

    // Base configuration for the diffing of the object.
    diffOpts := []jsondiff.ListOption{
        jsondiff.FieldOwner(fieldOwner),
        jsondiff.ExclusionSelector{v2.DriftDetectionMetadataKey: v2.DriftDetectionDisabledValue},
        jsondiff.Rationalize(true),
        jsondiff.Graceful(true),
    }

The ExclusionSelector includes all key-value pairs that for resources that should be added to the exclusion list.

This is implemented for ResourceOptions here: https://github.com/fluxcd/pkg/blob/main/ssa/jsondiff/options.go#L40

// ResourceOptions holds options for the server-side apply diff operation.
type ResourceOptions struct {
...
    // IgnorePaths is a list of JSON pointers to ignore when comparing objects.
    IgnorePaths []string
    // ExclusionSelector is a map of annotations or labels which mark a
    // resource to be excluded from the server-side apply diff.
    ExclusionSelector map[string]string
    // MaskSecrets enables masking of Kubernetes Secrets in the diff.
    MaskSecrets bool
...
}

However, ListOptions currently do not support the ExclusionSelector as far as I can see: https://github.com/fluxcd/pkg/blob/main/ssa/jsondiff/options.go#L57

// ListOptions holds options for the server-side apply diff operation.
type ListOptions struct {
    // IgnoreRules is a list of rules that specify which paths to ignore
    // for which resources.
    IgnoreRules []IgnoreRule
    // Graceful enables graceful handling of errors during a server-side
    // apply diff operation. If enabled, the diff operation will continue
    // even if an error occurs for a single resource.
    Graceful bool
}

What would be required to add this logic also to ListOptions, or am I missing something?

Thank you and best regards,

Florian.