rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0

[SURE-9137] ClusterValues don't apply changes if one of the clusters is missing the templateValues #2943

Open skanakal opened 1 month ago

skanakal commented 1 month ago

Is there an existing issue for this?

Current Behavior

If a GitRepo targets two or more clusters and the fleet.yaml file references ${ .ClusterValues }, missing templateValues in the spec of any one cluster prevent updates or changes from being deployed, even to the clusters where templateValues are properly configured.

Expected Behavior

Steps To Reproduce

  1. Install Rancher 2.9.2 with Fleet v0.10.3.
  2. Register two downstream clusters, ensuring that only one of them includes templateValues, for example:
apiVersion: fleet.cattle.io/v1alpha1
kind: Cluster
metadata:
  annotations:
  labels:
    foo: bar
    management.cattle.io/cluster-display-name: rke2custom1
    management.cattle.io/cluster-name: c-m-qmc767s2
    objectset.rio.cattle.io/hash: 464bd091084175e4d5572051571f4dfb39bcf2fd
    provider.cattle.io: rke2
  name: rke2custom1
  namespace: fleet-default
spec:
  agentAffinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
              - key: fleet.cattle.io/agent
                operator: In
                values:
                  - 'true'
          weight: 1
  clientID: pl882vs458n4lqqrj8jc58jvkvq4xgqdfv9l7q7spnrhh7s8wjgj8v
  kubeConfigSecret: rke2custom1-kubeconfig
  kubeConfigSecretNamespace: fleet-default
  templateValues:
    generated:
      cluster_metadata:
        fqdn: server-1.example.com
        name: server-1
  3. Create a GitRepo from this example path: templateValues
  4. Check the GitRepo dashboard for resourceReady.

Environment

- Architecture: x86_64
- Fleet Version: fleet:104.0.3+up0.10.3
- Cluster:
  - Provider: custom
  - Options: 1
  - Kubernetes Version: v1.30.5+rke2r1

Logs

From the fleet-controller logs:

2024-10-08T11:59:49Z    DEBUG   bundle  Unchanged bundledeployment  {"controller": "bundle", "controllerGroup": "fleet.cattle.io", "controllerKind": "Bundle", "Bundle": {"name":"mcc-rke2custom1-managed-system-upgrade-controller","namespace":"fleet-default"}, "namespace": "fleet-default", "name": "mcc-rke2custom1-managed-system-upgrade-controller", "reconcileID": "04c5c324-f0f4-4f19-bc31-1e11a890da3e", "bundledeployment": {"apiVersion": "fleet.cattle.io/v1alpha1", "kind": "BundleDeployment", "namespace": "cluster-fleet-default-rke2custom1-43138de7906f", "name": "mcc-rke2custom1-managed-system-upgrade-controller"}, "operation": "unchanged"}
2024-10-08T11:59:49Z    DEBUG   bundle  Unchanged bundledeployment  {"controller": "bundle", "controllerGroup": "fleet.cattle.io", "controllerKind": "Bundle", "Bundle": {"name":"fleet-agent-rke2custom1","namespace":"fleet-default"}, "namespace": "fleet-default", "name": "fleet-agent-rke2custom1", "reconcileID": "d63cdb5d-544d-4356-b269-350b5564aa21", "bundledeployment": {"apiVersion": "fleet.cattle.io/v1alpha1", "kind": "BundleDeployment", "namespace": "cluster-fleet-default-rke2custom1-43138de7906f", "name": "fleet-agent-rke2custom1"}, "operation": "unchanged"}
2024-10-08T11:59:49Z    ERROR   Reconciler error    {"controller": "bundle", "controllerGroup": "fleet.cattle.io", "controllerKind": "Bundle", "Bundle": {"name":"templatevalues-templatevalues-5bfacaa9","namespace":"fleet-default"}, "namespace": "fleet-default", "name": "templatevalues-templatevalues-5bfacaa9", "reconcileID": "2a8aaea7-2194-46c2-a923-bf6f745b1a4a", "error": "failed to render helm values template: template: values:56:40: executing \"values\" at <.ClusterValues.generated.cluster_metadata.fqdn>: map has no entry for key \"generated\""}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:324
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:222
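
The "map has no entry for key" message in the error above is what Go's text/template produces when a template is executed with the missingkey=error option and a map lookup fails. The following is a minimal sketch that reproduces the message outside Fleet. It assumes the "${" and "}" delimiters seen in fleet.yaml and an empty ClusterValues map for the cluster without templateValues; the render helper is hypothetical and is not Fleet's actual code.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render executes a values template against per-cluster data.
// Assumption: the "${" / "}" delimiters and missingkey=error are inferred
// from the issue and its log output; this is not Fleet's implementation.
func render(tpl string, data map[string]any) (string, error) {
	t, err := template.New("values").
		Delims("${", "}").
		Option("missingkey=error").
		Parse(tpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	const tpl = "${ .ClusterValues.generated.cluster_metadata.fqdn }"

	// Cluster with templateValues, as in the Cluster spec from the report.
	withValues := map[string]any{"ClusterValues": map[string]any{
		"generated": map[string]any{
			"cluster_metadata": map[string]any{"fqdn": "server-1.example.com"},
		},
	}}
	// Cluster without templateValues (assumed to surface as an empty map).
	withoutValues := map[string]any{"ClusterValues": map[string]any{}}

	out, err := render(tpl, withValues)
	fmt.Println("with templateValues:", out, err)

	_, err = render(tpl, withoutValues)
	fmt.Println("without templateValues:", err)
}
```

The second call fails with an error whose tail matches the one in the controller log, which suggests that the lookup on the cluster lacking templateValues is what aborts the bundle reconcile.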

Anything else?

Current behavior: [screenshot, image not available]

manno commented 1 month ago

We should not fail all bundle deployments when one cluster is missing a label.
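
One way to picture that suggestion is to render the values template once per target cluster and record failures per cluster instead of returning on the first error. The sketch below only illustrates that idea under stated assumptions: renderAll, its input shape, and the cluster name rke2custom2 are hypothetical, and nothing here reflects how Fleet is or will be implemented.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderAll renders tpl once per cluster and keeps going after a failure.
// clusterValues maps a cluster name to its templateValues (hypothetical shape).
// It returns the rendered output for clusters that succeeded and the error
// for each cluster that failed.
func renderAll(tpl string, clusterValues map[string]map[string]any) (map[string]string, map[string]error) {
	rendered := map[string]string{}
	failed := map[string]error{}

	t, err := template.New("values").
		Delims("${", "}").
		Option("missingkey=error").
		Parse(tpl)
	if err != nil {
		// A parse error is not cluster-specific, so it affects every target.
		for name := range clusterValues {
			failed[name] = err
		}
		return rendered, failed
	}

	for name, values := range clusterValues {
		if values == nil {
			values = map[string]any{}
		}
		var sb strings.Builder
		if err := t.Execute(&sb, map[string]any{"ClusterValues": values}); err != nil {
			failed[name] = err // record and continue with the next cluster
			continue
		}
		rendered[name] = sb.String()
	}
	return rendered, failed
}

func main() {
	const tpl = "${ .ClusterValues.generated.cluster_metadata.fqdn }"
	rendered, failed := renderAll(tpl, map[string]map[string]any{
		"rke2custom1": {"generated": map[string]any{
			"cluster_metadata": map[string]any{"fqdn": "server-1.example.com"},
		}},
		"rke2custom2": {}, // hypothetical second cluster without templateValues
	})
	fmt.Println("rendered:", rendered)
	for name, err := range failed {
		fmt.Println("failed:", name, err)
	}
}
```

With this shape, the cluster that has templateValues still gets its rendered values, while the failure stays attached to the cluster that caused it.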