rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0
1.52k stars 229 forks source link

[v0.10] Backport of "helm/resource-policy" was set to keep when installing certain helm charts via fleet, for unknown reason. #2860

Closed manno closed 1 month ago

manno commented 1 month ago

Backport of #2716

Is there an existing issue for this?

Current Behavior

When using fleet to deploy the following fleet charts, workload resources are annotated with helm/resource-policy: keep, but we could not figure out why. By searching the source code of their helm charts, we can only see the crds are annotated with such setting. app helm repo helm charts GitHub repository
argo workflows https://argoproj.github.io/argo-helm argo-workflows:0.41.14 https://github.com/argoproj/argo-helm/tree/main/charts/argo-workflows
cert-manager https://charts.jetstack.io cert-manager:v1.15.2 https://github.com/cert-manager/cert-manager/tree/master/deploy/charts/cert-manager
bitnami/cert-manager oci://registry-1.docker.io/bitnamicharts/cert-manager :1.3.16 https://github.com/bitnami/charts/blob/main/bitnami/cert-manager/

Using argo-workflows:0.41.14 as an example, the only resources annotated with helm.sh/resource-policy: keep are those under /templates/crds: https://github.com/search?q=repo%3Aargoproj%2Fargo-helm+path%3A%2Fcharts%2Fargo-workflows%2F**+%22resource-policy%22&type=code

But when installing via fleet config:

defaultNamespace: argo

helm:
  # Use a custom location for the Helm chart. This can refer to any go-getter URL.
  # This allows one to download charts from most any location.  Also know that
  # go-getter URL supports adding a digest to validate the download. If repo
  # is set below this field is the name of the chart to lookup
  chart: argo-workflows

  repo: https://argoproj.github.io/argo-helm

  version: 0.41.14
  releaseName: argo-workflows
  values:
    fullnameOverride: argo-workflows
    server:
      extraArgs: [--auth-mode=server]
    images:
      pullPolicy: IfNotPresent
    crds:
      keep: false # avoid argo conflicts between different gitrepo due to ownership annotation

The resulted deployments are annotated with helm.sh/resource-policy set to keep

metadata:
  name: argo-workflows-server
  annotations:
    deployment.kubernetes.io/revision: '1'
    helm.sh/resource-policy: keep
    meta.helm.sh/release-name: argo-workflows
    meta.helm.sh/release-namespace: argo
    objectset.rio.cattle.io/id: default-resource-policy-bug-app-argo
#    key: string
  creationTimestamp: '2024-08-08T15:03:50Z'
  generation: 1
  labels:
    app: server
    app.kubernetes.io/component: server
    app.kubernetes.io/instance: argo-workflows
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: argo-workflows-server
    app.kubernetes.io/part-of: argo-workflows
    app.kubernetes.io/version: v3.5.10
    helm.sh/chart: argo-workflows-0.41.14
    objectset.rio.cattle.io/hash: 9755058833ea56cf90586e04fe3b080a49eca171
#    key: string
  namespace: argo
  resourceVersion: '8481'
  uid: ca40a5bb-5d94-4c5c-88b1-06e68bc6b8b9
  fields:
    - argo-workflows-server
    - 1/1
    - 1
    - 1
    - 20s
    - argo-server
    - quay.io/argoproj/argocli:v3.5.10
    - >-
      app.kubernetes.io/instance=argo-workflows,app.kubernetes.io/name=argo-workflows-server

Expected Behavior

The workloads are not annotated with helm.sh/resource-policy: keep unless it is defined in helm charts.

Steps To Reproduce

The sample fleet yaml config can be located at https://github.com/aDisplayName/bugsamplecode/tree/main/20240808

Environment

- Architecture:
- Fleet Version: 0.9.6
- Rancher version: 2.8.4
- Cluster:
  - Provider: k3s
  - Kubernetes Version: 1.21.14, 1.28.12

Logs

No response

Anything else?

No response

manno commented 1 month ago

Here is the automated test https://github.com/rancher/fleet/pull/2875/files#diff-280ae881ae2a07de3d590310f02ebc5eb83d0c55da9b34a68647b6513387fbf1

sbulage commented 1 month ago

QA Observations:

Rancher Version: 2.9-head and Fleet version: 0.10.4-rc.1

Screenshot showing actual result.

Annotation on one of the CRD. ![Screenshot from 2024-10-07 21-16-03](https://github.com/user-attachments/assets/749973a6-4e5e-440f-9247-f39ada8265df)
manno commented 1 month ago

The annotation doesn't get removed after adding or removing {crds: keep: true/false}.

I agree, our code doesn't seem able to remove that policy. However, the example you picked has the policy already in their template folder: https://github.com/argoproj/argo-helm/blob/main/charts/argo-workflows/templates/crds/argoproj.io_cronworkflows.yaml#L7-L9

So adding it from fleet does nothing and deletion isn't implemented. Do we want to delete existing "keep" policies from the charts template, if deleteCRD is true? I think yes?

sbulage commented 1 month ago

QA Observations:

Followed below scenario's considering the deleteCRDResources values and annotation behavior.

Scenario 1


Scenario 2

Conclusion:

If we don't want to see annotation helm.sh/resource-policy: keep then we have to use the deleteCRDResource option with true value in fleet.yaml.


Screenshots:

Annotation is not present on NonCRD resources(ConfigMaps, Deployment, Service Account) and CRD. ![Screenshot from 2024-10-09 22-33-01](https://github.com/user-attachments/assets/e1059f39-536c-4376-9d88-deabb4bf502c)