rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0

Bundle containing PersistentVolumeClaim that uses dynamic provisioning is broken due to changing volumeName #1159

Open DillonN opened 1 year ago

DillonN commented 1 year ago

Is there an existing issue for this?

Current Behavior

After the volume is dynamically provisioned, any future change to the bundle results in an error, because Fleet keeps trying to replace the generated volumeName with an empty string:

failed to replace object: PersistentVolumeClaim "<pvc-name>" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
core.PersistentVolumeClaimSpec{
    AccessModes:      {"ReadWriteOnce"},
    Selector:         nil,
    Resources:        {Requests: {s"storage": {i: {...}, s: "50Mi", Format: "BinarySI"}}},
-   VolumeName:       "pvc-e05dd9ef-af4e-4959-9b07-7ed903d9fc75",
+   VolumeName:       "",
    StorageClassName: &"harvester",
    VolumeMode:       &"Filesystem",
    ... // 2 identical fields
}
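
For context, the claim as it sits in Git looks roughly like this (a minimal sketch; only the storage class, size, access mode and volume mode come from the diff above, the rest is assumed), with volumeName deliberately left unset so the provisioner can bind a volume:

  # Illustrative only: a dynamically provisioned PVC as committed to the bundle's Git repo.
  # volumeName is omitted on purpose; the provisioner fills it in after binding, and that
  # generated value is what Fleet then tries to reset to "".
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: example-pvc          # stands in for <pvc-name>
  spec:
    accessModes:
      - ReadWriteOnce
    storageClassName: harvester
    volumeMode: Filesystem
    resources:
      requests:
        storage: 50Mi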

I tried following this guide and added this diff entry (under bundle spec) but it did nothing:

  diff:
    comparePatches:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      name: <pvc-name>
      operations:
      - op: remove
        path: /spec/volumeName
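
If I understand the comparePatches options correctly, jsonPointers is the other accepted form, so a variant I may also try is ignoring the field entirely rather than removing it (same placeholder PVC name, purely a sketch):

  diff:
    comparePatches:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      name: <pvc-name>
      jsonPointers:
      - /spec/volumeName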

Expected Behavior

Fleet should support dynamically provisioned volumes, even if it takes a diff workaround. But I've searched everywhere and can't find any mention of anyone using Fleet with dynamic volumes. Even the force option does not help.

Steps To Reproduce

  1. Create a bundle with a dynamically provisioned PersistentVolumeClaim (a sketch of such a bundle follows this list)
  2. Wait until the claim is bound to a volume
  3. Try updating the bundle
  4. See that the error above is thrown and the bundle never updates again
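
For completeness, this is roughly how the repro bundle gets registered; the repo URL, path and names are placeholders, and the PVC manifest is the one sketched earlier:

  # Hypothetical GitRepo pointing Fleet at a directory that contains only the PVC manifest.
  apiVersion: fleet.cattle.io/v1alpha1
  kind: GitRepo
  metadata:
    name: pvc-repro
    namespace: fleet-default
  spec:
    repo: https://example.com/pvc-repro.git
    paths:
      - pvc-bundle           # contains fleet.yaml and the PVC manifest from above
    targets:
      - clusterSelector: {}  # match all clusters, just for the repro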

Environment

- Architecture: amd64
- Fleet Version: 0.5.0
- Cluster:
  - Provider: RKE2 on Rancher/Harvester stack
  - Options: 3 etcd+control nodes, 2 workers, using Harvester's storage provider
  - Kubernetes Version: 1.24.7+rke2r1

Logs

No response

Anything else?

The only workaround I've found is to copy the generated PV name back into the PVC manifest in the bundle's Git repo, but I don't think that's a sufficient long-term solution.
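
Concretely, that workaround looks like this sketch, with the volume name copied from the live claim (here the one from the diff output above; names are otherwise placeholders):

  # Workaround sketch only: pin the provisioner-generated volume name in Git so Fleet's
  # comparison no longer sees a difference.
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: example-pvc
  spec:
    accessModes:
      - ReadWriteOnce
    storageClassName: harvester
    volumeMode: Filesystem
    resources:
      requests:
        storage: 50Mi
    volumeName: pvc-e05dd9ef-af4e-4959-9b07-7ed903d9fc75   # copied from the live cluster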

snovak7 commented 1 year ago

I think I stumbled on this one too

I define a PersistentVolumeClaim, and when the pod dies my PVC gets deleted; after some time a new, empty one is generated from scratch, and the cycle continues.

This only happens with the plain YAML approach (Kustomization). I haven't seen the problem with Helm charts.

For now I create the PVCs manually as a workaround; the other option would be to write Helm charts.

jrdelarosa8 commented 2 months ago

I believe we are running into this issue as well.

We are creating a resource (type: AWSCluster) using Cluster API. Cluster API creates several resources, one of which is a network load balancer. Once the NLB is created, the AWSCluster object is patched with the .spec.controlPlaneEndpoint field.

It's at this point that Fleet attempts to overwrite the resource with the default (zero) values. Before finding this issue I tried what @DillonN did, which was to use a diff configuration like the one included below.

Unfortunately this does not prevent Fleet from trying to overwrite the resource and my bundle stays in the ErrApplied state.

diff:
  comparePatches:
    - kind: AWSCluster
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
      namespace: <namespace>
      name: <resource name>
      operations:
      - {"op":"remove", "path":"/spec/controlPlaneEndpoint"}
      jsonPointers:
      - "/spec/controlPlaneEndpoint"