NetApp / trident

Storage orchestrator for containers
Apache License 2.0

Trident fails to delete child FlexClone of a FlexClone #878

Open rabin-io opened 11 months ago

rabin-io commented 11 months ago

Describe the bug When using Trident as the backend for virtual machines with KubeVirt, if one restores a volume of a VM and later deletes the VM, we are left with a FlexClone without its parent, which requires manual intervention with the CLI to resolve.

Environment

To Reproduce

  1. Start from a clean OCP install with AWS FSx as the backend
  2. Installing CNV/KubeVirt creates 6 volumes for the boot sources of the default templates
  3. Create a VM from a template → this creates a FlexClone from the source volume of the "golden image"
  4. Create a VM snapshot
  5. Restore the snapshot → this creates a new FlexClone from the FlexClone of step 3. (At this stage we see that the FlexClone from step 3 is deleted.)
  6. Delete the VM → this deletes the VM and the PV/PVC for the 2nd FlexClone. What we see in the backend is that the last FlexClone is left in an offline state and can't be deleted without doing a split.

When running this as part of our testing of OCP on AWS with FSx, we see this behavior, and later on it blocks the deprovisioning of the FSx storage, as you can't delete the volumes from AWS.
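
For reference, a rough sketch of the manual ONTAP CLI cleanup the report alludes to (the SVM name svm1 and the clone volume name pvc_xxxx are placeholders, not values from the report): split the orphaned FlexClone so it becomes an independent FlexVol, then delete it.

# identify the leftover clone and its parent
volume clone show -vserver svm1
# split the clone away from its parent
volume clone split start -vserver svm1 -flexclone pvc_xxxx
volume clone split show -vserver svm1        # wait until the split job completes
# once split, the volume can be taken offline and deleted
volume offline -vserver svm1 -volume pvc_xxxx
volume delete -vserver svm1 -volume pvc_xxxx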

Expected behavior All resources should be deleted when the VM is deleted.

Additional context

akalenyu commented 11 months ago

The issue seems to be reducible to the following sequence of actions (strictly on k8s entities):

$ oc create -f pvc.yaml 
persistentvolumeclaim/simple-pvc created
$ oc get pvc
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
simple-pvc   Bound    pvc-acefa31e-61f4-4bef-9e82-daf30a4d85c0   1Gi        RWX            trident-csi-fsx   3s
$ oc create -f snap.yaml 
volumesnapshot.snapshot.storage.k8s.io/snapshot created
$ oc get volumesnapshot
NAME       READYTOUSE   SOURCEPVC    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
snapshot   true         simple-pvc                           296Ki         csi-snapclass   snapcontent-7d00e584-5ce6-40f2-b1f0-40f254845e3d   3s             3s
$ oc delete pvc simple-pvc 
persistentvolumeclaim "simple-pvc" deleted
$ oc create -f restore.yaml 
persistentvolumeclaim/restore-pvc-1 created
$ oc get pvc
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
restore-pvc-1   Bound    pvc-4d92a2ea-02a7-404d-9f9d-054c7dd8361b   1Gi        RWX            trident-csi-fsx   2s
$ oc delete pvc restore-pvc-1 
persistentvolumeclaim "restore-pvc-1" deleted
$ oc delete volumesnapshot snapshot 
volumesnapshot.snapshot.storage.k8s.io "snapshot" deleted
# Doesn't converge

Where the manifests are simply


apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: simple-pvc
spec:
  storageClassName: trident-csi-fsx
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: snapshot
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: simple-pvc

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc-1
spec:
  dataSource:
    name: snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  storageClassName: trident-csi-fsx
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
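
To see what is left dangling after the delete that doesn't converge, something like the following can help (a hypothetical diagnostic sketch, assuming Trident is installed in the trident namespace):

$ oc get volumesnapshot,volumesnapshotcontent         # check whether the snapshot content is stuck deleting
$ oc describe volumesnapshotcontent snapcontent-7d00e584-5ce6-40f2-b1f0-40f254845e3d
$ oc get tridentsnapshots,tridentvolumes -n trident   # Trident's own CRs backing the snapshot and clone
$ tridentctl -n trident logs                          # look for the backend delete error on the FlexClone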

uppuluri123 commented 8 months ago

Following this as an issue to investigate.

akalenyu commented 7 months ago

If anyone is interested, here is the reproducer in KubeVirt terms (I expected the reduced reproducer above to be of more interest here):

$ cat dv.yaml 
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: simple-dv
spec:
  source:
    registry:
      pullMethod: node
      url: docker://quay.io/kubevirt/fedora-with-test-tooling-container-disk:v0.53.2
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
$ cat vm.yaml 
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: simple-vm
  namespace: default
spec:
  running: true
  template:
    metadata:
      labels: {kubevirt.io/domain: simple-vm,
        kubevirt.io/vm: simple-vm}
    spec:
      domain:
        devices:
          disks:
          - disk: {bus: virtio}
            name: dv-disk
          - disk: {bus: virtio}
            name: cloudinitdisk
        resources:
          requests: {memory: 2048M}
      volumes:
      - dataVolume: {name: simple-dv}
        name: dv-disk
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            password: fedora
            chpasswd: { expire: False }
        name: cloudinitdisk
$ cat vmsnap.yaml 
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineSnapshot
metadata:
  name: snap-larry
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: simple-vm
$ cat vmrestore.yaml 
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineRestore
metadata:
  name: restore-larry
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: simple-vm
  virtualMachineSnapshotName: snap-larry
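
For completeness, a rough order of operations to drive these manifests through steps 3-6 of the original report (a sketch, not taken from the thread; the VM may need to be stopped, e.g. with virtctl stop simple-vm, before the restore is processed):

$ oc create -f dv.yaml -f vm.yaml    # import the disk and start the VM
$ oc create -f vmsnap.yaml           # take the VirtualMachineSnapshot
$ oc create -f vmrestore.yaml        # the restore creates a new PVC from the snapshot (a FlexClone on the backend)
$ oc delete vm simple-vm             # delete the VM
$ oc delete dv simple-dv             # delete the DataVolume/PVC; the backing FlexClone is left offline on the backend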