kubernetes-csi / external-snapshotter

Sidecar container that watches Kubernetes Snapshot CRD objects and triggers CreateSnapshot/DeleteSnapshot against a CSI endpoint.
Apache License 2.0

flag for not deleting individual snapshots when a volumegroupsnapshot is deleted #1126

Open jmccormick2001 opened 2 months ago

jmccormick2001 commented 2 months ago

Is your feature request related to a problem?/Why is this needed

When I delete a volume group snapshot and DeleteVolumeGroupSnapshot() is called, the snapshotter also calls ControllerDeleteSnapshot() for each snapshot belonging to the volume group. On some storage devices, removing a volume group removes the associated snapshots atomically, without having to remove each snapshot individually.

Describe the solution you'd like in detail

I would like some way to turn off the default behavior of calling ControllerDeleteSnapshot() for each snapshot when DeleteVolumeGroupSnapshot() is invoked. That way, our storage device can be responsible for removing each associated snapshot.

Describe alternatives you've considered

One alternative is to have our storage device API call specify that the storage device should not clean up or remove the dependent snapshots.

Additional context

xing-yang commented 1 month ago

We need to discuss more about this as this would require a CSI spec change.

cc @bswartz

xing-yang commented 1 month ago

@jmccormick2001 Can you please provide logs? DeleteVolumeGroupSnapshot should not cause ControllerDeleteSnapshot() to really delete the individual snapshot. It should be skipped quickly. If not, there is a bug that we need to fix.

jmccormick2001 commented 1 month ago

Sure, attached is the driver log that shows the calls to ControllerDeleteSnapshot(). Here is how I'm recreating this:

  1. create VolumeGroupSnapshotClass
  2. create 2 iSCSI volumes/PVCs
  3. apply a label 'mygroup' to the PVCs
  4. create a VolumeGroupSnapshot that selects the 2 PVCs using the label

driver.log

That all works as expected: I see the snapshots on our storage device as planned. Then I delete the VolumeGroupSnapshot to trigger the deletion. The delete of the VolumeGroupSnapshot itself works as expected, but the driver log shows ControllerDeleteSnapshot() being called twice, once for each PVC/volume.

I suspect this is by design, but we have a feature on the storage device that, in a single call, deletes the entire snapshot group atomically instead of deleting the snapshots in the group one by one.

xing-yang commented 1 month ago

This PR was supposed to skip the CSI driver call that deletes individual snapshots, but I think we missed it: https://github.com/kubernetes-csi/external-snapshotter/pull/972. We can add a check in syncContent() so that it does not call the CSI driver if the VolumeSnapshotContent status has a VolumeGroupSnapshotHandle: https://github.com/kubernetes-csi/external-snapshotter/blob/v8.1.0/pkg/sidecar-controller/snapshot_controller.go#L63

https://github.com/kubernetes/enhancements/blob/d9fc500581dddf2debeb168d6bfef62a68c35a70/keps/sig-storage/3476-volume-group-snapshot/README.md#delete-volumegroupsnapshot
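
A rough sketch of what such a check could look like is below. It is not the actual syncContent() code: the `Content` and `ContentStatus` types, the `belongsToGroup` and `syncDelete` helpers, and the `deleteFn` callback are all hypothetical stand-ins, simplified so the example compiles on its own. It only illustrates the idea of skipping the per-snapshot driver call when the content's status carries a group snapshot handle.

```go
package main

import "fmt"

// ContentStatus is a simplified, hypothetical stand-in for the status of a
// VolumeSnapshotContent. Only the two handles relevant here are modeled.
type ContentStatus struct {
	SnapshotHandle            *string
	VolumeGroupSnapshotHandle *string
}

// Content is a simplified, hypothetical stand-in for a VolumeSnapshotContent.
type Content struct {
	Name              string
	DeletionRequested bool
	Status            *ContentStatus
}

// belongsToGroup reports whether the content's status records a non-empty
// group snapshot handle, i.e. the snapshot was created as part of a group.
func belongsToGroup(c *Content) bool {
	return c != nil && c.Status != nil &&
		c.Status.VolumeGroupSnapshotHandle != nil &&
		*c.Status.VolumeGroupSnapshotHandle != ""
}

// syncDelete decides whether to call the driver for a content that is being
// deleted. deleteFn is an assumed placeholder for the CSI snapshot-delete
// call. The first return value reports whether the driver was called.
func syncDelete(c *Content, deleteFn func(handle string) error) (bool, error) {
	if c == nil || !c.DeletionRequested {
		return false, nil
	}
	// Proposed check: group members are left to the group-level delete.
	if belongsToGroup(c) {
		return false, nil
	}
	if c.Status == nil || c.Status.SnapshotHandle == nil {
		return false, nil
	}
	return true, deleteFn(*c.Status.SnapshotHandle)
}

func strPtr(s string) *string { return &s }

func main() {
	calls := 0
	deleteFn := func(handle string) error {
		calls++
		return nil
	}

	grouped := &Content{
		Name:              "member-of-group",
		DeletionRequested: true,
		Status: &ContentStatus{
			SnapshotHandle:            strPtr("snap-1"),
			VolumeGroupSnapshotHandle: strPtr("group-1"),
		},
	}
	standalone := &Content{
		Name:              "standalone",
		DeletionRequested: true,
		Status:            &ContentStatus{SnapshotHandle: strPtr("snap-2")},
	}

	for _, c := range []*Content{grouped, standalone} {
		called, err := syncDelete(c, deleteFn)
		fmt.Printf("%s: driver called=%v err=%v\n", c.Name, called, err)
	}
	fmt.Println("total driver calls:", calls)
}
```

With a check like this, only the standalone snapshot would reach the driver; snapshots that carry a group handle would rely on the group-level delete to remove them on the storage side.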

manishym commented 1 month ago

/assign

mowangdk commented 2 weeks ago

We also encountered this problem when we verified GroupVolumeSnapshot, and we are looking forward to this being resolved.