kubernetes-csi / external-snapshotter

Sidecar container that watches Kubernetes Snapshot CRD objects and triggers CreateSnapshot/DeleteSnapshot against a CSI endpoint.
Apache License 2.0

VolumeGroupSnapshot across multiple CSI Drivers #1096

Closed leonardoce closed 1 month ago

leonardoce commented 1 month ago

What happened:

A user can mistakenly create a VolumeGroupSnapshot that spans volumes provisioned by multiple CSI drivers. In this example, there are three PVCs:

All three PVCs match the label cnpg.io/instanceName: cluster-example-1. The VolumeGroupSnapshotClass points to the hostpath.csi.k8s.io CSI driver:

apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
deletionPolicy: Delete
driver: hostpath.csi.k8s.io
kind: VolumeGroupSnapshotClass
metadata:
  name: csi-hostpath-groupsnapclass

When dynamically creating a VolumeGroupSnapshot that selects the previous three PVCs, the resulting object reports this error in its status:

apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  name: new-groupsnapshot-demo
spec:
  source:
    selector:
      matchLabels:
        cnpg.io/instanceName: cluster-example-1
  volumeGroupSnapshotClassName: csi-hostpath-groupsnapclass
status:
  boundVolumeGroupSnapshotContentName: groupsnapcontent-0645712e-6d43-4c7a-a216-68d1d440d970
  error:
    message: 'Failed to check and update group snapshot content: failed to take group
      snapshot of the volumes [183a3236-1bfd-11ef-9ae0-4693735c2854 1844dc96-1bfd-11ef-9ae0-4693735c2854
      95a1644f-18de-11ef-92dc-ea4f26231554]: "rpc error: code = NotFound desc = volume
      id 95a1644f-18de-11ef-92dc-ea4f26231554 does not exist in the volumes list"'
    time: "2024-05-27T08:24:51Z"
  readyToUse: false

The VolumeGroupSnapshot creation is triggered against the CSI driver named in the VolumeGroupSnapshotClass, but the HostPath CSI driver is unaware of the volume handle created by the other CSI driver. This is why the driver returns the NotFound error shown above.

What you expected to happen:

I expected the VolumeGroupSnapshot to fail with a clear error without reaching the CSI driver. It should not be possible to create a VolumeGroupSnapshot across different CSI Drivers.
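
To illustrate the check I have in mind, here is a minimal sketch in Python. It is not the snapshotter's code: `Volume`, `MixedDriverError`, and `check_single_driver` are hypothetical names, and the driver name `other.csi.example.com` and the PVC names are placeholders, since the actual second driver does not matter here. The idea is to compare the driver of every selected volume with the driver of the VolumeGroupSnapshotClass and fail before any call reaches the CSI endpoint.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    """Hypothetical, simplified view of the PV bound to a selected PVC."""
    pvc_name: str
    driver: str         # mirrors spec.csi.driver on the PersistentVolume
    volume_handle: str  # mirrors spec.csi.volumeHandle on the PersistentVolume


class MixedDriverError(ValueError):
    """Raised when a selected volume does not belong to the class's driver."""


def check_single_driver(class_driver: str, volumes: list[Volume]) -> list[str]:
    """Return the volume handles to snapshot, or raise before any CSI call."""
    foreign = [v for v in volumes if v.driver != class_driver]
    if foreign:
        details = ", ".join(f"{v.pvc_name} ({v.driver})" for v in foreign)
        raise MixedDriverError(
            f"VolumeGroupSnapshotClass driver {class_driver!r} does not match "
            f"the driver of: {details}"
        )
    return [v.volume_handle for v in volumes]


if __name__ == "__main__":
    # Volume handles are taken from the error message above; PVC names and
    # the second driver name are placeholders (assumptions).
    selected = [
        Volume("pvc-a", "hostpath.csi.k8s.io", "183a3236-1bfd-11ef-9ae0-4693735c2854"),
        Volume("pvc-b", "hostpath.csi.k8s.io", "1844dc96-1bfd-11ef-9ae0-4693735c2854"),
        Volume("pvc-c", "other.csi.example.com", "95a1644f-18de-11ef-92dc-ea4f26231554"),
    ]
    try:
        check_single_driver("hostpath.csi.k8s.io", selected)
    except MixedDriverError as err:
        print(f"rejected before reaching the CSI driver: {err}")
```

With a check like this, the error in the VolumeGroupSnapshot status would name the offending PVC and its driver instead of surfacing the driver's NotFound response.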