vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

Implement shallow copy behavior for backup of CSI volumes with copy-on-restore behavior #7927

Closed msfrucht closed 3 days ago

msfrucht commented 3 days ago

Shallow copy backup implementation for supported provisioners

Some provisioners, such as ceph-csi-cephfs and IBM Spectrum Scale, implement "shallow copy": when a PVC is created from a VolumeSnapshot source with accessModes set to ReadOnlyMany, the resulting PVC is a read-only reference to the snapshot's internal storage rather than a new volume. This avoids the performance and storage penalty of copy-on-restore, which makes it useful for inspection and backup.

In supported CSI driver versions, the behavior is triggered by checking the storage class parameters and setting the accessModes of the from-snapshot PVC to ReadOnlyMany.
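The trigger logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `StorageClass` and `PVCSpec` types are simplified stand-ins for the Kubernetes API objects, and the provisioner name `example.csi.vendor.io` and the `shallowCopy` parameter are hypothetical placeholders. Each real driver defines its own parameters and version requirements.

```go
package main

import "fmt"

// AccessMode mirrors the string values of the Kubernetes PVC access modes.
type AccessMode string

const (
	ReadWriteOnce AccessMode = "ReadWriteOnce"
	ReadOnlyMany  AccessMode = "ReadOnlyMany"
)

// StorageClass is a simplified stand-in for storage.k8s.io/v1 StorageClass.
type StorageClass struct {
	Provisioner string
	Parameters  map[string]string
}

// PVCSpec is a simplified stand-in for the spec of a PVC created from a snapshot.
type PVCSpec struct {
	AccessModes        []AccessMode
	SnapshotDataSource string // name of the VolumeSnapshot, empty if none
}

// ShallowCopyRule is a hypothetical description of the storage class
// parameter a provisioner must carry for shallow copy to be usable.
type ShallowCopyRule struct {
	ParamKey   string
	ParamValue string
}

// SupportsShallowCopy reports whether the storage class matches a known rule.
func SupportsShallowCopy(sc StorageClass, rules map[string]ShallowCopyRule) bool {
	rule, ok := rules[sc.Provisioner]
	if !ok {
		return false
	}
	return sc.Parameters[rule.ParamKey] == rule.ParamValue
}

// ApplyShallowCopy returns a copy of the PVC spec with accessModes replaced by
// ReadOnlyMany when the PVC comes from a snapshot and the storage class
// qualifies. Otherwise the spec is returned unchanged.
func ApplyShallowCopy(pvc PVCSpec, sc StorageClass, rules map[string]ShallowCopyRule) PVCSpec {
	if pvc.SnapshotDataSource == "" || !SupportsShallowCopy(sc, rules) {
		return pvc
	}
	out := pvc
	out.AccessModes = []AccessMode{ReadOnlyMany}
	return out
}

func main() {
	// Hypothetical rule table; real drivers use their own names and parameters.
	rules := map[string]ShallowCopyRule{
		"example.csi.vendor.io": {ParamKey: "shallowCopy", ParamValue: "true"},
	}
	sc := StorageClass{
		Provisioner: "example.csi.vendor.io",
		Parameters:  map[string]string{"shallowCopy": "true"},
	}
	pvc := PVCSpec{
		AccessModes:        []AccessMode{ReadWriteOnce},
		SnapshotDataSource: "backup-snapshot",
	}
	fmt.Println(ApplyShallowCopy(pvc, sc, rules).AccessModes)
}
```

Returning a modified copy rather than mutating the input keeps the original access modes available, for example if the backup needs to fall back to a regular from-snapshot PVC.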

This behavior is not without cost: CephFS shallow-copy PVCs cannot themselves be snapshotted. Since the shallow-copy PVC is removed after the backup completes, this does not block the implementation. See https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/cephfs-snapshot-shallow-ro-vol.md

Restrictions for IBM Spectrum Scale (https://www.ibm.com/docs/en/scalecsi/2.11?topic=vs-create-shallow-copy-volume-from-source-snapshot-read-only):

- Snapshot and shallow copy volume must be from file systems that belong to the same cluster.
- Restoring snapshot (shallow copy volume) to lightweight PVC is not supported.
- Restoring snapshot (shallow copy volume) across file systems from different IBM Storage Scale clusters is not supported.
- Restoring snapshot (shallow copy volume) between version 1 and version 2 storageClass (or between version 2 and version 1 storageClass) is not supported.
- Restoring snapshot (shallow copy volume) is not supported for static snapshots or snapshots that are made from versions previous to CSI 2.5.
- Volume expansion is not supported for shallow copy volume.
- Volume stat is not supported for shallow copy volume.
- Shallow copy volume is not supported for fsgroup and subdir.
- Shallow copy volume is not supported where Security-Enhanced Linux (SELinux) is enabled.

Signed-off-by: MICHAEL S FRUCHTMAN <msfrucht@us.ibm.com>

Issues Resolved

A number of storage classes implement copy-on-restore behavior when creating a PVC from a VolumeSnapshot. For PVCs with large file counts, this behavior is extremely costly.

In extreme cases, such as large PVCs with multi-million file counts, shallow copy is essential to prevent the backup from stalling for hours or days.

Fixes #(issue)

Please indicate you've done the following:

msfrucht commented 3 days ago

Not ready for delivery.