vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0
8.73k stars 1.41k forks

Data Mover - preserve local snapshot when snapshot data movement is enabled #6240

Open Lyndon-Li opened 1 year ago

Lyndon-Li commented 1 year ago

Since restoring from a local snapshot is faster than restoring through data movement, it would be useful to preserve several local snapshots when snapshot data movement is enabled. The number of preserved snapshots should be configurable by users. Moreover, if the data movement fails, the local snapshot may also be kept.

There will be two configurable options:

There are some details to cover in order to manage the local snapshots together with the remotely moved data. Let's see how we can implement this.

sseago commented 1 year ago

We should make sure that if we want to keep the last X local snapshots, "0" is an allowed value for X. Some users would rather not keep any artifacts in-cluster for things that are fully restorable from backup, since this increases storage costs.

Lyndon-Li commented 1 year ago

@sseago Sure, will do this.

For most on-premises storage systems, preserving local snapshots is either impossible or too costly. This requirement comes primarily from users who run clusters in a public cloud and who also want to move data across cloud providers, or who require advanced data retention/management that local snapshots cannot provide.

The major benefit of keeping local snapshots is that restoring from them is much faster.

Regarding cost in this kind of single-public-cloud-provider usage, comparing local snapshots with data mover + a simple object store is an interesting topic:

reasonerjt commented 1 year ago

We should make sure that if we want to keep the last X local snapshots, "0" is an allowed value for X. Some users would rather not keep any artifacts in-cluster for things that are fully restorable from backup, since this increases storage costs.

This is the current behavior in v1.12.0.

How to expose the setting, i.e., whether it is per backup or per {PV, storageclass...}, depends on the decision in #6638.

But we may start a design without deciding how to expose the setting; for example, a PoC could read the number of local snapshots to keep from an environment variable.
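A PoC along those lines could look roughly like the sketch below. The variable name `LOCAL_SNAPSHOTS_TO_KEEP` and the function `ParseKeepCount` are hypothetical and chosen only for illustration; Velero does not define them. Defaulting to 0 when the variable is unset is also an assumption, picked to match the v1.12.0 behavior mentioned above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// keepEnvVar is a hypothetical variable name used only in this sketch.
const keepEnvVar = "LOCAL_SNAPSHOTS_TO_KEEP"

// ParseKeepCount converts the raw environment value into a retention count.
// An empty value falls back to 0 (keep no local snapshots); this default is
// an assumption. "0" is accepted explicitly; negative or non-numeric values
// are rejected with an error.
func ParseKeepCount(raw string) (int, error) {
	if raw == "" {
		return 0, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid %s value %q: %w", keepEnvVar, raw, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("invalid %s value %d: must be >= 0", keepEnvVar, n)
	}
	return n, nil
}

func main() {
	keep, err := ParseKeepCount(os.Getenv(keepEnvVar))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("preserving %d local snapshot(s)\n", keep)
}
```

Keeping the parsing in a separate function from the `os.Getenv` call means the same validation could be reused later if the setting moves to a per-backup or per-storage-class option.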