vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

Delta backups in velero #6321

Open manojbvn4u opened 1 year ago

manojbvn4u commented 1 year ago

Hi Team,

Does Velero support delta backups of PersistentVolumes, or does every scheduled backup capture the full current state?

For example:

If I have a MongoDB instance with 1 TB of data, does every backup have to copy the full 1 TB, or can it copy only the delta of new data added since the previous backup?

qiuming-best commented 1 year ago

@manojbvn4u Velero supports delta backups of PVs. For file-system-level backup (using Restic or Kopia), it first checks the backups already in the repository to find parent backups. For snapshots, I think the vendor generally supports incremental backup.
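To illustrate why a repository-based backup only has to upload new data after the first run, here is a minimal sketch of a content-addressed store. It is a simplification under stated assumptions, not Velero, Kopia, or Restic code: the `Repository` class and the file names are hypothetical, and it deduplicates whole files by hash, whereas the real tools work at a finer, chunk-level granularity.

```python
import hashlib


class Repository:
    """Toy content-addressed store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self.blobs = {}

    def backup(self, files):
        """Record every file, uploading only content the repository lacks.

        `files` maps path -> bytes. Returns (snapshot, uploaded_bytes), where
        the snapshot maps path -> digest and describes the full file set.
        """
        snapshot = {}
        uploaded = 0
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.blobs:  # new or changed content only
                self.blobs[digest] = data
                uploaded += len(data)
            snapshot[path] = digest
        return snapshot, uploaded


repo = Repository()
day1 = {"db/collection-0": b"a" * 1000, "db/collection-1": b"b" * 1000}
snap1, sent1 = repo.backup(day1)

# Day 2: one file is unchanged, one is modified, one is new.
day2 = {**day1, "db/collection-1": b"c" * 1000, "db/journal": b"d" * 10}
snap2, sent2 = repo.backup(day2)

print(sent1, sent2)  # prints: 2000 1010
```

Each snapshot still lists every file, so either one can be restored on its own, but the second run uploads only the changed and new content.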

manojbvn4u commented 1 year ago

Hi @qiuming-best

Thanks for answering the question. To clarify, let me present our use case: we have MongoDB (1 primary and 2 replicas) running on K8s, and we want to do daily backups.

  1. Can we do a delta backup daily and a full backup weekly?
  2. Can we back up the PV of only one replica? I was not able to find a way to filter on a specific PV when taking backups.

Is there documentation on how to do both of the above?

qiuming-best commented 1 year ago

  1. I think you may not need a full backup at all.

    On Velero's side, every backup covers all files, so logically each one is a full backup. The underlying backup mechanism (Kopia, Restic, or vendor snapshot) ensures data integrity and only processes the files that changed since the previous run, since all the unchanged files are already in the repository.

    But if you really want a full backup, you could create a new BSL (BackupStorageLocation) with a new bucket and run a backup against that BSL; it will be a full backup.

  2. Currently, we cannot back up only one of the replica PVs. We may support this in the future by letting the user specify the primary PV and the replica PVs and backing up only one of them. For now, you can use the opt-in/opt-out approach to choose which pod volumes to back up or ignore.
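As a rough model of the opt-in/opt-out filtering mentioned above, the sketch below picks the volumes of a pod that would be included in a file-system backup. The two annotation keys are the ones Velero documents for this purpose; everything else is an assumption for illustration: `volumes_to_back_up` and its `default_all` parameter are hypothetical names, the pod dictionaries are made up, and Velero's real logic also skips certain volume types.

```python
OPT_IN = "backup.velero.io/backup-volumes"            # comma-separated names to include
OPT_OUT = "backup.velero.io/backup-volumes-excludes"  # comma-separated names to skip


def volumes_to_back_up(pod, default_all=False):
    """Return the volume names of `pod` selected for file-system backup.

    default_all=False models opt-in: only annotated volumes are selected.
    default_all=True models opt-out: all volumes except the excluded ones.
    """
    annotations = pod.get("metadata", {}).get("annotations", {})
    names = [v["name"] for v in pod.get("spec", {}).get("volumes", [])]

    def parse(key):
        raw = annotations.get(key, "")
        return {n.strip() for n in raw.split(",") if n.strip()}

    if default_all:
        excluded = parse(OPT_OUT)
        return [n for n in names if n not in excluded]
    included = parse(OPT_IN)
    return [n for n in names if n in included]


# Hypothetical pods: only mongo-0 carries the opt-in annotation.
mongo_0 = {
    "metadata": {"name": "mongo-0", "annotations": {OPT_IN: "data"}},
    "spec": {"volumes": [{"name": "data"}, {"name": "tmp"}]},
}
mongo_1 = {
    "metadata": {"name": "mongo-1"},
    "spec": {"volumes": [{"name": "data"}, {"name": "tmp"}]},
}

selected_0 = volumes_to_back_up(mongo_0)
selected_1 = volumes_to_back_up(mongo_1)
print(selected_0, selected_1)  # prints: ['data'] []
```

The filter works per pod volume, not per PV, which is why it is only a workaround for the single-replica case.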

Lyndon-Li commented 1 year ago

> Can we do a delta backup daily and a full backup on a weekly basis

This falls under the requirement to support GFS (Grandfather-Father-Son) retention policies, which Velero does not support at present. It needs a careful design, since it concerns not only how much data is read but also how long each backup is preserved. Let me keep this issue open; we will add this support in a future release.
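For readers unfamiliar with the term, here is a minimal sketch of what a GFS retention rule decides. It is illustrative only and not a Velero feature or design: the function `gfs_keep`, its parameters, and the daily backup schedule are all hypothetical. It keeps the newest few daily backups, the newest backup of each recent ISO week, and the newest backup of each recent month.

```python
from datetime import date, timedelta


def gfs_keep(backup_dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates a simple GFS policy would retain."""
    ordered = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set(ordered[:daily])  # "son": the most recent daily backups

    def newest_per_bucket(bucket_of, limit):
        seen, chosen = set(), []
        for d in ordered:
            if len(chosen) >= limit:
                break
            bucket = bucket_of(d)
            if bucket not in seen:  # first hit is the newest in that bucket
                seen.add(bucket)
                chosen.append(d)
        return chosen

    # "father": newest backup in each of the most recent ISO weeks
    keep.update(newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly))
    # "grandfather": newest backup in each of the most recent months
    keep.update(newest_per_bucket(lambda d: (d.year, d.month), monthly))
    return keep


# Hypothetical schedule: one backup per day from 2024-01-01 to 2024-03-31.
start = date(2024, 1, 1)
backups = [start + timedelta(days=i) for i in range(91)]
kept = gfs_keep(backups, daily=7, weekly=4, monthly=3)
print(sorted(kept))
```

Every backup outside the returned set would be eligible for deletion, which is the "how long a backup is preserved" half of the problem.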