Closed Naveen-Kamagani closed 3 weeks ago
Can anyone help with this issue?
It looks like you have almost 900 snapshots to take. While most of the snapshot and datamover work can be done in parallel (spread across the nodes that the associated pods are running on), each snapshot has an initial step that must be done synchronously, which takes approximately 7-10 seconds. This is the bulk of your 2 hours.
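To make the arithmetic behind that estimate concrete, here is a minimal sketch. It assumes that every snapshot pays the 7-10 second synchronous cost exactly once and that these costs never overlap; the function name `serialized_hours` is hypothetical and is not part of Velero.

```python
# Rough estimate of the serialized portion of a backup.
# Assumption: each snapshot pays the synchronous start-up cost once,
# and those costs are strictly sequential (no overlap).

def serialized_hours(snapshot_count: int, seconds_per_snapshot: float) -> float:
    """Return the total synchronous start-up time in hours."""
    return snapshot_count * seconds_per_snapshot / 3600


SNAPSHOTS = 900  # "almost 900" snapshots, as mentioned in this thread

low = serialized_hours(SNAPSHOTS, 7)
high = serialized_hours(SNAPSHOTS, 10)
print(f"{low:.2f} to {high:.2f} hours")  # 1.75 to 2.50 hours
```

The result brackets the roughly 2 hours reported for this backup, which is consistent with the synchronous step dominating the run time.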
We are working on a feature for the future which will allow the entire snapshot/pv backup process to happen in parallel via several controller threads, but that is not available today. Once that enhancement is implemented, you will see a significant reduction in backup times for this use case.
@sseago Backups are triggered using the Velero schedule. How do we know the snapshots triggered daily are incremental by default?
@Naveen-Kamagani Whether a CSI snapshot is incremental is determined by the CSI driver, not by Velero. If you're using data movement (although from the above backup configuration, it looks like you are not), backup content is always stored incrementally in the Backup Storage Location if there is a prior backup for that volume.
@Naveen-Kamagani But note that even with incremental backups, the 7-10 seconds at the beginning of each snapshot, before we can move on to the next, will still be there. So the 2 hour time for this backup will only improve once the parallel backup design is implemented.
@sseago Is there a design in place for the CSI driver to take snapshot backups of EBS volumes in parallel? Could you please let us know how many months it will take to implement this feature?
@Naveen-Kamagani There is an open design PR for this: https://github.com/vmware-tanzu/velero/issues/7474
My current expectation is that Phase 1 will be implemented in Velero 1.15, and Phase 2 (which actually puts parallel item backup in place) in Velero 1.16. Velero 1.14 was just released today, so we're talking two releases in the future. There's no current release date for 1.16, but I imagine it will be during the first half of 2025 at some point. Maybe the first quarter, but that's not certain.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands.
This issue was closed because it has been stalled for 14 days with no activity.
We use the OADP operator to back up Kubernetes resources and EBS volumes using the CSI snapshot feature. We have created a schedule to trigger backups daily, but each backup runs for nearly 2 hours. Is there any way we can reduce the backup run time to around 30 minutes, or at least under 1 hour? Is there any way to trigger the CSI snapshots in parallel and complete them quickly so that the backup runs faster? We are looking for different options.
DataProtectionApplication -
Scheduled Backup -
VolumeSnapshotLocation -
VolumeSnapshotClass -
BackupStorageLocation -
We do not want to use restic. Please suggest a solution to improve efficiency, because there is another cluster with 300 namespaces, each of which will have 12 volumes. In the shared example we have 40 namespaces and the backup runs for nearly 4 hours; for the 300-namespace cluster, the backup runs nearly 20 hours.
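As a rough sanity check on the larger cluster, the volume count and the serialized snapshot start-up time can be estimated as below. This is only a sketch: it assumes the 7-10 second synchronous step per snapshot mentioned in the replies applies to every volume and ignores all other backup work. The function name `projected_hours` is hypothetical.

```python
# Projection for the larger cluster described above.
# Assumption: one CSI snapshot per volume, each paying a 7-10 second
# synchronous start-up cost, executed strictly one after another.

NAMESPACES = 300
VOLUMES_PER_NAMESPACE = 12


def projected_hours(namespaces: int, volumes_per_namespace: int,
                    seconds_per_snapshot: float) -> float:
    """Return the serialized snapshot start-up time in hours."""
    total_volumes = namespaces * volumes_per_namespace
    return total_volumes * seconds_per_snapshot / 3600


total = NAMESPACES * VOLUMES_PER_NAMESPACE
low = projected_hours(NAMESPACES, VOLUMES_PER_NAMESPACE, 7)
high = projected_hours(NAMESPACES, VOLUMES_PER_NAMESPACE, 10)
print(f"{total} volumes: {low:.1f} to {high:.1f} hours")  # 3600 volumes: 7.0 to 10.0 hours
```

Under these assumptions the synchronous step alone would account for several hours on that cluster, so a 30 to 60 minute target is unlikely to be reachable while the snapshots are started one at a time.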