vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0

Large PVCs are canceled #8454

Open filipe-silva-magalhaes-alb opened 17 hours ago

filipe-silva-magalhaes-alb commented 17 hours ago

What steps did you take and what happened: The data uploads for the largest PVCs failed.

velero create backup velero-schedule-s3-20241125000006 --resource-policies-configmap velero-efs-resourcepolicy --snapshot-move-data

kubectl get configmap -n velero velero-efs-resourcepolicy -o yaml

apiVersion: v1
data:
  efs-resourcepolicy.yaml: |
    version: v1
    volumePolicies:
    - conditions:
        csi:
          driver: efs.csi.aws.com
      action:
        type: skip
kind: ConfigMap
metadata:
  name: velero-efs-resourcepolicy
  namespace: velero

What did you expect to happen: Backup runs without problems.

The following information will help us better understand what's going on:

velero debug --backup velero-schedule-s3-20241125000006 bundle-2024-11-25-14-30-47.tar.gz

Parameters of backup:

csiSnapshotTimeout: 10m0s
itemOperationTimeout: 6h0m0s
uploaderConfig:
  parallelFilesUpload: 2

Parameters of daemonset (running in privileged mode):

  - --features=EnableCSI 
  - --data-mover-prepare-timeout=190m 

Anything else you would like to add:

Environment:


Lyndon-Li commented 5 hours ago

By default, Velero's data mover backup has a 4-hour timeout for each volume. If that is not enough, you can configure default-item-operation-timeout in the Velero server parameters. Meanwhile, if you want to speed up the backup, especially for large or complex volumes, you can configure uploader concurrency through the parallel-files-upload backup flag. By default, it is the number of CPU cores of the node where the data mover backup is running.
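
As a rough illustration of the two knobs mentioned above, the sketch below assembles a backup command that sets a per-backup timeout and the uploader concurrency. The backup name, the 8h timeout, and the concurrency of 4 are assumed placeholder values, not recommendations; the ConfigMap name is taken from the original report. The script only prints the command (a dry run), so it can be reviewed before being run against a cluster.

```shell
#!/bin/sh
# Dry-run sketch: prints a velero command instead of executing it.
# All three values below are illustrative assumptions.
BACKUP_NAME="large-pvc-backup"   # hypothetical backup name
ITEM_OP_TIMEOUT="8h"             # per-backup override of the item operation timeout
PARALLEL_FILES=4                 # number of files uploaded in parallel per volume

build_backup_cmd() {
  # Join the flags into a single command line.
  printf '%s' "velero backup create ${BACKUP_NAME}"
  printf ' %s' "--snapshot-move-data"
  printf ' %s' "--resource-policies-configmap velero-efs-resourcepolicy"
  printf ' %s' "--item-operation-timeout ${ITEM_OP_TIMEOUT}"
  printf ' %s\n' "--parallel-files-upload ${PARALLEL_FILES}"
}

build_backup_cmd
```

To change the default for all backups rather than a single one, the equivalent server-side setting would be the --default-item-operation-timeout argument on the Velero server deployment.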