vmware-tanzu / velero

Backup and migrate Kubernetes applications and their persistent volumes
https://velero.io
Apache License 2.0
8.79k stars 1.41k forks

fs-backup cleanup hangs when failing #8394

Closed Lyndon-Li closed 1 week ago

Lyndon-Li commented 1 week ago

1. Velero with default node-agent concurrency (1)
2. Create a PVB
3. When the PVB is in InProgress status, find some way to fail the PVB (don't restart node-agent)
4. Run another fs-backup or data mover
5. The other fs-backup or data mover fails to start and the log complains "Data path instance is concurrent limited requeue later"

The problem is with the failing PVB: its cleanup hangs, and the logs stop as shown below:

time="2024-10-28T10:46:46Z" level=error msg="Async fs backup data path failed" controller=PodVolumeBackup error="Failed to ...
time="2024-10-28T10:46:46Z" level=info msg="Action finished" backup=velero/test14 controller=podvolumebackup logSource="pkg/uploader/provider/kopia.go:91" podvolumebackup=velero/test14-n2nlw
time="2024-10-28T10:46:46Z" level=info msg="Closing FileSystemBR" backup=velero/test14 controller=podvolumebackup logSource="pkg/datapath/file_system.go:145" podvolumebackup=velero/test14-n2nlw user=test14-n2nlw
<no more logs for this VGDP thread>
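
The symptom above is consistent with a concurrency slot that is only released after cleanup returns. The following Go program is a minimal sketch of that failure mode, not Velero's actual code: `slotLimiter`, `startTask`, `simulate`, and `errConcurrentLimited` are hypothetical names introduced for illustration, and the assumption that the slot is released after cleanup is an inference from the report, not something the logs confirm.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errConcurrentLimited is a hypothetical stand-in for the
// "concurrent limited" condition reported in the log.
var errConcurrentLimited = errors.New("data path instance is concurrent limited")

// slotLimiter is a hypothetical limiter with a fixed number of slots.
// It is an illustration only, not Velero's implementation.
type slotLimiter struct {
	mu    sync.Mutex
	inUse int
	limit int
}

// acquire takes a slot, or fails immediately when all slots are in use.
func (l *slotLimiter) acquire() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inUse >= l.limit {
		return errConcurrentLimited
	}
	l.inUse++
	return nil
}

// release frees one slot.
func (l *slotLimiter) release() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.inUse > 0 {
		l.inUse--
	}
}

// startTask acquires a slot and runs cleanup in the background.
// The slot is released only after cleanup returns (assumed ordering).
// The returned channel is closed once the slot has been released.
func startTask(l *slotLimiter, cleanup func()) (<-chan struct{}, error) {
	if err := l.acquire(); err != nil {
		return nil, err
	}
	done := make(chan struct{})
	go func() {
		cleanup()   // if this never returns, release is never reached
		l.release() // frees the slot for the next task
		close(done)
	}()
	return done, nil
}

// simulate starts one task on a limiter with a single slot, waits briefly
// for its cleanup, then reports whether a second task could take the slot.
func simulate(cleanup func()) error {
	l := &slotLimiter{limit: 1}
	done, err := startTask(l, cleanup)
	if err != nil {
		return err
	}
	select {
	case <-done:
	case <-time.After(100 * time.Millisecond): // give up waiting on a hung cleanup
	}
	return l.acquire()
}

func main() {
	hang := make(chan struct{}) // never closed: models a cleanup that hangs
	fmt.Println("hanging cleanup:", simulate(func() { <-hang }))
	fmt.Println("returning cleanup:", simulate(func() {}))
}
```

In this sketch, the second task is rejected for as long as the first task's cleanup is blocked, and it is admitted once cleanup returns. With a limit of 1, a single hung cleanup is enough to block every later task, which matches the reproduction steps.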