I agree we should do better in this scenario. There are several similar cases, so let's settle on a unified solution for all of them. IMO, we should treat k8s resources such as Pod, PV, and PVC as crucial for Velero.
First, the potential errors should be converted to warnings. Second, we need to consider whether these volumes should be tracked by the skipped-PV tracker.
Hi team, thanks for creating this backup/restore tooling. Unfortunately, we hit this issue because our serverless applications rely on an RWX-mode PVC. We schedule the backup at midnight, when traffic is low, which also means the running pods are scaled down to zero. As a result, the backup didn't cover our serverless PVCs :(
@hsinhoyeh Could you give more information about your scenario? Are you using file-system backup or volume snapshot backup? Is the PVC mounted by multiple pods while the backup is in progress?
Hi @blackpiglet, we use file-system backup. The PVC is supposed to be mounted by multiple pods (access mode RWX). That said, our pods mostly read from the PVC during the backup rather than writing to it.
@hsinhoyeh Thanks for the feedback. Could you share the backup command or the backup CR YAML?
If no pod mounts the PVC while a backup is in progress, the file-system backup cannot cover that PVC, because the file-system uploader reads the PVC's volume data through the pod's mount directory on the k8s node. Please read the PodVolumeBackup description to understand how it works: https://velero.io/docs/v1.13/file-system-backup/#custom-resource-and-controllers.
For your scenario, if the PVC's volume supports snapshots, you can use a snapshot-based backup to cover the data even when no pod mounts the volume.
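To illustrate that suggestion, here is a minimal sketch of a Backup CR that relies on volume snapshots instead of pod-based file-system copies. The backup name and the `my-serverless-app` namespace are placeholders, and snapshot support still depends on your storage provider / CSI driver:

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: nightly-backup             # hypothetical backup name
  namespace: velero                # namespace where Velero is installed
spec:
  includedNamespaces:
    - my-serverless-app            # hypothetical application namespace
  snapshotVolumes: true            # take volume snapshots for eligible PVs
  defaultVolumesToFsBackup: false  # do not rely on pod-mounted file-system backup
```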
This is working as expected. I don't think we want to change the error into a warning, since that would be a breaking change.
Is there any way to exclude PVCs with unbound PVs? As long as that is not possible, I would consider this a Warning rather than an Error.
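One option that may help here, assuming you know in advance which PVCs to skip, is Velero's resource-filtering label `velero.io/exclude-from-backup=true`. A minimal sketch with a hypothetical PVC named `scratch-data`:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-data              # hypothetical PVC that no pod mounts at backup time
  namespace: my-serverless-app    # hypothetical namespace
  labels:
    # Velero skips resources carrying this label (see the resource-filtering docs).
    velero.io/exclude-from-backup: "true"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```

The trade-off is that the PVC object itself is excluded from the backup as well, so this only fits volumes whose data you intentionally do not need to protect.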
Describe the problem/challenge you have
Backing up a namespace that contains PVCs not in use by any pod results in a PartiallyFailed backup.
Describe the solution you'd like
A warning should be enough to let the user notice whether or not this workload has an issue.
Anything else you would like to add:
Environment:
- Velero version (use velero version):
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):