leandreArturia opened 2 months ago
Based on the status of the PodVolumeRestore, it appears that the PodVolumeRestore was not processed by the node agent:
```yaml
status:
  progress: {}
```
To troubleshoot this issue, please check the overall state of your cluster; reviewing those conditions should help you identify the root cause and resolve the issue.
The node agent is correctly installed and running as a DaemonSet on every node of the cluster.
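For reference, DaemonSet health can be confirmed with `kubectl -n velero get daemonset node-agent` (namespace and name assumed from a default Helm install); a healthy DaemonSet reports DESIRED equal to READY. A minimal sketch of automating that comparison on a captured status line:

```shell
# Sample status line as printed by `kubectl get daemonset` (illustrative values,
# 6 nodes as in this cluster); columns: NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE ...
ds_line="node-agent   6   6   6   6   6   <none>   10d"

desired=$(echo "$ds_line" | awk '{print $2}')
ready=$(echo "$ds_line" | awk '{print $4}')

# The node agent is only fully deployed when every desired pod is ready.
if [ "$desired" = "$ready" ]; then
  echo "node-agent healthy on all $desired nodes"
else
  echo "node-agent not ready: $ready/$desired"
fi
```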
The pod is restored but stuck in a Pending state with the error:
```
0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
```
My PVC is also in a Pending state, with the following error (ignore the PVC UIDs; this is a test restore of a small application):
```
failed to provision volume with StorageClass "longhorn": rpc error: code = Internal desc = Bad response statusCode [500]. Status [500 Internal Server Error]. Body: [message=unable to create volume: unable to create volume pvc-b2e25434-d8fc-45da-b709-2b2c6055d235: failed to verify data source: volume.longhorn.io "pvc-ce514251-9e56-4999-9e2d-bffae4ceed16" not found, code=Server Error, detail=] from [http://longhorn-backend:9500/v1/volumes]
```
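To make the failure mode above concrete: the restored PVC carries a `dataSource` pointing at a VolumeSnapshot, and Longhorn fails provisioning because the snapshot's source volume does not exist on the target cluster. A sketch with an illustrative PVC (names and sizes are placeholders, not taken from this issue):

```shell
# Illustrative restored PVC; the dataSource is what the Longhorn provisioner
# tries to resolve, and the 500 error above means the referenced source
# volume is missing on this cluster.
cat <<'EOF' > /tmp/restored-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  storageClassName: longhorn
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 160Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: example-snapshot
EOF
# On a live cluster the equivalent inspection would be:
#   kubectl get pvc example-claim -o jsonpath='{.spec.dataSource}'
grep -A 3 'dataSource:' /tmp/restored-pvc.yaml
```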
After a little searching, I found a Longhorn issue that seems relevant to my case, because the restored PVC has its dataSource set to the VolumeSnapshot: https://github.com/longhorn/longhorn/issues/4083
I think my PVC (the one linked to the pod being restored) is created but not restored correctly.
Don't hesitate to tell me if I'm on the wrong track. I will update Longhorn and report back here once it's done.
I'm confused about this issue. Given the `deployNodeAgent` setting, I suppose the node-agent DaemonSet should not be installed with the Helm chart. Then please check whether the backed-up volume's mounting pod is annotated with `backup.velero.io/backup-volumes=<volume-name>`.
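For illustration, a pod opted into filesystem backup carries that annotation on its metadata (all names below are placeholders, not from this issue):

```shell
# Illustrative pod manifest with the backup annotation; on a live cluster the
# same effect can be had with:
#   kubectl -n <ns> annotate pod <pod-name> backup.velero.io/backup-volumes=<volume-name>
cat <<'EOF' > /tmp/annotated-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    backup.velero.io/backup-volumes: data
spec:
  containers:
  - name: app
    image: busybox
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: example-pvc
EOF
grep 'backup.velero.io/backup-volumes' /tmp/annotated-pod.yaml
```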
Please also check which uploader the filesystem backup uses, Restic or Kopia: the PodVolumeBackup says it uses Kopia, but the restore says it uses Restic.
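One way to see which uploader each PodVolumeBackup recorded is its `spec.uploaderType` field (assuming the velero.io/v1 API shape); a sketch against a saved object:

```shell
# Saved PodVolumeBackup fragment (illustrative); on a live cluster:
#   kubectl -n velero get podvolumebackups \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.uploaderType}{"\n"}{end}'
cat <<'EOF' > /tmp/pvb.yaml
apiVersion: velero.io/v1
kind: PodVolumeBackup
metadata:
  name: example-pvb
spec:
  uploaderType: kopia
EOF
grep 'uploaderType' /tmp/pvb.yaml
```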
Another thing worth noting is the Velero client/server version mismatch. Please align them:
```
Client:
    Version: v1.9.1
    Git commit: e4c84b7b3d603ba646364d5571c69a6443719bf2
Server:
    Version: v1.13.0
```
And please use Velero v1.13.2; it includes some bug fixes.
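The mismatch check above can be sketched on captured `velero version` output (simplified here to one line per component; values taken from this issue):

```shell
# Compare client and server versions from captured output.
out="Client: Version: v1.9.1
Server: Version: v1.13.0"

client=$(echo "$out" | sed -n 's/^Client: Version: //p')
server=$(echo "$out" | sed -n 's/^Server: Version: //p')

# Velero generally expects client and server to be on the same minor version.
if [ "$client" != "$server" ]; then
  echo "version mismatch: client $client vs server $server"
fi
```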
What steps did you take and what happened:
Installed Velero 1.13.0 with CSI snapshot support via Helm (`vmware-tanzu/velero --version 6.0.0`). Here is my configuration: I have a MinIO instance deployed with a self-signed certificate.
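For context, a minimal values fragment for the `vmware-tanzu/velero` chart enabling the node agent and the CSI feature might look like the following; this is a sketch under assumptions (`deployNodeAgent` and `configuration.features` are chart values, but the reporter's actual values are not shown in the issue):

```shell
# Illustrative chart values, not the reporter's actual configuration.
cat <<'EOF' > /tmp/velero-values.yaml
deployNodeAgent: true        # installs the node-agent DaemonSet
configuration:
  features: EnableCSI        # enables CSI snapshot support (Velero 1.13)
EOF
grep 'deployNodeAgent' /tmp/velero-values.yaml
```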
I have a backup of some resources and a PVC backed up with VolumeSnapshots (a fairly large PVC: 160 GB, ~117 GB used).
The PodVolumeBackup:
When I try to restore this backup, I get a timeout after ~4 hours.
Output of `kubectl -n velero get podvolumerestores -l velero.io/restore-name=jenkins-rd-backup-manual-20240422144640 -o yaml`: the progress is never updated.
And I get a timeout error in the logs.
The volume is created but empty.
Logs of the restore and `velero debug --restore jenkins-rd-backup-manual-20240422144640`:
bundle-2024-04-24-15-22-16.tar.gz
velero_restore_logs.txt
What did you expect to happen:
To get a full restore.
The following information will help us better understand what's going on:
Environment:
- Velero version (use `velero version`): Server v1.13.0, Client v1.9.1
- Velero features (use `velero client config get features`): None
- Kubernetes version (use `kubectl version`): Server Version: v1.26.9+rke2r1
- OS (e.g. from `/etc/os-release`):