Open gravops opened 3 months ago
I am getting an issue like this:

Errors:

```
Velero: name: /app message: /Error backing up item error: /error executing custom action (groupResource=persistentvolumeclaims, namespace=ns-testapp2, name=ebs-claim-test): rpc error: code = Unknown desc = failed to get volumesnapshotclass for storageclass default-storage-class: error getting volumesnapshotclass: failed to get volumesnapshotclass for provisioner ebs.csi.aws.com, ensure that the desired volumesnapshot class has the velero.io/csi-volumesnapshot-class label
```
Hello, have you configured a VolumeSnapshotClass? For example: https://medium.com/linux-shots/backup-kubernetes-using-velero-and-csi-volume-snapshot-4155d4e32e5d
Hello @MoZadro, as suggested I created a VolumeSnapshotClass, and I am still getting the same issue. Do I need to create VolumeSnapshotClass and VolumeSnapshot objects for each provisioner type as well?
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
snapshotter: ebs.csi.aws.com
deletionPolicy: Retain
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ebs-volume-snapshot
spec:
  snapshotClassName: csi-aws-vsc
  source:
    name: ebs-claim
    kind: PersistentVolumeClaim
```
Errors:

```
Velero: message: /Error listing resources error: /the server could not find the requested resource
name: /app message: /Error backing up item error: /error executing custom action (groupResource=persistentvolumeclaims, namespace=ns-testapp2-10002-dev, name=ebs-claim-test): rpc error: code = Unknown desc = failed to get volumesnapshotclass for storageclass default-storage-class: error getting volumesnapshotclass: failed to get volumesnapshotclass for provisioner ebs.csi.aws.com, ensure that the desired volumesnapshot class has the velero.io/csi-volumesnapshot-class label
name: /ebs-claim message: /Error backing up item error: /error executing custom action (groupResource=persistentvolumeclaims, namespace=ns-testapp2-10002-dev, name=ebs-claim): rpc error: code = Unknown desc = PVC ns-testapp2-10002-dev/ebs-claim has no volume backing this claim
name: /efs-claim message: /Error backing up item error: /error executing custom action (groupResource=persistentvolumeclaims, namespace=ns-testapp2-10002-dev, name=efs-claim): rpc error: code = Unknown desc = PVC ns-testapp2-10002-dev/efs-claim has no volume backing this claim
name: /prometheus-prometheus-operator-prometheus-db-prometheus-prometheus-operator-prometheus-0 message: /Error backing up item error: /error executing custom action (groupResource=persistentvolumeclaims, namespace=prometheus, name=prometheus-prometheus-operator-prometheus-db-prometheus-prometheus-operator-prometheus-0): rpc error: code = Unknown desc = failed to get volumesnapshotclass for storageclass default-storage-class: error getting volumesnapshotclass: failed to get volumesnapshotclass for provisioner ebs.csi.aws.com, ensure that the desired volumesnapshot class has the velero.io/csi-volumesnapshot-class label
Cluster: resource: /volumesnapshotclasses message: /Error listing items error: /the server could not find the requested resource
```
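An editorial aside on the manifests pasted above: in the `snapshot.storage.k8s.io/v1` API the class field is `driver` (`snapshotter` is from the pre-v1 alpha API), and the `VolumeSnapshot` spec uses `volumeSnapshotClassName` and `source.persistentVolumeClaimName`. A corrected sketch, keeping the object names from the original:

```yaml
# Sketch with v1 field names; object names kept from the comment above
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
  labels:
    velero.io/csi-volumesnapshot-class: "true"  # label Velero looks for
driver: ebs.csi.aws.com        # v1 uses `driver`, not `snapshotter`
deletionPolicy: Retain
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ebs-volume-snapshot
spec:
  volumeSnapshotClassName: csi-aws-vsc  # v1 name of `snapshotClassName`
  source:
    persistentVolumeClaimName: ebs-claim
```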
You need to create a VolumeSnapshotClass with parameters similar to your StorageClass; to create a CSI snapshot of a PVC you do need a VolumeSnapshotClass.
For example, my StorageClass looks like this:
```yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-encryption
parameters:
  csi.storage.k8s.io/node-publish-secret-name: longhorn-crypto
  csi.storage.k8s.io/node-publish-secret-namespace: longhorn-system
  csi.storage.k8s.io/node-stage-secret-name: longhorn-crypto
  csi.storage.k8s.io/node-stage-secret-namespace: longhorn-system
  csi.storage.k8s.io/provisioner-secret-name: longhorn-crypto
  csi.storage.k8s.io/provisioner-secret-namespace: longhorn-system
  encrypted: 'true'
  fromBackup: ''
  numberOfReplicas: '2'
  staleReplicaTimeout: '2880'
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
and my VolumeSnapshotClass like so:
```yaml
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
  name: longhorn-encryption
  labels:
    velero.io/csi-volumesnapshot-class: "true"
driver: driver.longhorn.io
deletionPolicy: Delete
```
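If Velero still reports the missing-label error after creating a class like this, it is worth confirming the label is actually present. A quick check, assuming `kubectl` access to the cluster:

```shell
# List only the VolumeSnapshotClasses carrying the label Velero looks for;
# the class above should appear in this output
kubectl get volumesnapshotclass -l velero.io/csi-volumesnapshot-class=true
```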
@MoZadro, I created a VolumeSnapshotClass for each of my StorageClasses. I am now testing by creating a backup from a schedule; I think this will work now, but I will confirm once a backup and restore succeed.
SC:

```
NAME                              PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default-storage-class (default)   ebs.csi.aws.com         Delete          WaitForFirstConsumer   false                  77d
efs-sc-ns                         efs.csi.aws.com         Retain          Immediate              false                  48d
gp2                               kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  204d
```

Volume Snapshot Class:

```
NAME              DRIVER                  DELETIONPOLICY   AGE
csi-aws-vsc-ebs   ebs.csi.aws.com         Delete           11m
csi-aws-vsc-efs   efs.csi.aws.com         Retain           11m
csi-k8s-vsc-ebs   kubernetes.io/aws-ebs   Delete           11m
```
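An editorial note: if one of the classes in a listing like this should act as Velero's default for its driver, the label can be added in place rather than recreating the object. A sketch, using the EBS class name from the listing above:

```shell
# Add the label Velero's CSI plugin looks for when picking a default class
kubectl label volumesnapshotclass csi-aws-vsc-ebs velero.io/csi-volumesnapshot-class=true
```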
But I have a few queries:
@gravops I don't work on Velero, so I can't provide answers :) I tried to help because I had a similar issue with the CSI plugin :)
No problem, I will gather this information. Many thanks for the help @MoZadro 👍 :) Appreciate it!
It started checking the PVCs now, but I am getting timeouts when Velero tries to back up the PVCs.

```
Velero: message: /Timed out awaiting reconciliation of volumesnapshot ns-testapp2/velero-ebs-claim-test-kq7f6
name: /app message: /Error backing up item error: /error executing custom action (groupResource=volumesnapshots.snapshot.storage.k8s.io, namespace=ns-testapp2, name=velero-ebs-claim-test-kq7f6): rpc error: code = Unknown desc = timed out waiting for the condition
```
Now the PV backup is timing out. I also found that a random string is attached to the PVC name in the logs (velero-ebs-claim-test-kw8cz): my PVC name is "velero-ebs-claim-test", but "kw8cz" is appended.
Errors:

```
Velero: message: /Timed out awaiting reconciliation of volumesnapshot ns-testapp2-10002-dev/velero-ebs-claim-test-kw8cz
name: /app message: /Error backing up item error: /error executing custom action (groupResource=volumesnapshots.snapshot.storage.k8s.io, namespace=ns-testapp2-10002-dev, name=velero-ebs-claim-test-kw8cz): rpc error: code = Unknown desc = timed out waiting for the condition
```
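On the random suffix: Velero creates the VolumeSnapshot object with a Kubernetes `generateName` derived from the PVC name, so `kw8cz` is the server-generated suffix of the snapshot object, not a change to the PVC itself. To see why reconciliation timed out, describing the generated objects usually surfaces the underlying CSI error (namespace and name taken from the error above):

```shell
# Show the VolumeSnapshot's status and events (look for readyToUse / errors)
kubectl -n ns-testapp2-10002-dev describe volumesnapshot velero-ebs-claim-test-kw8cz
# Check whether a VolumeSnapshotContent was created and bound for it
kubectl get volumesnapshotcontent
```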
I have the below config for the SC and VSC:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: default-storage-class
parameters:
  encrypted: "true"
  kmsKeyId: ""
  type: gp3
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
VSC:
```yaml
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: ebs.csi.aws.com
kind: VolumeSnapshotClass
metadata:
  labels:
    velero.io/csi-volumesnapshot-class: "true"
  name: csi-aws-vsc-ebs
```
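A snapshot that never reconciles often points at the snapshot machinery rather than the classes: the external snapshot-controller must be installed (EKS does not ship it by default) and the EBS CSI driver's snapshotter sidecar must be running. A quick check, assuming the components live in `kube-system` as in a typical EKS setup:

```shell
# Both the snapshot-controller and the ebs-csi-controller pods should be Running
kubectl get pods -n kube-system | grep -E 'snapshot-controller|ebs-csi'
```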
https://velero.io/docs/v1.13/csi/#implementation-choices
Please check this document to find out how the VolumeSnapshotClass should be created.
In short, if you prefer to have a default VolumeSnapshotClass, apply the label velero.io/csi-volumesnapshot-class: "true" to that VolumeSnapshotClass.
You can also fine-tune the VolumeSnapshotClass selection per backup if multiple classes are needed.
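For the multiple-classes case, the linked doc also describes per-backup selection via an annotation on the Backup or Schedule keyed by driver name. The sketch below is based on my reading of the v1.13 CSI docs; treat the annotation key as an assumption and verify it against the doc before relying on it:

```yaml
# Hypothetical sketch: per-backup VolumeSnapshotClass selection.
# Verify the exact annotation key against the linked v1.13 CSI doc.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup-with-explicit-vsclass
  namespace: velero
  annotations:
    # key: velero.io/csi-volumesnapshot-class_<CSI driver name>
    # value: the VolumeSnapshotClass to use for that driver
    velero.io/csi-volumesnapshot-class_ebs.csi.aws.com: csi-aws-vsc-ebs
spec:
  includedNamespaces:
    - ns-testapp2
```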
I have a similar issue. Although `enableCSI=false`, backups are still failing with the message `the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)` (AWS/EBS).
@makarov-roman do you have the full log line showing which line of code the error was printed from?
Hello, what if we have multiple StorageClasses on the cluster? Do we need to create a VolumeSnapshotClass object for each StorageClass? And since we also need to add the label velero.io/csi-volumesnapshot-class=true to a VolumeSnapshotClass to make it the default class for volume snapshots created by Velero, can only one VolumeSnapshotClass be the default?
@MoZadro If you have multiple StorageClasses then you will need to create a VSC for each one of them, and you will also need to remove the velero.io/csi-volumesnapshot-class=true label from your VSC definitions. I did the same and it worked for me.
So if I have multiple StorageClasses and multiple VolumeSnapshotClasses, I need to remove velero.io/csi-volumesnapshot-class=true so that this label is not defined on any VolumeSnapshotClass?
Yes, that is the only way it works for me.
> @makarov-roman do you have the full log line where it says which line of code the error was printed from?

Sorry, not anymore. I've migrated all environments to the CSI snapshotter.
> I have a similar issue. Although `enableCSI=false`, backups are still failing with the message `the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)` (AWS/EBS).

That means the VolumeSnapshotClass CRD was not installed in the EKS environment.
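A quick way to confirm whether the snapshot CRDs are present (they are shipped by the kubernetes-csi/external-snapshotter project, not by EKS itself):

```shell
# If this prints nothing, the VolumeSnapshot / VolumeSnapshotClass /
# VolumeSnapshotContent CRDs are not installed in the cluster
kubectl get crd | grep snapshot.storage.k8s.io
```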
> > I have a similar issue. Although `enableCSI=false`, backups are still failing with the message `the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)` (AWS/EBS).
>
> That means the VolumeSnapshotClass CRD was not installed in the EKS environment.

It wasn't. Why is it required without `enableCSI`? It wasn't required before, and this was a bit unexpected.
@makarov-roman The main action of enabling CSI is to make CSI snapshots of volumes that don't use fs-backup. This entails creating VolumeSnapshots and VolumeSnapshotContents for PVCs to back up. This won't work without a VolumeSnapshotClass for your VolumeSnapshots. If you were on 1.8 before and are on 1.13 now, it could be that you have v1beta1 VolumeSnapshotClass defined but not v1. The CSI plugin moved from v1beta1 to v1 for VS, VSC, and VSClass in either Velero 1.9 or Velero 1.10 -- I forget the exact release, but I'm pretty sure that 1.8 still used the beta version.
@sseago well, in my case the migration was from 1.12 to 1.13, and I had exactly the same error as the OP. And I didn't have any VolumeSnapshotClass installed (as well as `useSnapshot=false` and no `enableCSI` flag). @gravops can you confirm that it's also true for you? Before the update it worked without any VolumeSnapshotClass.
Several similar issues are intertwined here. @makarov-roman Although your scenario is also related to the VolumeSnapshotClass CRD, it differs from the original issue.
> the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)

That error looks like client-go failing to get the VolumeSnapshotClass CRD from the kube-apiserver. It is more of a Kubernetes resource discovery and collection problem, not related to backing up volume data with CSI snapshots.
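The discovery angle can be checked directly: if the API group is served, its resources show up in the discovery list (assumes `kubectl` access):

```shell
# Lists VolumeSnapshot / VolumeSnapshotClass / VolumeSnapshotContent
# if the snapshot API group is being served by the apiserver
kubectl api-resources --api-group=snapshot.storage.k8s.io
```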
Thanks for the response @blackpiglet. Ah, I think you're right, it's different. We didn't use snapshots, but after the Velero update from v1.12 to v1.13 the runtime failed because of the missing VolumeSnapshotClass dependency.
@makarov-roman Thanks. I see.
This is fixed in the main branch by PR #7515, but it's not cherry-picked into the 1.13 branch yet. I will create the cherry-pick PR.
**What steps did you take and what happened:** Installed the latest Helm chart 6.0.0 with Velero image 1.13.1, AWS plugin 1.9.1, and CSI plugin 0.7.0.

**What did you expect to happen:** While creating a backup from the CLI using a schedule, the PV snapshot backup is not working; it is failing. bundle-2024-04-11-16-38-03.tar.gz
**The following information will help us better understand what's going on:**

If you are using velero v1.7.0+:
Please use `velero debug --backup <backupname> --restore <restorename>` to generate the support bundle and attach it to this issue. For more options, refer to `velero debug --help`.

If you are using earlier versions:
Please provide the output of the following commands (pasting long output into a GitHub gist or other pastebin is fine):
- `kubectl logs deployment/velero -n velero`
- `velero backup describe <backupname>` or `kubectl get backup/<backupname> -n velero -o yaml`
- `velero backup logs <backupname>`
- `velero restore describe <restorename>` or `kubectl get restore/<restorename> -n velero -o yaml`
- `velero restore logs <restorename>`
**Anything else you would like to add:**

**Environment:**
- Velero version (use `velero version`):
- Velero features (use `velero client config get features`):
- Kubernetes version (use `kubectl version`):
- OS (e.g. from `/etc/os-release`):

**Vote on this issue!**
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.