Closed gh-tek closed 7 months ago
Shortly after writing this, I figured out that this is related to a fairly recent change in external-snapshotter. I have version 7.0.1: https://github.com/kubernetes-csi/external-snapshotter/tree/v7.0.1
I found out that there is a parameter --prevent-volume-mode-conversion, and its default value recently changed from false to true as that feature progresses. That enforces the validation of the aforementioned field.
I was able to set --prevent-volume-mode-conversion=false, after which Velero started working and the backup went just fine.
So I guess this may not be a bug, but more of a feature request. Velero probably needs some changes because of that change in the snapshotter. I guess it should work with the default setting too.
Anyway, backup works for me right now, but if you need help with testing this, I can help with that.
This is due to the old SDK version Velero is using, in which the Spec.SourceVolumeMode field doesn't exist. So the problem only happens in environments that use external-snapshotter v7.0 and higher.
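To illustrate how a client type without the field can end up clearing it, here is a minimal Go sketch. It assumes the update is a typed read-modify-write, where the object is decoded into the client's struct and the whole struct is sent back. The struct names, the trimmed-down field set, and the driver name are hypothetical stand-ins, not the real Velero or external-snapshotter types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldContentSpec is a hypothetical stand-in for a spec type compiled against
// an older snapshot client package that has no SourceVolumeMode field.
type oldContentSpec struct {
	DeletionPolicy string `json:"deletionPolicy"`
	Driver         string `json:"driver"`
}

// newContentSpec is a hypothetical stand-in for the newer type that has it.
type newContentSpec struct {
	DeletionPolicy   string  `json:"deletionPolicy"`
	Driver           string  `json:"driver"`
	SourceVolumeMode *string `json:"sourceVolumeMode,omitempty"`
}

// roundTripThroughOldType decodes the server's object into the old type and
// re-encodes it, the way a typed read-modify-write update would.
// encoding/json silently ignores fields the target struct does not declare.
func roundTripThroughOldType(serverJSON []byte) ([]byte, error) {
	var spec oldContentSpec
	if err := json.Unmarshal(serverJSON, &spec); err != nil {
		return nil, err
	}
	return json.Marshal(spec)
}

// sourceVolumeMode decodes a document with the newer type and returns the
// mode a server using that type would see (nil when the field is absent).
func sourceVolumeMode(doc []byte) (*string, error) {
	var spec newContentSpec
	if err := json.Unmarshal(doc, &spec); err != nil {
		return nil, err
	}
	return spec.SourceVolumeMode, nil
}

func main() {
	server := []byte(`{"deletionPolicy":"Delete","driver":"example.csi.driver","sourceVolumeMode":"Filesystem"}`)

	updated, err := roundTripThroughOldType(server)
	if err != nil {
		panic(err)
	}
	mode, err := sourceVolumeMode(updated)
	if err != nil {
		panic(err)
	}

	fmt.Println("sent back to the server:", string(updated))
	fmt.Println("sourceVolumeMode is nil after the round trip:", mode == nil)
}
```

Under these assumptions the client never intends to touch the field; it simply cannot carry it, so the server sees a change from Filesystem to nil.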
We have the same issue with storage class driver linstor.csi.linbit.com
Operation Error: error to expose snapshot: error to remove protect from volume snapshot content: error to update VolumeSnapshotContent snapcontent-2a34e711-1149-4dd5-a80a-2161bac0fca6: admission webhook "snapshot-validation-webhook.snapshot.storage.k8s.io" denied the request: Spec.SourceVolumeMode is immutable but was changed from Filesystem to nil
However, the issue is intermittent and the backup sometimes works. No idea why it still works from time to time.
velero version: v1.12.2, csi addon 0.7.0
K8s version: v1.27.5, bare metal
CSI driver: piraeusdatastore CSI driver v1.3.0, snapshot-controller 7.0.1
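The webhook denial quoted above can be pictured with a small Go sketch of an immutability check gated by a boolean, mirroring the --prevent-volume-mode-conversion flag described earlier in the thread. This is only an assumption about the general shape of such a check: the function names and signatures are hypothetical and this is not the actual webhook code.

```go
package main

import "fmt"

// modeString renders an optional volume mode for use in messages.
func modeString(mode *string) string {
	if mode == nil {
		return "nil"
	}
	return *mode
}

// validateUpdate is a hypothetical immutability check. When
// preventVolumeModeConversion is false the field is not validated at all;
// when it is true, any difference between old and new is rejected.
func validateUpdate(preventVolumeModeConversion bool, oldMode, newMode *string) error {
	if !preventVolumeModeConversion {
		return nil
	}
	same := (oldMode == nil && newMode == nil) ||
		(oldMode != nil && newMode != nil && *oldMode == *newMode)
	if !same {
		return fmt.Errorf("Spec.SourceVolumeMode is immutable but was changed from %s to %s",
			modeString(oldMode), modeString(newMode))
	}
	return nil
}

func main() {
	filesystem := "Filesystem"

	// Validation enabled: an update that drops the field is denied.
	fmt.Println(validateUpdate(true, &filesystem, nil))

	// Validation disabled: the same update is accepted.
	fmt.Println(validateUpdate(false, &filesystem, nil))
}
```

In this sketch, turning the flag off makes the update pass without fixing the client, which matches the workaround reported above.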
Hi @gh-tek @ksyblast, I upgraded several packages used by Velero to fix this issue, but we have no environment with the latest version of the snapshot-controller installed. Could you help us verify the fix with the images yinw/velero:lib-bump01 and yinw/velero-plugin-for-csi:lib-bump01?
Hi @ywk253100,
Sorry for the delay, Gmail decided to throw my emails into Junk and I missed your message :)
I tested your images:
I have been on Velero 1.13, csi plugin 0.7.0 and external snapshotter 6.3.3, which is a working setup.
I upgraded external snapshotter to 7.0.1 and tested taking a backup. It failed as expected. I then switched to the bump images you gave and retested the same backup. It now works flawlessly with the newest external snapshotter!
Based on this test, it should be OK to merge. I will now go back to the old version and wait for the actual release. Thanks!
What steps did you take and what happened: When backing up using a volume snapshot with data movement to S3 storage, the backup ends as "PartiallyFailed". Snapshot data is not moved (it already fails on expose). The snapshot content resources are marked for deletion but are not removed, because removing the protection finalizer fails.
The error message states that expose fails. The protection removal actually tries to set the wrong (unrelated) field spec.SourceVolumeMode on the snapshot content (the clip from the log below points to the code attempting that removal; it's pretty recent code).
What did you expect to happen: There should never be attempts to change immutable fields. I guess that the information is lost somewhere, or that the field is expected to be unset (in my case it is 'Filesystem', not nil). Protection removal should succeed, the snapshot content should be removed, and the backup should end with success.
The following information will help us better understand what's going on:
The debug bundle has too much information to be made public, but here is the node-agent log containing the full error:
Snapshot content is left in state:
...
Anything else you would like to add:
This could well be a configuration error on my side (which I haven't figured out yet), but I think that trying to change an immutable field on a snapshot content resource may be some kind of bug that should not happen. I'm also using the latest external snapshotter on my cluster, with the latest snapshot CRDs.
Also, I have verified that snapshotting works when you use --snapshot-move-data=false. The snapshot is saved and is restorable. But when trying to use the default snapshot mover, it fails before moving anything (it already fails on expose). This is a new install and I have never used Velero before, so I have no experience with older versions. But snapshotting works without data move and FSB also works (though I would rather use snapshotting with data move).
Environment:
velero version: 1.13.0, csi add-on 0.7.0
velero client config get features: EnableCSI
kubectl version: 1.28
Vote on this issue!
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.