Open withinboredom opened 10 months ago
It will likely need a support bundle (maybe one from each cluster). Is it possible to mail it to longhorn-support-bundle@Suse.com?
It looks like the mutating webhook somehow isn't working. @james-munson Can you help check this part? Thank you.
https://github.com/longhorn/longhorn-manager/blob/v1.5.3/webhook/resources/volume/mutator.go#L59-L61
Agreed. @ejweber notes (in Slack):
Ran some quick tests. It is fine to create a volume with those fields set to empty. Our mutating webhook mutates them BEFORE (I think) Kubernetes does any validation. However, deleting the mutatingwebhookconfiguration and then creating a volume CR with snapshotDataIntegrity: "" yields:
k apply -f volume.yaml
The Volume "test" is invalid: spec.snapshotDataIntegrity: Unsupported value: "": supported values: "ignored", "disabled", "enabled", "fast-check"
I think the user's mutating webhook is broken.
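To make the ordering described above concrete, here is a minimal Go sketch of "mutate first, then validate". It is not the longhorn-manager code: `VolumeSpec`, `mutate`, `validate`, and the `"ignored"` default are hypothetical names and assumptions for illustration. Only the list of supported values is taken from the error message above.

```go
package main

import "fmt"

// supported mirrors the enum values listed in the validation error above.
var supported = []string{"ignored", "disabled", "enabled", "fast-check"}

// assumedDefault is an assumption for illustration, not confirmed in this thread.
const assumedDefault = "ignored"

// VolumeSpec is a hypothetical, trimmed-down stand-in for the Volume CR spec.
type VolumeSpec struct {
	SnapshotDataIntegrity string
}

// mutate fills in an empty field, the way a defaulting webhook would.
func mutate(spec VolumeSpec) VolumeSpec {
	if spec.SnapshotDataIntegrity == "" {
		spec.SnapshotDataIntegrity = assumedDefault
	}
	return spec
}

// validate rejects any value that is not in the supported enum.
func validate(spec VolumeSpec) error {
	for _, v := range supported {
		if spec.SnapshotDataIntegrity == v {
			return nil
		}
	}
	return fmt.Errorf("spec.snapshotDataIntegrity: Unsupported value: %q", spec.SnapshotDataIntegrity)
}

func main() {
	spec := VolumeSpec{SnapshotDataIntegrity: ""}
	// Without the mutating step, the empty value reaches validation and is rejected.
	fmt.Println("webhook skipped:", validate(spec))
	// With the mutating step, the empty value is defaulted before validation.
	fmt.Println("webhook applied:", validate(mutate(spec)))
}
```

If the mutating step never runs, the empty string reaches enum validation unchanged, which would match the error reported in this issue.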
and @PhanLe1010:
Currently the webhook has failurePolicy: Fail, so if the request failed at the mutating webhook level I would expect a different error, such as failing to reach the webhook or connection refused. In other words, I agree that the manager and webhook are functional. The question now is whether the MutatingWebhookConfiguration exists and whether it has the correct config.
I worked around the issue by deleting ALL backups (simply taking another backup wouldn't resolve the problem) and then taking a new backup. I just wanted to let you know that this worked for me.
However, it does concern me that this can happen in disaster recovery scenarios. Could this backup series be corrupted at some point and never resolved by taking more backups?
I was unable to send the support bundle to the email address. I'll upload it to s3 before this weekend.
I remember we've improved the webhook in v1.6.0. @james-munson Do you remember which one?
Describe the bug
When loading a backup (migrating a workload to a new cluster; both Longhorn installations are version 1.5.3, with exactly the same settings on each), the backup fails to import with the following error (formatting mine):
Note: most other volumes were able to be migrated/restored just fine.
I've checked the workarounds (thinking it was like #6582), but they do not apply.
To Reproduce
Not sure.
Expected behavior
To be able to restore a backup even if an unsupported value is present. IMHO it should just use the default and show a warning; backups should always be restorable.
Support bundle for troubleshooting
Support bundles:
Too big to upload (147 MB), but available upon request.
Environment
5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Additional context