slawekww closed this issue 1 year ago
I'm not sure, but I found that the VolumeSnapshot is created with a nodeAffinity that points to the exact availability zone used by the volume. Even if I want to restore it on a different AKS cluster, that cluster must be in the same Azure region and the node must be in the same availability zone. The volume nodeAffinity on the VolumeSnapshot may prevent storing the snapshot in a different Azure region. I do not know this for sure; I only suspect it.
@slawekww Do you have to use the CSI Snapshotter? If you rely on the Azure plugin for snapshots in v1.10, it is possible to take snapshots into a different VolumeSnapshotLocation (VSL).
However, this is a limitation of the CSI plugin. @blackpiglet, could you open an issue to address this requirement specifically in the scope of the CSI plugin?
I rely on Velero plugins:
AKS version 1.24.6 installs the CSI storage classes and drivers automatically, using version 1.24.0.2: mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.24.0.2
@slawekww
Let me clarify: there are two code paths for taking snapshots on Azure. You may choose NOT to rely on velero-plugin-for-csi, because in that case velero-plugin-for-microsoft-azure will call the Azure API to take the snapshot of the underlying disk. To do that, try not turning on the CSI feature flag when you install Velero.
The CSI plugin currently cannot take snapshots via the CSI snapshot API for different vsClasses. #5750 has been opened to track this work, but it won't be implemented in v1.11.
@reasonerjt Thank you for the guidance!
I will test this scenario without the velero-plugin-for-csi plugin and let you know the results.
I ran the test with Velero and velero-plugin-for-csi disabled; the result is that the backup failed.
Regardless of whether the VolumeSnapshotClass CR has parameters.resourcegroup filled in or not, Velero was not able to find the PV disk, because it always looks in the Backup RSG instead of the original RSG where AKS is deployed.
Log with error:
time="2023-01-09T13:01:17Z" level=error msg="Error backing up item" backup=velero/velero-test-20230109130055 error="error getting volume info: rpc error: code = Unknown desc = compute.DisksClient#Get: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code=\"ResourceNotFound\" Message=\"The Resource 'Microsoft.Compute/disks/pvc-68a0781c-1ce8-4657-94f1-d0c019a2386d' under resource group BACKUP_RSG' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix\"" logSource="pkg/backup/backup.go:425" name=test-85b657899-2th4r
Once I updated the velero-credentials secret so that AZURE_RESOURCE_GROUP is the AKS resource group, the backup succeeded. However, the snapshots are stored in the AKS resource group instead of Backup_RSG as defined by the VolumeSnapshotClass. Note: the Velero backupStorageLocation always points to the Storage Account in Backup_RSG.
The VolumeSnapshotLocation, rather than the VolumeSnapshotClass, will be used if you disable the CSI plugin.
For your case, you should configure the VolumeSnapshotLocation to use Backup_RSG and pass this VolumeSnapshotLocation via --volume-snapshot-locations when creating the backup. This resource group is used to store the snapshots.
Thanks! With the velero-plugin-for-csi plugin disabled, I was able to create Azure snapshots in two different Azure resource groups.
However, the snapshot location (region) is always the same as the location of the AKS cluster.
Is there any option to create snapshots in two different locations in Azure?
I don't think you can create the snapshots in a different region from the disk/AKS cluster.
Let's assume the Azure snapshots are copied manually into a different Azure resource group (location) using the Azure SDK API or an az cli command.
Could you advise what should be changed in the backup files stored by Velero so that it uses the copied Azure snapshots? Is it even possible to re-use copied Azure snapshots?
I'm not sure whether it is possible or not; we didn't test this use case.
Maybe you can go through the logic here to do more investigation and testing.
Let's close this issue, as it is basically possible to create snapshots in two different Azure resource groups.
Snapshots are always in the same Azure location (region) as the AKS cluster, regardless of the location used by the Azure resource group; this is an Azure limitation.
I may do more testing by updating the Velero backup files velero-\
{
"spec": {
"backupName": "velero-default-20230111000057",
"backupUID": "457259a5-619f-4147-a246-9fd652a17370",
"location": "default",
"PersistentVolumeName": "pvc-id",
"providerVolumeID": "pvc-volumeid",
"volumeType": "StandardSSD_LRS",
"volumeAZ": "regionorig-2" # change to regiontarget-availabilityzoneid
},
"status": {
"providerSnapshotID": "/subscriptions/subId/resourceGroups/Resoure_Group_CopiedSnapshot/providers/Microsoft.Compute/snapshots/pvc-id-79038e1d-172f-4611-91ed-b9fa91738dd6",
"phase": "Completed"
}
}
and try to run a Velero restore using those files, but it is really a hack and it may not work.
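The manual edit described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: it assumes the volume snapshot file is a gzip-compressed JSON list of records shaped like the one shown, and the function names patch_snapshot_records and patch_snapshot_file are hypothetical helpers, not part of Velero. It only rewrites the metadata; whether Velero then restores from the copied snapshots is exactly what would need testing.

```python
import copy
import gzip
import json
import re


def patch_snapshot_records(records, new_resource_group, new_volume_az=None):
    """Return patched copies of snapshot records (hypothetical helper).

    Assumes each record looks like the JSON above: a "spec" with "volumeAZ"
    and a "status" with "providerSnapshotID" holding an Azure resource ID.
    """
    patched = []
    for record in records:
        record = copy.deepcopy(record)  # leave the caller's data untouched
        status = record.get("status", {})
        if "providerSnapshotID" in status:
            # Replace only the resource group segment of the resource ID.
            status["providerSnapshotID"] = re.sub(
                r"(/resourceGroups/)[^/]+",
                lambda m: m.group(1) + new_resource_group,
                status["providerSnapshotID"],
                count=1,
                flags=re.IGNORECASE,
            )
        if new_volume_az is not None and "spec" in record:
            record["spec"]["volumeAZ"] = new_volume_az
        patched.append(record)
    return patched


def patch_snapshot_file(src_path, dst_path, new_resource_group, new_volume_az=None):
    """Read a gzip-compressed JSON list, patch it, write it to dst_path."""
    with gzip.open(src_path, "rt", encoding="utf-8") as f:
        records = json.load(f)
    patched = patch_snapshot_records(records, new_resource_group, new_volume_az)
    with gzip.open(dst_path, "wt", encoding="utf-8") as f:
        json.dump(patched, f)
```

Writing to a separate dst_path keeps the original backup file intact, so the change can be reverted if the restore does not work.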
Even though this has been closed, maybe this will help. I have tested this scenario and it definitely works pretty well. I have been doing it in order to be able to do cross-region cluster restores. What needs to be changed is volumesnapshots.json.gz, where you need to patch "providerSnapshotID" and change the resource group to the new one (the location of your copied snapshots).
Additionally, you need to patch the nodeAffinity of the actual PV resource in the backup tar file (replace the old region name with the new region name).
The pain here is the actual snapshot copy process and the manual patching of the metadata.
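As a rough sketch of the nodeAffinity patch mentioned above, the following Python function rewrites the region name in a PV manifest that has already been loaded as a dict from the backup tar file. It assumes the standard Kubernetes PersistentVolume layout (spec.nodeAffinity.required.nodeSelectorTerms[].matchExpressions[].values); the function name patch_pv_node_affinity and the region strings are hypothetical, and extracting and repacking the tar file is left out.

```python
import copy


def patch_pv_node_affinity(pv, old_region, new_region):
    """Return a copy of a PV manifest with the region renamed (hypothetical helper).

    Only the "values" of nodeAffinity match expressions are touched, e.g.
    "regionorig-2" becomes "regiontarget-2" for old_region="regionorig".
    """
    pv = copy.deepcopy(pv)  # do not modify the caller's manifest
    terms = (
        pv.get("spec", {})
        .get("nodeAffinity", {})
        .get("required", {})
        .get("nodeSelectorTerms", [])
    )
    for term in terms:
        for expr in term.get("matchExpressions", []):
            if "values" in expr:
                expr["values"] = [
                    value.replace(old_region, new_region)
                    for value in expr["values"]
                ]
    return pv
```

A plain substring replacement keeps the zone suffix unchanged, so it only fits when the target region has a zone with the same number; otherwise the values would need to be set explicitly.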
What steps did you take and what happened: Used Velero helm chart 3.0.0 with app version 1.10.0 on AKS 1.24.6 and deployed 2 Velero instances in two namespaces, test-1 and test-2, using two different StorageAccounts in two different Azure resource groups.
Define VolumeSnapshotClass:
Define VolumeSnapshotLocation:
What did you expect to happen: I expect each Velero instance to store Azure volume snapshots in a different Azure resource group. Currently only the VolumeSnapshotClass vsc-rg1 is used, and regardless of the settings in the VolumeSnapshotLocation, the volume snapshot is always stored in my-rg-1. If it were possible to define several VolumeSnapshotLocations and use a single Velero instance, I would welcome that; however, the helm chart allows only one to be defined.
The following information will help us better understand what's going on:
Debug logs are collected and attached. There is no error in the backup, but the snapshots are stored in the wrong location. bundle-2023-01-04-11-04-47.tar.gz
Environment:
Velero version (use velero version): 1.10.0
Velero features (use velero client config get features): EnableCSI
Kubernetes version (use kubectl version): AKS 1.24.6
OS (e.g. from /etc/os-release): Ubuntu 18.04 LTS (standard AKS node)
It should be allowed to store snapshots into two different Resource Groups by velero from one cluster.