Closed: daniel-weisse closed this 2 weeks ago
do you have any specific "how-to-test" instructions?
Following the instructions from the README should be enough.

edit: I just double-checked against the official CAA setup for Azure, which deploys Peer Pod VMs in its own separate resource group. This will cause issues because the CSI driver cannot attach disks it created in the AKS resource group to the VMs in the other resource group. To fix this, change all references of `$AZURE_RESOURCE_GROUP` in the Azure deployment guide to `$AKS_RG`.

I'll check if it's easily possible to configure the CSI driver to create the disks outside the AKS resource group.
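To make the suggestion concrete, here is a minimal sketch of the kind of substitution meant above; the actual steps live in the CAA Azure deployment guide, and the values below are placeholders:

```bash
# Sketch only: values are placeholders, the real steps are in the CAA Azure guide.
# AKS_RG is the resource group holding the AKS node VMs (and the CSI-created disks).
export AKS_RG="my-aks-node-resource-group"

# A disk-related step that the guide expressed with $AZURE_RESOURCE_GROUP ...
#   az disk list -g "${AZURE_RESOURCE_GROUP}" -o table
# ... would instead reference $AKS_RG:
az disk list -g "${AKS_RG}" -o table
```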
Thanks for the heads-up. The `$AKS_RG` is sort of managed, i.e. if the cluster is removed, the RG is discarded too. So we avoid putting resources in that RG if that's feasible; especially for disks we might not want this.
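For reference, the node resource group that AKS manages (and deletes together with the cluster) can be looked up with the Azure CLI; the cluster and group names here are placeholders:

```bash
# Placeholders: AZURE_RESOURCE_GROUP is the group the cluster was created in,
# CLUSTER_NAME is the AKS cluster. The returned group shares the cluster's lifecycle.
az aks show \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --name "${CLUSTER_NAME}" \
  --query nodeResourceGroup \
  --output tsv
```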
Managed to come up with a somewhat clean solution that allows following the documented way of setting up CAA on AKS. The PR should now be fully functional when following the instructions from the Azure docs and the README from this PR.
I'm currently following the guide using a fresh AKS + CAA installation from main and I don't seem to get the azuredisk-csi-driver to work (Deploy azuredisk-csi-driver on the cluster + Option A). Somehow the mount is not set up properly.

For the azurefile PV:

```
$ kubectl exec nginx-pv -c nginx -- mount | grep mount-path
//....file.core.windows.net/pvc-40dd3dcf-603c-4877-a3e4-b489482bfc44 on /mount-path type cifs (rw,relatime,vers=3.1.1,cache=strict,username=...,uid=0,noforceuid,gid=0,noforcegid,addr=...,file_mode=0777,dir_mode=0777,soft,persistenthandles,nounix,serverino,mapposix,mfsymlinks,reparse=nfs,rsize=1048576,wsize=1048576,bsize=1048576,retrans=1,echo_interval=60,nosharesock,actimeo=30,closetimeo=1)
```

For the azuredisk PV:

```
$ kubectl exec nginx-pv-disk -c nginx -- mount | grep mount-path
tmpfs on /mount-path type tmpfs (rw,relatime,size=1912876k,nr_inodes=1048576,mode=755,inode64)
```

The disk is attached to the pod VM:

```
$ az disk show -n pvc-e1355511-b329-47e2-b539-1a8600eb5930 -g mgns | jq -c '[.diskState, .managedBy]'
["Attached","/subscriptions/..../resourceGroups/mgns/providers/Microsoft.Compute/virtualMachines/podvm-nginx-pv-disk-f6e92b43"]
```
The mount path being a tmpfs seems very strange to me and it's a problem I haven't seen before while trying to get the Azure driver running. I used a script to set up my cluster, so it's possible I missed something. I'll try to investigate this more tomorrow.
I'll leave my cluster in this state, feel free to reach out on slack for debugging
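As a sketch of generic checks that might help narrow this down (not from the thread; the pod name is the one from the output above, and the CSI daemonset labels can differ per install):

```bash
# Inspect the PV/PVC and the VolumeAttachment objects the CSI flow created
kubectl get pvc,pv
kubectl get volumeattachments

# Events on the pod often show whether the attach/mount stage failed
kubectl describe pod nginx-pv-disk

# Logs of the azuredisk CSI node plugin (label/namespace may differ per install)
kubectl logs -n kube-system -l app=csi-azuredisk-node -c azuredisk --tail=100
```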
Squashed the last 3 commits; should be good to merge once the tests have run again.
The csi-wrapper failure should get fixed by a rebase, as the base image has changed. I'm rebasing and merging this.
As a follow-up to https://github.com/confidential-containers/cloud-api-adaptor/pull/2108 and https://github.com/confidential-containers/cloud-api-adaptor/pull/2106, this PR adds the required changes to enable the csi-wrapper for the azuredisk-csi-driver. Changes are required in two places:

- Azure resource IDs contain `/` characters, which are illegal for K8s resource names. This is fixed by using the base name of the resource ID instead.
- When calling `ControllerPublishVolume` for the Pod VM, we cannot use the peerpod volume `VMID` field to replace the `NodeID` of the `ControllerPublishVolumeRequest` as, once again, it's the full Azure resource ID, and the azuredisk-csi-driver only expects the name of the VM to publish the volume on. This is also fixed by using the base name instead.

It also includes examples for using the azuredisk-csi-driver to create a Pod consuming a dynamically provisioned PVC or a statically provisioned PVC.
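To illustrate the base-name idea (not the actual wrapper code, just a sketch): the last path segment of an Azure resource ID is what both K8s object names and the driver's expected VM name correspond to. The resource ID below is made up:

```bash
# Illustrative only: the resource ID is a fabricated example.
# The last path segment is what gets used as the K8s-safe name / VM name.
RESOURCE_ID="/subscriptions/0000/resourceGroups/my-rg/providers/Microsoft.Compute/virtualMachines/podvm-nginx-pv-disk-f6e92b43"
basename "${RESOURCE_ID}"
# -> podvm-nginx-pv-disk-f6e92b43
```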