Open chkp-ilyaro opened 5 months ago
Hi,
I found the following limitation:
Unsupported value: "Ubuntu2004": supported values: "Ubuntu2204", "AzureLinux"
fips_enabled nodes are available only on Ubuntu2004, but Karpenter for Azure doesn't support that image family, so Karpenter and fips_enabled nodes currently can't work together.
The fips_enabled image version path is:
/subscriptions/109a5e88-712a-48ae-9078-9ca8b3c81345/resourceGroups/AKS-Ubuntu/providers/Microsoft.Compute/galleries/AKSUbuntu/images/2004gen2fipscontainerd/versions/202405.27.0
This information can be taken from the VMSS in Azure.
Also note that the gallery paths you shared for the image version are SIG (Shared Image Gallery) paths, not Community Image Gallery paths, which are what Karpenter uses today. We do not publish the FIPS images to community image galleries.
cc: @rakechill maybe something SIG gallery support can enable?
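The difference between the two gallery types is visible in the shape of the image paths quoted in this issue. The following is a minimal sketch that classifies an image ID by its path alone; `galleryKind` is a hypothetical helper written for illustration and is not part of the Karpenter provider.

```go
package main

import (
	"fmt"
	"strings"
)

// galleryKind is a hypothetical helper (not provider code). It looks only at
// the shape of the image ID, based on the two paths quoted in this issue.
func galleryKind(imageID string) string {
	lower := strings.ToLower(imageID)
	switch {
	case strings.HasPrefix(lower, "/communitygalleries/"):
		return "community"
	case strings.HasPrefix(lower, "/subscriptions/") &&
		strings.Contains(lower, "/providers/microsoft.compute/galleries/"):
		return "sig"
	default:
		return "unknown"
	}
}

func main() {
	// Path reported from the VMSS (FIPS image).
	fmt.Println(galleryKind("/subscriptions/109a5e88-712a-48ae-9078-9ca8b3c81345/resourceGroups/AKS-Ubuntu/providers/Microsoft.Compute/galleries/AKSUbuntu/images/2004gen2fipscontainerd/versions/202405.27.0"))
	// Path from the Karpenter error message.
	fmt.Println(galleryKind("/CommunityGalleries/AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2/images/2204gen2containerd/versions/AKSUbuntu-2004fipscontainerd-202405.27.0"))
}
```

The first path is a SIG path and the second is a community gallery path, which is why the FIPS image version taken from the VMSS cannot simply be reused.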
Version
Karpenter Version: v0.0.0
Kubernetes Version: v1.0.0
Expected Behavior
Deploying nodepool.yaml from the examples as is, with only one change, should work. We need to deploy a node group with FIPS-enabled support. The imageVersion is AKSUbuntu-2004fipscontainerd-202405.27.0; this is the version that was used when I deployed AKS with Terraform and with az.
apiVersion: karpenter.azure.com/v1alpha2
kind: AKSNodeClass
metadata:
  name: default
  annotations:
    kubernetes.io/description: "General purpose AKSNodeClass for running Ubuntu2204 nodes"
spec:
  imageVersion: AKSUbuntu-2004fipscontainerd-202405.27.0
Actual Behavior
RESPONSE 404: 404 Not Found\nERROR CODE: GalleryImageNotFound\n--------------------------------------------------------------------------------\n{\n \"error\": {\n \"code\": \"GalleryImageNotFound\",\n \"message\": \"\\"The gallery image /CommunityGalleries/AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2/images/2204gen2containerd/versions/AKSUbuntu-2004fipscontainerd-202405.27.0 is not available in eastus region. Please contact image owner to replicate to this region, or change your requested region.
The image is not found in any region; I tested all of them. IMHO the path is wrong:
/CommunityGalleries/AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2/images/2204gen2containerd/versions/AKSUbuntu-2004fipscontainerd-202405.27.0
The image families are hardcoded: https://github.com/Azure/karpenter-provider-azure/blob/5ee206b44978eeb537a1d08f347fda6742b5181f/pkg/providers/imagefamily/types.go#L26
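To illustrate why the path looks wrong, here is a minimal sketch of how the ID in the error message appears to be assembled. `communityImageID` is a hypothetical helper, not the provider's actual code; the assumption is that the gallery and image definition come from the fixed Ubuntu2204 image family while spec.imageVersion is appended unchanged.

```go
package main

import "fmt"

// communityImageID is a hypothetical helper (not provider code). It joins a
// gallery name, an image definition, and a version in the same layout as the
// path shown in the GalleryImageNotFound error.
func communityImageID(gallery, definition, version string) string {
	return fmt.Sprintf("/CommunityGalleries/%s/images/%s/versions/%s", gallery, definition, version)
}

func main() {
	// Values copied from the error message in this issue.
	gallery := "AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2"
	definition := "2204gen2containerd"                    // assumed to come from the Ubuntu2204 family
	version := "AKSUbuntu-2004fipscontainerd-202405.27.0" // spec.imageVersion, passed through as is

	fmt.Println(communityImageID(gallery, definition, version))
}
```

Under that assumption, the 2004 FIPS version string ends up under the 2204gen2containerd image definition, which reproduces the path reported in the error.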
Steps to Reproduce the Problem
Follow the same steps as in the README and add imageVersion to the AKSNodeClass spec:
spec:
  imageVersion: AKSUbuntu-2004fipscontainerd-202405.27.0
Resource Specs and Logs
I deployed the inflate deployment:
k get po
NAME                       READY   STATUS    RESTARTS   AGE
inflate-5f57665f58-hcddq   1/1     Running   0          4d1h
inflate-5f57665f58-j59tc   0/1     Pending   0          4d1h
inflate-5f57665f58-nrkxg   0/1     Pending   0          4d1h
inflate-5f57665f58-p7rv5   0/1     Pending   0          4d1h
inflate-5f57665f58-r4lfq   0/1     Pending   0          3d21h
This is the log we get
$ kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller {"level":"DEBUG","time":"2024-06-23T13:31:56.914Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"bbaa9b7"}{"level":"DEBUG","time":"2024-06-23T13:32:18.928Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"bbaa9b7"} {"level":"INFO","time":"2024-06-23T13:32:19.321Z","logger":"controller.nodeclaim.lifecycle","message":"Selected instance type Standard_D8ls_v5","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"INFO","time":"2024-06-23T13:32:19.322Z","logger":"controller.nodeclaim.lifecycle","message":"Resolved image /CommunityGalleries/AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2/images/2204gen2containerd/versions/microsoft-aks:aks-aez:aks-ubuntu-containerd-2204-gen2-2023-q2:2023.04.10 for instance type Standard_D8ls_v5","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"DEBUG","time":"2024-06-23T13:32:19.323Z","logger":"controller.nodeclaim.lifecycle","message":"Returning 2 IPv4 backend pools: [/subscriptions/4b-XXXXX-XXXXXXXXX-XXXXXXXXXX/resourceGroups/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/aksOutboundBackendPool /subscriptions/4bXXXXXXXXXXXXXXXX-XXXXXXXX-XXXXXXXXX/resourceGroups/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Network/loadBalancers/kubernetes/backendAddressPools/kubernetes]","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"DEBUG","time":"2024-06-23T13:32:19.323Z","logger":"controller.nodeclaim.lifecycle","message":"Creating network interface aks-general-purpose-87qcx","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"DEBUG","time":"2024-06-23T13:32:19.929Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"bbaa9b7"} 
{"level":"DEBUG","time":"2024-06-23T13:32:19.956Z","logger":"controller.nodeclaim.lifecycle","message":"Successfully created network interface: /subscriptions/4bXXXXXX-XXXXXXXX-XXXXXXX/resourceGroups/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Network/networkInterfaces/aks-general-purpose-87qcx","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"DEBUG","time":"2024-06-23T13:32:19.956Z","logger":"controller.nodeclaim.lifecycle","message":"Creating virtual machine aks-general-purpose-87qcx (Standard_D8ls_v5)","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"DEBUG","time":"2024-06-23T13:32:20.930Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"bbaa9b7"} {"level":"DEBUG","time":"2024-06-23T13:32:21.931Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"bbaa9b7"} {"level":"DEBUG","time":"2024-06-23T13:32:22.219Z","logger":"controller.provisioner","message":"waiting on cluster sync","commit":"bbaa9b7"} {"level":"ERROR","time":"2024-06-23T13:32:22.635Z","logger":"controller.nodeclaim.lifecycle","message":"Creating virtual machine \"aks-general-purpose-87qcx\" failed: PUT https://management.azure.com/subscriptions/4bXXXXXXX_XXXXXXXX_XXXXX/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Compute/virtualMachines/aks-general-purpose-87qcx\n--------------------------------------------------------------------------------\nRESPONSE 404: 404 Not Found\nERROR CODE: GalleryImageNotFound\n--------------------------------------------------------------------------------\n{\n \"error\": {\n \"code\": \"GalleryImageNotFound\",\n \"message\": \"\\"The gallery image /CommunityGalleries/AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2/images/2204gen2containerd/versions/microsoft-aks:aks-aez:aks-ubuntu-containerd-2204-gen2-2023-q2:2023.04.10 is not available in eastus region. 
Please contact image owner to replicate to this region, or change your requested region.\\"\",\n \"target\": \"imageReference\"\n }\n}\n--------------------------------------------------------------------------------\n","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"DEBUG","time":"2024-06-23T13:32:22.931Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"bbaa9b7"} {"level":"ERROR","time":"2024-06-23T13:32:23.161Z","logger":"controller.nodeclaim.lifecycle","message":"networkInterface.Delete for aks-general-purpose-87qcx failed: DELETE https://management.azure.com/subscriptions/4b-XXXXXXXXXX-XXXXXXXX/ResourceGroups/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Network/networkInterfaces/aks-general-purpose-87qcx\n--------------------------------------------------------------------------------\nRESPONSE 400: 400 Bad Request\nERROR CODE: NicReservedForAnotherVm\n--------------------------------------------------------------------------------\n{\n \"error\": {\n \"code\": \"NicReservedForAnotherVm\",\n \"message\": \"Nic(s) in request is reserved for another Virtual Machine for 180 seconds. Please provide another nic(s) or retry after 180 seconds. 
Reserved VM: /subscriptions/4b-XXXXXXX_XXXXXXX_XXXX/resourceGroups/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Compute/virtualMachines/aks-general-purpose-87qcx\",\n \"details\": []\n }\n}\n--------------------------------------------------------------------------------\n","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"ERROR","time":"2024-06-23T13:32:23.161Z","logger":"controller.nodeclaim.lifecycle","message":"failed to cleanup resources for node claim general-purpose-87qcx, %!w(*errors.joinError=&{[0xc0019aae20]})","commit":"bbaa9b7","nodeclaim":"general-purpose-87qcx"} {"level":"ERROR","time":"2024-06-23T13:32:23.162Z","logger":"controller","message":"Reconciler error","commit":"bbaa9b7","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":{"name":"general-purpose-87qcx"},"namespace":"","name":"general-purpose-87qcx","reconcileID":"0e1c39df-c71a-436a-aa24-dc6395651ae7","error":"launching nodeclaim, creating instance, virtualMachine.BeginCreateOrUpdate for VM \"aks-general-purpose-87qcx\" failed: PUT https://management.azure.com/subscriptions/4b-xxxxxxxx-xxxx/resourceGroups/MC_use-dt-stg-rg_testAKSfipsKarp_eastus/providers/Microsoft.Compute/virtualMachines/aks-general-purpose-87qcx\n--------------------------------------------------------------------------------\nRESPONSE 404: 404 Not Found\nERROR CODE: GalleryImageNotFound\n--------------------------------------------------------------------------------\n{\n \"error\": {\n \"code\": \"GalleryImageNotFound\",\n \"message\": \"\\"The gallery image /CommunityGalleries/AKSUbuntu-38d80f77-467a-481f-a8d4-09b6d4220bd2/images/2204gen2containerd/versions/microsoft-aks:aks-aez:aks-ubuntu-containerd-2204-gen2-2023-q2:2023.04.10 is not available in eastus region. 
Please contact image owner to replicate to this region, or change your requested region.\\"\",\n \"target\": \"imageReference\"\n }\n}\n--------------------------------------------------------------------------------\n"}
Community Note