Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[BUG] 10-azure.conflist file missing from node following node version upgrade #4349

Closed fozturner closed 1 month ago

fozturner commented 4 months ago

Describe the bug: After updating our AKS node images to version AKSCBLMariner-V2gen2-202405.20.0, we noticed that the CNI conflist file /etc/cni/net.d/10-azure.conflist is missing from the new nodes.

This issue is present on the even newer AKSCBLMariner-V2gen2-202405.27.0 node image too.

To Reproduce Steps to reproduce the behavior:

  1. Deploy AKS cluster with CNI enabled. For context I used this command:

     az aks create --resource-group $resourcegroup --name $aksclustername \
       --outbound-type userAssignedNATGateway --aad-tenant-id <tenant_id> \
       --enable-aad --enable-azure-rbac --enable-oidc-issuer --enable-workload-identity \
       --max-pods 50 --network-plugin azure --node-count 2 --node-vm-size Standard_D2s_v3 \
       --os-sku AzureLinux --pod-subnet-id <system_pod_subnet_id> \
       --vnet-subnet-id <system_node_subnet_id> \
       --api-server-authorized-ip-ranges <ips-to-whitelist> \
       --tier free --dns-service-ip 10.0.0.10 --kubernetes-version "1.27.9"

  2. Log into the node: kubectl debug node/<node-name> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0

  3. Navigate to the CNI directory and list its contents.

The 10-azure.conflist file is not present.
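Concretely, steps 2-3 amount to roughly the following; the debug container mounts the node's root filesystem under /host:

# Start a debug pod on the node; the node's filesystem is mounted at /host.
kubectl debug node/<node-name> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0

# Inside the debug container, list the CNI config directory on the host.
ls -l /host/etc/cni/net.d/
# Before the node image upgrade this listing included 10-azure.conflist;
# on the upgraded nodes only 15-azure-swift.conflist is present.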

[screenshot: directory listing of /host/etc/cni/net.d showing 10-azure.conflist is not present]

Expected behavior: I would expect a 10-azure.conflist file to be present in the /host/etc/cni/net.d directory.


Additional context: Other clusters running older node image versions still have the 10-azure.conflist file present. Region: UK South.

Related to Azure/azure-container-networking/issues/2779 and Azure/AgentBaker/issues/4499

fozturner commented 4 months ago

@wedaly

Additional info as requested in the issue:

k8s version: 1.27.9
networkPlugin: "Azure"
networkPluginMode: "null"
podSubnetId (if any): see below

We have one subnet for the app pods and one for the system pods; both are delegated to Microsoft.ContainerService/managedClusters:

"/subscriptions/{redacted}/resourceGroups/{redacted}/providers/Microsoft.Network/virtualNetworks/{redacted}/subnets/akssyspod-uat-uks-snet"

"/subscriptions/{redacted}/resourceGroups/{redacted}/providers/Microsoft.Network/virtualNetworks/{redacted}/subnets/aksapppod-uat-uks-snet"

rbtr commented 4 months ago

I see that you have 15-azure-swift.conflist, so I think everything worked as expected from our end here. What specifically is the issue with 10-azure.conflist not being present? The CRI will load the conflist correctly regardless of name, and networking should be functional here.

james-bjss commented 4 months ago

> I see that you have 15-azure-swift.conflist, so I think everything worked as expected from our end here. What specifically is the issue with 10-azure.conflist not being present? The CRI will load the conflist correctly regardless of name, and networking should be functional here.

The issue is that CNIs like Kuma mesh need to know which CNI config file to chain to. If the expected file is not present then chaining fails, and these conflist file names differ between our clusters.

On one cluster the file is 10-azure.conflist; another simply has 15-azure-swift.conflist, despite using the same configuration and Azure CNI version. The only change was updating the node image, so we are trying to understand whether the behaviour changed and which component caused it.

From our point of view, the only thing that changed between environments was an update to the image used by the node.

If we need to configure Kuma to use the swift conflist we can change that config, but we would like to understand what caused the change and what controls it, mainly so we can pre-empt issues in the future by checking release notes etc.

The relevant docs from Kong/Kuma https://docs.konghq.com/mesh/latest/production/dp-config/cni/

kumactl install control-plane \
  --set "kuma.cni.enabled=true" \
  --set "kuma.cni.chained=true" \
  --set "kuma.cni.netDir=/etc/cni/net.d" \
  --set "kuma.cni.binDir=/opt/cni/bin" \
  --set "kuma.cni.confName=10-azure.conflist" \
  | kubectl apply -f -
rbtr commented 4 months ago

@james-bjss I don't think AKS makes any guarantees of support for CNI chaining, but if that's documented somewhere, point me to it. The conflist name change doesn't break AKS networking, there isn't a contract being violated here between CNI and the CRI, and it's not a bug.

You may be able to mitigate by changing the configuration for the conflist name in your other CNI. Note that the AzCNI conflist may vary between AKS versions, base images, and AKS network modes. Ex it is 10-azure.conflist for node subnet, but 15-azure-overlay.conflist for Overlay mode. Also note that if you are mutating the conflist, AKS may reconcile it back to the target state periodically so this may not be a viable long term solution, depending on how your chaining plugin handles that.

james-bjss commented 4 months ago

> @james-bjss I don't think AKS makes any guarantees of support for CNI chaining, but if that's documented somewhere, point me to it. The conflist name change doesn't break AKS networking, there isn't a contract being violated here between CNI and the CRI, and it's not a bug.
>
> You may be able to mitigate by changing the configuration for the conflist name in your other CNI. Note that the AzCNI conflist may vary between AKS versions, base images, and AKS network modes. Ex it is 10-azure.conflist for node subnet, but 15-azure-overlay.conflist for Overlay mode. Also note that if you are mutating the conflist, AKS may reconcile it back to the target state periodically so this may not be a viable long term solution, depending on how your chaining plugin handles that.

Thanks @rbtr. I suppose we were trying to ascertain whether this was expected behaviour or not. Now we know that there is no guarantee that this conflist will be present, I guess the next thing to do is to report it to the Kuma project and see what they advise.

rbtr commented 4 months ago

FWIW their docs for GKE basically say to do this (check what the conflist is named, and use that name):

[screenshot: excerpt from the Kuma GKE docs advising to check the name of the conflist file on the node and use that name]
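Adapted to AKS, that approach would look roughly like this (a sketch only: it assumes the conflist name has already been checked on a node, e.g. via the debug-pod steps above, and reuses the kumactl command from earlier in the thread):

# Name found under /host/etc/cni/net.d/ on the node; this varies by network mode,
# e.g. 10-azure.conflist for node subnet vs 15-azure-swift.conflist for pod subnet.
CONF_NAME=15-azure-swift.conflist

kumactl install control-plane \
  --set "kuma.cni.enabled=true" \
  --set "kuma.cni.chained=true" \
  --set "kuma.cni.netDir=/etc/cni/net.d" \
  --set "kuma.cni.binDir=/opt/cni/bin" \
  --set "kuma.cni.confName=${CONF_NAME}" \
  | kubectl apply -f -

This only addresses the naming mismatch; whether chaining against the swift conflist actually works is a separate question (see the next comment).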

fozturner commented 4 months ago

I have some additional information on this.

Firstly, if we do use the 15-azure-swift.conflist file in our service mesh config, the pods fall into CrashLoopBackOff.

But after some further testing it appears that if I create a new cluster and do NOT set the --pod-subnet-id, then the nodes do have the 10-azure.conflist file present. If I do set the --pod-subnet-id then the nodes have the 15-azure-swift.conflist file present instead.

It appears that setting --pod-subnet-id changes the configuration so that the azure-dns daemonset is deployed.

The clusters we are running have had the --pod-subnet-id set since day 1 and have been running quite happily, so whatever has/is being implemented for the "swift" configuration is breaking something.
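For anyone reproducing the comparison, a trimmed-down sketch of the two cases (cluster names here are placeholders, and most flags from the original repro command are omitted for brevity):

# Without --pod-subnet-id: nodes have the 10-azure.conflist file.
az aks create --resource-group $resourcegroup --name aks-nodesubnet-test \
  --network-plugin azure --os-sku AzureLinux --node-count 2 \
  --vnet-subnet-id <system_node_subnet_id>

# With --pod-subnet-id: nodes have the 15-azure-swift.conflist file instead.
az aks create --resource-group $resourcegroup --name aks-podsubnet-test \
  --network-plugin azure --os-sku AzureLinux --node-count 2 \
  --vnet-subnet-id <system_node_subnet_id> \
  --pod-subnet-id <system_pod_subnet_id>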

rbtr commented 3 months ago

With everything else equal, this flag controls whether the cluster is provisioned as legacy Node Subnet (without) or Dynamic Pod Subnet (with). There will be other effects besides the different conflist path; for example, Node Subnet mode permanently reserves MaxPods (default: 30) IPs out of the subnet per Node.
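To put a rough number on that reservation, using the --max-pods 50 and --node-count 2 values from the original repro command (the /24 subnet size below is just an example):

# Node Subnet mode pre-reserves max-pods IPs per node out of the node subnet.
MAX_PODS=50        # --max-pods from the repro command
NODE_COUNT=2       # --node-count from the repro command
USABLE_IPS=251     # example /24 subnet: 256 addresses minus the 5 Azure reserves

RESERVED_FOR_PODS=$((MAX_PODS * NODE_COUNT))                  # 100 IPs held for pods
REMAINING=$((USABLE_IPS - RESERVED_FOR_PODS - NODE_COUNT))    # 149 left after node IPs
echo "reserved for pods: ${RESERVED_FOR_PODS}, remaining in subnet: ${REMAINING}"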

microsoft-github-policy-service[bot] commented 2 months ago

Action required from @aritraghosh, @julia-yin, @AllenWen-at-Azure

microsoft-github-policy-service[bot] commented 2 months ago

Issue needing attention of @Azure/aks-leads

microsoft-github-policy-service[bot] commented 1 month ago

This issue will now be closed because it hasn't had any activity for 7 days after going stale. @fozturner, feel free to comment again within the next 7 days to reopen, or open a new issue after that time if you still have a question/issue or suggestion.