
# Welcome to the Azure Kubernetes Service enabled by Azure Arc (AKS Arc) repo

This is where the AKS Arc team will track features and issues with AKS Arc. We will monitor this repo in order to engage with our community and discuss questions, customer scenarios, or feature requests. Check out our projects tab to see the roadmap for AKS Arc!

Set AntiAffinityClassNames on AKS-HCI VMs #281

Open nmdange2 opened 1 year ago

nmdange2 commented 1 year ago

Title: Set AntiAffinityClassNames on AKS-HCI VMs

Description: This is a feature request to make use of Hyper-V's built-in anti-affinity rules to ensure AKS VMs do not run on the same host. I have a 4-node AKS-HCI cluster, and I noticed on several occasions that multiple VMs within the same node pool ended up on the same physical host. This is not ideal for HA. If all the worker nodes in a pool are running on the same physical host, and that physical host goes down, then all workloads in that pool also go down. Pods with multiple replicas should run on separate physical hosts where possible.

Each control plane and worker node pool should have a unique anti-affinity class name assigned so that the Failover Cluster will ensure the VMs run on different physical hosts, while VMs in different node pools can still share a host. The class name could be computed from the AKS cluster name plus the worker node pool name (or "controlplane" for the control plane VMs).
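A minimal sketch of the proposed naming scheme (the function name and the `<cluster>-<pool>` format are illustrative assumptions, not an existing AKS Arc API):

```python
def anti_affinity_class_name(cluster_name: str, node_pool: str = "controlplane") -> str:
    """Compute a unique anti-affinity class name per node pool.

    VMs that share a class name are kept on different physical hosts by the
    Failover Cluster where possible; VMs with different class names (i.e. in
    different node pools) may still share a host. Control plane VMs use the
    reserved "controlplane" pool name.
    """
    return f"{cluster_name}-{node_pool}"

# Two pools in the same cluster get distinct class names,
# so only VMs within the same pool repel each other.
print(anti_affinity_class_name("aks-cluster1", "nodepool1"))  # aks-cluster1-nodepool1
print(anti_affinity_class_name("aks-cluster1"))               # aks-cluster1-controlplane
```

On the Windows Server side, the computed value would then be assigned to each clustered VM's `AntiAffinityClassNames` property (e.g. via PowerShell, `(Get-ClusterGroup $vmName).AntiAffinityClassNames = $className`), per the linked Failover Clustering documentation.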

Description of Anti-affinity feature: https://learn.microsoft.com/en-us/windows-server/failover-clustering/cluster-affinity

Elektronenvolt commented 1 year ago

@baziwane - this is what we discussed a while ago for control plane and load balancer nodes (Kubernetes cluster-critical nodes). @nmdange2 - Doing this for node pools as well is a good idea in our case too. Our physical Hyper-V nodes can host a lot of VMs, so if we create a small "special purpose" (labeled) node pool, all of its VMs may end up on the same physical node. If that physical node fails, the deployed application can't be rescheduled, because no other nodes in the cluster carry that label.

baziwane commented 1 year ago

That's correct - issue #76. We are still tracking this on the roadmap.