Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

Activate Azure Hybrid Benefit for AKS #3913

Closed sazadpour-microsoft closed 8 months ago

sazadpour-microsoft commented 1 year ago

When Azure Hybrid Benefit for AKS is activated with the following command, AKS starts reimaging the nodes. The cluster then takes a long time to pull the container images because they are very large. Is it necessary to reimage the nodes when Azure Hybrid Benefit only changes the licensing? Is there a way to use Azure Hybrid Benefit without reimaging the nodes? Even if the command is run directly on the VMSS, AKS requires it to be run again! Could you please check whether this is a bug? Thank you in advance for your support.

Command:

az aks update \
    --resource-group myResourceGroup \
    --name myAKSCluster \
    --enable-ahub

Articles:

https://azure.microsoft.com/en-au/updates/generally-available-azure-hybrid-benefit-for-aks-and-azure-stack-hci/

https://learn.microsoft.com/en-us/azure/aks/windows-faq?tabs=azure-cli#can-i-use-azure-hybrid-benefit-with-windows-nodes

https://learn.microsoft.com/en-us/azure/aks/hybrid/azure-hybrid-benefit?tabs=powershell
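
Since the reimage affects the Windows nodes, it may help to check which node pools are Windows pools before running the command above. The following is only a sketch: list_windows_pools is a hypothetical helper name, and the resource group and cluster names passed to it are placeholders.

```shell
# Hypothetical helper; arguments are <resource-group> <cluster-name>.
# Lists the names of the Windows node pools in the cluster, i.e. the pools
# this thread reports being reimaged when --enable-ahub is applied.
list_windows_pools() {
  az aks nodepool list \
    --resource-group "$1" \
    --cluster-name "$2" \
    --query "[?osType=='Windows'].name" \
    --output tsv
}
```

For example, list_windows_pools myResourceGroup myAKSCluster prints one pool name per line, which gives a rough idea of how many nodes will need to pull their images again.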

Jamie0 commented 10 months ago

I note that the documentation says "Changing the license type on the VM does not cause the system to reboot or cause a service interruption. It is simply an update to a metadata flag", so this may be a bug?

PixelRobots commented 10 months ago

This blog post came out today and states that enabling it will recreate the nodes.

https://techcommunity.microsoft.com/t5/itops-talk-blog/reducing-costs-for-windows-workloads-on-azure-kubernetes-service/ba-p/4025117

@vrapolinario, who wrote the blog post, may be able to clear this up.

vrapolinario commented 10 months ago

Let me look into this and see if I can clarify.

Jamie0 commented 10 months ago

Thank you @vrapolinario and @PixelRobots!

I'd be grateful if you could also seek clarification on the behaviour if you enable AHB on specific nodes, rather than cluster-wide, as described in the documentation. In our case, the Windows nodes were not re-imaged right away, but at the next cluster scale operation (which happened to be on a different node pool!).

As the documentation said enabling AHB was simply a metadata flag update, we didn't expect any reboots or re-images, so we had an unexpected outage when our cluster scaled a few hours later.

We do have a very slow-moving support ticket relating to this (2401050050000857).

vrapolinario commented 10 months ago

What is the doc that you are following to change the config on a per node level?

Jamie0 commented 10 months ago

This one: https://learn.microsoft.com/en-us/azure/aks/windows-faq?tabs=azure-cli#can-i-use-azure-hybrid-benefit-with-windows-nodes

"For individual nodes, you need to browse to the node resource group and apply the Azure Hybrid Benefit to the nodes directly. For more information on applying Azure Hybrid Benefit to individual nodes, see Azure Hybrid Benefit for Windows Server."

AbelHu commented 8 months ago

AKS nodes are managed by AKS. When a user sends an update request, AKS needs to enable AHUB on all existing AKS Windows nodes. AKS updates the Windows VMSS model and then updates the VMSS instances to the latest model to keep the state consistent. It follows the standard AKS workflow and applies the same update logic to existing agent pools and agent nodes, so it triggers a full node upgrade operation and should not break any customer workloads. It should also be one-time work for existing clusters. Please feel free to file a support ticket if enabling AHUB in an AKS cluster causes unexpected downtime.

AbelHu commented 8 months ago

licenseType is part of the VM profile. AKS usually uses manual upgrade mode, so, just like for other properties, AKS needs to manually upgrade the instances after licenseType changes in the VMSS model so that the change is applied to the VM instances.
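
To see the difference between the VMSS model and the VM instances described above, the two can be inspected read-only with the Azure CLI. This is a rough sketch, not an official procedure: vmss_model_license and vmss_stale_instances are hypothetical helper names, and the node resource group and scale set names passed to them are placeholders (the node resource group can be looked up with az aks show --query nodeResourceGroup).

```shell
# Hypothetical helpers; arguments are <node-resource-group> <vmss-name>.

# Prints the licenseType recorded in the scale set model (VM profile).
vmss_model_license() {
  az vmss show \
    --resource-group "$1" \
    --name "$2" \
    --query "virtualMachineProfile.licenseType" \
    --output tsv
}

# Prints the IDs of instances that have not yet been brought up to the
# latest model, i.e. instances where a model change is not applied yet.
vmss_stale_instances() {
  az vmss list-instances \
    --resource-group "$1" \
    --name "$2" \
    --query '[?latestModelApplied==`false`].instanceId' \
    --output tsv
}
```

If vmss_model_license already shows the new license type while vmss_stale_instances still lists instance IDs, the model has changed but those instances have not been upgraded to it yet, which would match the behaviour described in this thread.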