Open robindv opened 5 months ago
In my institutional memory we've never actually waited on nodes to be ready, because readiness can flap. Instead we wait for CSE to return, which only guarantees node registration. But we may have made this more visible/variable by having nodes not mark themselves as Ready until the CNI is actually ready to process traffic.
Checking what k watch nodes looks like with kubenet vs overlay @tyler-lloyd for fun.
Here's a pretty boring vanilla cluster. Are you using anything interesting on your nodes or your network setup?
-> % k get nodes -w | /usr/bin/ts
Sep 10 06:11:31 NAME STATUS ROLES AGE VERSION
Sep 10 06:11:31 aks-nodepool1-26445000-vmss000000 Ready
Action required from @aritraghosh, @julia-yin, @AllenWen-at-Azure
Had another customer bring this up with regard to the taint "node.cloudprovider.kubernetes.io/uninitialized" still being present on replacement/surge nodes while the old nodes were being drained during an upgrade.
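(For reference, one way to see which taints are sitting on the surge/replacement nodes while this happens is something like:

k get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'

which will keep showing "node.cloudprovider.kubernetes.io/uninitialized" until cloud-provider initialization of the node finishes.)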
AKS will never be perfect in this regard, as readiness and taints are dynamic and can change at any time. Your best line of defense for critical applications is to define pod disruption budgets, as those will block drains/evictions regardless of the reason the new pod can't come up, whether that's due to the node, the pod itself, or otherwise.
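As an example, a minimal PDB for a hypothetical workload labelled app=my-critical-app (the names here are placeholders) that keeps at least one replica running, and therefore blocks an eviction until a replacement pod is up elsewhere, would look something like:

kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-critical-app-pdb
spec:
  minAvailable: 1            # never let evictions drop below 1 running pod
  selector:
    matchLabels:
      app: my-critical-app   # placeholder label for the workload to protect
EOF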
AKS could be better here though. We could start with a whitelisted set of node conditions and taints that we know are likely to occur at startup, even if only briefly, and wait some amount of time T for them all to clear once (ignoring whether they come back).
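A very rough sketch of that idea from the outside (the taint allow-list, timeout, and node name below are just placeholders, and this is not how AKS actually orchestrates upgrades):

NODE=aks-nodepool1-26445000-vmss000000   # placeholder node name
ALLOWLIST="node.cloudprovider.kubernetes.io/uninitialized node.kubernetes.io/not-ready"
T=300                                    # wait at most T seconds
end=$(( $(date +%s) + T ))
while [ "$(date +%s)" -lt "$end" ]; do
  # list the taint keys currently on the node
  taints=$(kubectl get node "$NODE" -o jsonpath='{range .spec.taints[*]}{.key}{"\n"}{end}')
  pending=false
  for t in $ALLOWLIST; do
    printf '%s\n' "$taints" | grep -qx "$t" && pending=true
  done
  if [ "$pending" = false ]; then
    echo "startup taints cleared once on $NODE"   # ignore them if they come back later
    exit 0
  fi
  sleep 5
done
echo "timed out waiting for startup taints to clear on $NODE"
exit 1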
Using all conditions/taints is hard, as some may flap and some may be applied by the customer.
Generally we haven't invested in this because we don't see it that often (though we're looking into the data there), it's not trivial to orchestrate, and you probably want PDBs anyway.
Please upvote if you'd like to see improvements in this area.
Thanks for the suggestion, I'll have a closer look at pod disruption budgets to arm myself against this behaviour :-)
Describe the bug
I recently upgraded three AKS clusters from 1.29.2 to 1.29.4 and I noticed the temporary node that is added to the pool before the upgrade only becomes Ready after the first node has been removed from the cluster.
In the past, the upgrade only began once the extra temporary node was Ready without any taints. Because of this changed behavior, there is a moment (~one minute) in which, on a single-node nodepool, no node is available.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The temporary node is added to the nodepool and is Ready without any taints before the upgrade of the other nodes starts.
Environment (please complete the following information):