Which component are you using?:
cluster-autoscaler
What version of the component are you using?:
Component version: amazonaws.com/cluster-autoscaler:v1.28.0
What k8s version are you using (kubectl version)?:
kubectl version output:
$ kubectl version
Server Version: v1.28.9-eks-036c24b
What environment is this in?:
AWS EKS
What did you expect to happen?:
Expected the cluster-autoscaler to recognize it could scale down a node and move a coredns pod to another node while still honoring maxSkew=1.
The current configuration has a coredns Deployment with 5 replicas and a topologySpreadConstraint where topologyKey = "kubernetes.io/hostname", labelSelector = {"matchLabels":{"k8s-app":"kube-dns"}}, whenUnsatisfiable = "DoNotSchedule", and maxSkew = 1.
If we currently have 5 nodes and 5 coredns pods like:
1 1 1 1 1
Then after some time, the cluster-autoscaler determines we no longer need 5 nodes for the workload and can scale down to 4 to save money. What I want to happen is something like:
1 1 1 1 1 -----> 2 1 1 1
This should still be a valid configuration for maxSkew=1.
What happened instead?:
During the cluster-autoscaler scale-down simulation (in the exact scenario described above), the logs show the failure below. The autoscaler is unable to scale down the node even though maxSkew=1 would still be honored after the node is deleted. My guess is that the cluster-autoscaler includes the node it wants to remove when calculating skew, so skew = 2 - 0 (the global minimum, since the node to be deleted will have no coredns pods) = 2 > 1 = maxSkew. It therefore claims it can't put the pod on any node, effectively treating the topologySpreadConstraint like a podAntiAffinity rule.
19:56:09.838749 1 cluster.go:155] ip-10-177-149-54.ec2.internal for removal
19:56:09.839185 1 klogx.go:87] failed to find place for kube-system/coredns-568: cannot put pod coredns-568 on any node
19:56:09.839209 1 cluster.go:175] node ip-10-177-149-54.ec2.internal is not suitable for removal: can reschedule only 0 out of 1 pods
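Spelling out the suspected skew arithmetic: with the candidate node still counted after its pod is moved, the simulated distribution is 2 1 1 1 0, so skew = 2 - 0 = 2 > 1 = maxSkew and the move is rejected. With the candidate node excluded, the distribution is 2 1 1 1, so skew = 2 - 1 = 1, which satisfies the constraint.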
When I increase maxSkew to 2, the cluster-autoscaler is able to scale down the unneeded nodes and honors maxSkew=2. The issue only seems to occur when maxSkew = 1. In other circumstances, such as having no topologySpreadConstraints at all, the cluster-autoscaler is also able to move coredns pods to other nodes and scale down.
How to reproduce it (as minimally and precisely as possible):
1. Scale the nodes up above the normal amount.
2. Run a Deployment with replicas >= nodeCount and topologySpreadConstraints similar to the config below, making sure maxSkew=1.
3. Remove whatever was used to scale up the nodes (if anything), then watch the cluster-autoscaler try and fail to scale down the unneeded nodes (one possible command sequence is sketched below).
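For illustration, one way to drive the scale-up and scale-down cycle (a hypothetical sketch; the placeholder name, image, resource size, and the autoscaler deployment name are assumptions, not from this report):
    # Force extra nodes by scheduling a resource-hungry placeholder Deployment
    kubectl create deployment scale-up-test --image=registry.k8s.io/pause:3.9 --replicas=20
    kubectl set resources deployment/scale-up-test --requests=cpu=1
    # Once the new nodes have joined, remove the placeholder
    kubectl delete deployment scale-up-test
    # Watch the cluster-autoscaler evaluate (and fail) the scale-down
    kubectl -n kube-system logs deployment/cluster-autoscaler -f | grep -i removal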
Anything else we need to know?:
Deployment Config (labelSelector specific for coredns)
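A minimal sketch of the spread constraint, reconstructed from the values described above (the full Deployment manifest is omitted here):
    topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          k8s-app: kube-dns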