Describe the bug
AKS CA scale-down is not working properly with Pod Topology Spread Constraints enabled when all available nodes are already in use by a given deployment.
Assuming there is a node eligible for scale-down (due to low resource allocation), it shouldn't be taken into consideration in the Pod Topology Spread Constraints calculations when checking whether its pods can be rescheduled on other nodes (by the CA 'simulating' kube-scheduler).
Currently, the CA does not scale down, logging the following message: 'nodeA is not suitable for removal: can reschedule only 0 out of 1 pods'
To Reproduce
Steps to reproduce the behavior:
Let's assume there are 3 nodes:
- nodeA
- nodeB
- nodeC
Let's assume there is a deployment with 3 pods scheduled as below.
| Pod | Node |
| --- | --- |
| xyz-1 | nodeA |
| xyz-2 | nodeB |
| xyz-3 | nodeC |
The above deployment has Pod Topology Spread Constraints enabled.
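For reference, a minimal manifest sketch of such a deployment (the name, labels, and image are illustrative placeholders; the `topologySpreadConstraints` fields follow the standard Kubernetes API):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xyz
spec:
  replicas: 3
  selector:
    matchLabels:
      app: xyz
  template:
    metadata:
      labels:
        app: xyz
    spec:
      topologySpreadConstraints:
        # Spread pods across nodes; allow at most a difference of 1
        # pod between any two nodes.
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: xyz
      containers:
        - name: xyz
          image: nginx  # placeholder image
```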
nodeC is a candidate for scaling down due to low resources allocated (below the CA threshold). However, scaling down never happens.
If I'm not mistaken, the autoscaler checks whether it is possible to reschedule pods on other nodes before a scale-down event. This means pod xyz-3 would need to be schedulable on nodeA or nodeB. However, this seems to violate the Pod Topology Spread Constraints maxSkew, because nodeC is not skipped in the calculations. The distribution of pods after scaling down could look like this:
| Node | Number of pods |
| --- | --- |
| nodeA | 1 |
| nodeB | 2 |
| nodeC | 0 |
The skew (=2) for the above spread is bigger than maxSkew (=1), so the node cannot be scaled down.
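A minimal sketch of the skew arithmetic at play (this is an illustration of the maxSkew rule, not actual cluster-autoscaler code): skew is the difference between the maximum and minimum pod count across topology domains.

```python
def skew(pods_per_node: dict) -> int:
    """Skew = max pods in any topology domain - min pods in any domain."""
    counts = pods_per_node.values()
    return max(counts) - min(counts)

# Distribution after rescheduling xyz-3, with nodeC still counted:
print(skew({"nodeA": 1, "nodeB": 2, "nodeC": 0}))  # 2 > maxSkew(=1), removal rejected

# If nodeC (the node being removed) were excluded from the calculation:
print(skew({"nodeA": 1, "nodeB": 2}))  # 1, within maxSkew(=1)
```

With nodeC still counted as a domain holding 0 pods, the skew comes out as 2; excluding the node being removed brings it within the constraint.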
Autoscaler logs:
node nodeC is not suitable for removal: can reschedule only 0 out of 1 pods
Autoscaling (in this particular case) works fine with Pod Topology Spread Constraints disabled.
Expected behavior
nodeC is skipped in the CA's kube-scheduler-simulated calculations, allowing scale-down. In this case the distribution of pods after the scaling operation could look like below. The skew in this case is equal to 1.

| Node | Number of pods |
| --- | --- |
| nodeA | 1 |
| nodeB | 2 |
Environment: