Closed patrickshan closed 5 years ago
This is caused by the usage percentage calculation algorithm. Rather than counting daemonset and static pod requests and removing them from the final calculation, it currently uses `Total pods requests without daemonsets/static pods` and `Total nodes allocatable resource`. This makes the calculated usage percentage a bit smaller than the real usage percentage.
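To make the gap concrete, here is a minimal sketch of the two percentages. It is not escalator's actual code. The function name and all resource figures are hypothetical and only illustrate the calculation as described above.

```python
# Hypothetical CPU figures in millicores; none of these come from the issue.
NODE_ALLOCATABLE_M = 4000
DAEMONSET_REQUESTS_M = [1000, 500]   # daemonset/static pods on the node
WORKLOAD_REQUESTS_M = [1000, 1000]   # regular pods scheduled on the node


def usage_percent(requests_m, allocatable_m):
    """Return the summed requests as a percentage of allocatable."""
    return 100.0 * sum(requests_m) / allocatable_m


# As described above: daemonset/static pod requests are left out of the
# numerator, while the denominator is the full allocatable amount.
reported = usage_percent(WORKLOAD_REQUESTS_M, NODE_ALLOCATABLE_M)

# Usage when every pod on the node is counted.
actual = usage_percent(WORKLOAD_REQUESTS_M + DAEMONSET_REQUESTS_M, NODE_ALLOCATABLE_M)

print(f"reported: {reported:.1f}%")
print(f"actual:   {actual:.1f}%")
```

With these made-up numbers the node looks half empty to the calculation even though most of its allocatable CPU is already requested.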
One way to solve this issue without changing escalator is to tune your node group config parameters, especially `scale_up_threshold_percent`, `taint_upper_capacity_threshold_percent` and `taint_lower_capacity_threshold_percent`.
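As a rough illustration of such tuning, the sketch below lowers each threshold by the share of allocatable resources that daemonsets request. This is only one possible heuristic, not something escalator provides. It assumes the daemonset overhead is about the same on every node, and the helper name, target values and resource figures are all hypothetical.

```python
def adjusted_threshold(target_percent, daemonset_requests_m, allocatable_m):
    """Shift a threshold down by the daemonset share of allocatable (assumed heuristic)."""
    overhead_percent = 100.0 * daemonset_requests_m / allocatable_m
    # Never go below zero if the overhead exceeds the target.
    return max(target_percent - overhead_percent, 0.0)


# Hypothetical per-node figures in CPU millicores.
ALLOCATABLE_M = 4000
DAEMONSET_REQUESTS_M = 1500

# Hypothetical thresholds expressed in terms of real node usage.
targets = {
    "scale_up_threshold_percent": 70,
    "taint_upper_capacity_threshold_percent": 45,
    "taint_lower_capacity_threshold_percent": 40,
}

adjusted = {
    name: adjusted_threshold(value, DAEMONSET_REQUESTS_M, ALLOCATABLE_M)
    for name, value in targets.items()
}

for name, value in adjusted.items():
    print(f"{name}: {value}")
```

The resulting values would still need rounding and checking against what your escalator version accepts before they go into the node group config.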
Updated the documentation in this PR: https://github.com/atlassian/escalator/pull/156. Closing this issue for now, as it can be solved by tuning the escalator config.
Because of the way escalator calculates its resource usage percentage, escalator cannot scale up a node group if daemonset pods on its nodes request a significant proportion of node resources.
The node group has only one node, A, which has allocatable resources like this:
Pods scheduled on node A:
There are 3 replicas under the test deployment. The first two have been scheduled on node A, while the last one is in the `Pending` state:
And escalator doesn't trigger any scale-up in this case: