node cannot get to state IDLE_UNSCHEDULABLE for scale in to occur

wbuchwalter / Kubernetes-acs-engine-autoscaler

[Deprecated] Node-level autoscaler for Kubernetes clusters created with acs-engine.

node cannot get to state IDLE_UNSCHEDULABLE for scale in to occur #81

Closed (grawcho closed this issue 6 years ago)

grawcho commented 6 years ago

I changed the logic on my fork: https://github.com/grawcho/Kubernetes-acs-engine-autoscaler. There is a condition in scaler.py (line 102) that never gets to line 110 (or at least I haven't found a way to reach it), so the nodes stay in state UNDER_UTILIZED_DRAINABLE and do not scale down. After I tested my change (I left lines 110 - 113 there), scale-in works perfectly. Thanks again for the amazing work :)

wbuchwalter commented 6 years ago

I am not sure I understand what you mean. If a node is UNDER_UTILIZED_DRAINABLE, it will be cordoned and drained, and on the next pass it will be IDLE_UNSCHEDULABLE: idle because all the pods have been drained, and unschedulable because the node is cordoned.

I probably didn't understand the issue you are talking about though.
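A minimal sketch of the two-pass flow described above, for illustration only. The `Node` class and the `classify` and `cordon_and_drain` helpers are hypothetical and are not taken from scaler.py; the `IDLE_SCHEDULABLE` and `BUSY` labels are assumed here just to make the example complete.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node (not the repo's class)."""
    name: str
    pods: list = field(default_factory=list)
    cordoned: bool = False


def classify(node, under_utilized=True):
    """Return a state label for the node on one autoscaler pass."""
    if not node.pods:
        # No pods left: idle. Cordoned nodes are unschedulable.
        return "IDLE_UNSCHEDULABLE" if node.cordoned else "IDLE_SCHEDULABLE"
    return "UNDER_UTILIZED_DRAINABLE" if under_utilized else "BUSY"


def cordon_and_drain(node):
    """Simulate cordoning the node and evicting all of its pods."""
    node.cordoned = True
    node.pods.clear()


node = Node("agent-0", pods=["web-1"])

first_pass = classify(node)
if first_pass == "UNDER_UTILIZED_DRAINABLE":
    cordon_and_drain(node)

second_pass = classify(node)
print(first_pass, "->", second_pass)
```

In this sketch the node only reaches IDLE_UNSCHEDULABLE if the drain really leaves it with no counted pods, which is the step the rest of this thread is about.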

grawcho commented 6 years ago

My bad, closing. This was an oversight caused by the default idle threshold of 1800 seconds. I double-checked the logs and this works fine.
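To make the effect of that threshold concrete, here is a small sketch. Only the 1800-second default comes from this thread; the `idle_long_enough` helper and the `>=` comparison are assumptions, not the autoscaler's actual code.

```python
IDLE_THRESHOLD_SECONDS = 1800  # default idle threshold mentioned in this thread


def idle_long_enough(idle_since, now, threshold=IDLE_THRESHOLD_SECONDS):
    """Return True once a node has been idle for at least `threshold` seconds.

    `idle_since` and `now` are timestamps in seconds (hypothetical inputs).
    """
    return (now - idle_since) >= threshold


# A node that went idle 10 minutes ago is not yet eligible for removal.
print(idle_long_enough(idle_since=0, now=600))
# After the full 30 minutes it is.
print(idle_long_enough(idle_since=0, now=1800))
```

Under these assumptions, a node that looks stuck during the first 30 minutes after becoming idle is simply waiting out the threshold.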

grawcho commented 6 years ago

I think the issue might be that the underutilized nodes are always in busy_list (because of the kube_proxy-*), so even on the second pass a node goes back to UNDER_UTILIZED_DRAINABLE and never gets to IDLE_UNSCHEDULABLE for deletion. I'll run some test cases to validate. In any case, with my change this is working both ways (in and out). Please let me know if I can be of any help here. I'm trying to work on new features like mixed clusters (Windows / Linux) and other stuff.
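The suspected behaviour can be sketched as follows. This is an illustration under stated assumptions, not the code in scaler.py or the eventual fix: `counted_pods` and `node_state` are hypothetical helpers, the pod names are made up, and both `kube-proxy-` and the `kube_proxy-` spelling used above are treated as per-node proxy pod prefixes.

```python
# Assumed prefixes for per-node proxy pods that survive a drain.
IGNORED_PREFIXES = ("kube-proxy-", "kube_proxy-")


def counted_pods(pod_names, ignored_prefixes=IGNORED_PREFIXES):
    """Return only the pods that should keep a node in the busy list."""
    return [p for p in pod_names if not p.startswith(ignored_prefixes)]


def node_state(pod_names, cordoned, ignore_proxy_pods):
    """Classify a node, optionally ignoring proxy pods."""
    counted = counted_pods(pod_names) if ignore_proxy_pods else list(pod_names)
    if counted:
        return "UNDER_UTILIZED_DRAINABLE"
    return "IDLE_UNSCHEDULABLE" if cordoned else "IDLE_SCHEDULABLE"


# After a drain, only the proxy pod is left on the cordoned node.
after_drain = ["kube-proxy-x7k2p"]

stuck = node_state(after_drain, cordoned=True, ignore_proxy_pods=False)
fixed = node_state(after_drain, cordoned=True, ignore_proxy_pods=True)
print(stuck)
print(fixed)
```

If the proxy pod is counted, the drained node never looks idle; excluding it is one plausible way to let the node reach IDLE_UNSCHEDULABLE.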

wbuchwalter commented 6 years ago

FYI, this autoscaler, https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler, is going to take over pretty soon. Support for acs-engine is pretty new and still in alpha, but ultimately this is where you should contribute, I think. I plan on deprecating my autoscaler once cluster-autoscaler is deemed stable enough.

wbuchwalter commented 6 years ago

I now understand what you meant :) I fixed it here: https://github.com/wbuchwalter/Kubernetes-acs-engine-autoscaler/pull/82. You can try it with the image wbuchwalter/kubernetes-acs-engine-autoscaler:proxy. Thanks!

grawcho commented 6 years ago

Cool, thanks. I fixed it on my fork. I will test moving back to yours too.