pusher / k8s-spot-rescheduler

Tries to move K8s Pods from on-demand to spot instances
Apache License 2.0

k8s-spot-rescheduler doesn't seem to iterate through on-demand nodes and keeps trying to drain undrainable nodes #55

Open morganwalker opened 5 years ago

morganwalker commented 5 years ago

We're using kops 1.10.0 and k8s 1.10.11, with two separate instance groups (IGs): nodes (on-demand) and spots (spot). As mentioned in https://github.com/pusher/k8s-spot-rescheduler/issues/54, the rescheduler thinks it can move all pods, fails due to a PDB, and leaves the node underutilized and tainted. My issue is what happens next: it keeps trying to drain that same node over and over again. If it can't drain a node for whatever reason, e.g. PDBs or availability zone conflicts (https://github.com/pusher/k8s-spot-rescheduler/issues/53), it should move on to the next on-demand node and try that one.