wbuchwalter / Kubernetes-acs-engine-autoscaler

[Deprecated] Node-level autoscaler for Kubernetes clusters created with acs-engine.

Feature request: consider anti-affinity rules when calculating the required number of nodes #65

Open jtv8 opened 6 years ago

jtv8 commented 6 years ago

A common pattern in high-availability applications is to set anti-affinity rules so that no two pods of an application run on the same node; see https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#never-co-located-in-the-same-node

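Currently the autoscaler does not know about these rules, so it assumes that the pending pods fit on an existing node and does not scale up, even though the same pods repeatedly appear as pending in the logs. In the example below, both pods are reported as fitting on the single existing node: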

2017-11-16 17:08:46,490 - autoscaler.cluster - INFO - ++++ Scaling Up Begins ++++++
2017-11-16 17:08:46,490 - autoscaler.cluster - INFO - Nodes: 1
2017-11-16 17:08:46,490 - autoscaler.cluster - INFO - To schedule: 2
2017-11-16 17:08:46,491 - autoscaler.cluster - INFO - KubePod(cp, cp-cp-kafka-1) fits on k8s-k8sagent-24421710-0
2017-11-16 17:08:46,491 - autoscaler.cluster - INFO - KubePod(cp, cp-cp-zookeeper-1) fits on k8s-k8sagent-24421710-0
2017-11-16 17:08:46,491 - autoscaler.cluster - INFO - Pending pods: 0
2017-11-16 17:08:46,491 - autoscaler.cluster - INFO - ++++ Scaling Up Ends ++++++

In a future release, it would be helpful to consider one of the two options:

  1. Implement checking of the spec.affinity.podAntiAffinity definition when determining whether a pod can be scheduled, or
  2. Allow a reasonable timeout for all pending pods, then try to provision a new node for them and check whether that resolves their pending state.
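
For option 1, the check could look roughly like the sketch below. It is only an illustration under several assumptions: pods are plain dicts shaped like the Kubernetes API objects (not the autoscaler's own KubePod class), only `requiredDuringSchedulingIgnoredDuringExecution` terms with the `kubernetes.io/hostname` topology key are considered, and the `namespaces` field of each term is ignored. The function names are hypothetical and do not exist in the autoscaler today.

```python
HOSTNAME_KEY = "kubernetes.io/hostname"


def selector_matches(selector, labels):
    """Return True if a Kubernetes label selector matches the given labels."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In" and labels.get(key) not in values:
            return False
        if op == "NotIn" and labels.get(key) in values:
            return False
        if op == "Exists" and key not in labels:
            return False
        if op == "DoesNotExist" and key in labels:
            return False
    return True


def required_anti_affinity_terms(pod):
    """Return the pod's hard (required) anti-affinity terms, or an empty list."""
    affinity = pod.get("spec", {}).get("affinity") or {}
    anti = affinity.get("podAntiAffinity") or {}
    return anti.get("requiredDuringSchedulingIgnoredDuringExecution") or []


def _blocks(source, target):
    """True if one of source's hostname-level terms selects the target pod."""
    target_labels = target.get("metadata", {}).get("labels") or {}
    for term in required_anti_affinity_terms(source):
        if term.get("topologyKey") != HOSTNAME_KEY:
            continue  # other topology domains are out of scope for this sketch
        selector = term.get("labelSelector")
        if selector is None:
            continue  # a missing selector matches no pods
        if selector_matches(selector, target_labels):
            return True
    return False


def violates_anti_affinity(pod, pods_on_node):
    """True if placing `pod` on a node holding `pods_on_node` breaks a rule.

    The check runs in both directions, because a pod already on the node
    may carry a rule that excludes the incoming pod.
    """
    return any(_blocks(pod, other) or _blocks(other, pod) for other in pods_on_node)
```

In the scale-up simulation, `pods_on_node` would need to include the pods tentatively placed on the node earlier in the same pass as well as the pods already running there. Otherwise two pending replicas of the same application would still both be reported as fitting on one node.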

sebastianfromearth commented 6 years ago

We use spec.nodeSelector labels to accomplish the same ends, and the same behaviour occurs. Maybe option 1 could be extended to also check nodeSelector labels when deciding whether a pod can be scheduled?
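
A rough sketch of what such a nodeSelector check could look like, assuming the pod and node are plain dicts shaped like the Kubernetes API objects. The function name is hypothetical and is not part of the autoscaler.

```python
def node_selector_matches(pod, node):
    """Return True if every key/value in the pod's nodeSelector is a node label."""
    selector = pod.get("spec", {}).get("nodeSelector") or {}
    labels = node.get("metadata", {}).get("labels") or {}
    # A pod without a nodeSelector can run on any node as far as this check goes.
    return all(labels.get(key) == value for key, value in selector.items())
```

A pod would then only count as fitting on a node if this check passes in addition to the resource check.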

Edit: I just saw https://github.com/wbuchwalter/Kubernetes-acs-engine-autoscaler/issues/41. My bad, I think my comment relates more to that issue.