Idea: improve buffering via faking pods

ideahitme commented 7 years ago

In my mind, buffering doesn't always need to be enabled for the whole cluster. Furthermore, buffering is no guarantee of a smooth scale-up when the buffer is spread evenly across a large number of nodes, so that no single node has enough free resources to take a pod.
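A quick way to see the fragmentation problem, as a minimal sketch with made-up numbers (the `Node` class, the node sizes and the pod request are all hypothetical, not taken from the autoscaler code):

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: float     # unreserved CPU cores on this node
    free_memory: float  # unreserved memory in GiB on this node


def fits_anywhere(nodes, cpu, memory):
    """Return True if at least one single node can hold the whole pod."""
    return any(n.free_cpu >= cpu and n.free_memory >= memory for n in nodes)


# Hypothetical cluster: 10 nodes, each with 0.5 CPU and 1 GiB to spare.
nodes = [Node(f"node-{i}", free_cpu=0.5, free_memory=1.0) for i in range(10)]

# The cluster-wide buffer looks healthy when summed up ...
total_free_cpu = sum(n.free_cpu for n in nodes)

# ... but a pod requesting 2 CPU / 2 GiB still fits on no single node.
pod_fits = fits_anywhere(nodes, cpu=2.0, memory=2.0)
```

Here the buffer adds up to 5 CPU in total, yet `pod_fits` is `False` because the scheduler needs the whole request to fit on one node.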

An alternative would be to mark the pods (or the deployment) with an annotation indicating that the application requires fast scaling. In response to the annotation, cluster-autoscaler would maintain a set of ghost pods (placeholders) with the same resource requests, which can be killed at any moment so that a real pod can be scheduled in their place. Cluster-autoscaler detects an unschedulable pod -> checks whether it has ghost pods -> kills one of them. Once the real pod has been rescheduled, the ghost pod needs to be recreated, and if there is no room for it, that triggers an ASG scale-up.

This gives a fast response and a guarantee that the pod will find a place to be scheduled. It also gives control at the application level: not every application requires fast scale-up, and some can tolerate dropped requests. Plus, it lets us prepare for an expected load increase by simply changing the annotation, and automate scaling up/down by time of day or day of week by changing the number of ghost pods.

Making sure the right pod gets the ghost pod's slot can be achieved through taints on the node, which prevent other pods from taking its place.
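To make the flow concrete, here is a rough in-memory sketch of the proposed loop. Everything in it is an assumption for illustration: `Pod`, `Node`, `schedule_with_ghosts` and `restore_ghosts` are hypothetical names, only CPU is tracked, the `app` match stands in for the taint-based slot reservation, and appending a node stands in for an ASG scale-up. It does not talk to the Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    app: str
    cpu: float
    ghost: bool = False  # True for placeholder pods that may be killed anytime


@dataclass
class Node:
    name: str
    capacity_cpu: float
    pods: list = field(default_factory=list)

    def free_cpu(self):
        return self.capacity_cpu - sum(p.cpu for p in self.pods)


def place(nodes, pod):
    """Put the pod on the first node with enough free CPU; return it or None."""
    for node in nodes:
        if node.free_cpu() >= pod.cpu:
            node.pods.append(pod)
            return node
    return None


def schedule_with_ghosts(nodes, pod):
    """Schedule a real pod, killing one ghost pod of the same app if needed."""
    if place(nodes, pod) is not None:
        return True
    for node in nodes:
        for candidate in node.pods:
            same_app_ghost = candidate.ghost and candidate.app == pod.app
            if same_app_ghost and node.free_cpu() + candidate.cpu >= pod.cpu:
                node.pods.remove(candidate)  # kill the placeholder
                node.pods.append(pod)        # real pod takes its slot
                return True
    return False  # no ghost slot available either


def restore_ghosts(nodes, app, cpu, desired, new_node_cpu):
    """Recreate missing ghost pods; add a node when a ghost fits nowhere.

    Returns the number of nodes added (stand-in for ASG scale-ups).
    """
    if cpu > new_node_cpu:
        raise ValueError("ghost pod would not fit on a new node")
    added = 0
    current = sum(1 for n in nodes for p in n.pods if p.ghost and p.app == app)
    for i in range(current, desired):
        ghost = Pod(f"ghost-{app}-{i}", app, cpu, ghost=True)
        if place(nodes, ghost) is None:
            nodes.append(Node(f"node-{len(nodes)}", new_node_cpu))
            place(nodes, ghost)
            added += 1
    return added


# One 4-CPU node running a 2-CPU "web" pod, with one ghost pod requested.
nodes = [Node("node-0", capacity_cpu=4.0)]
nodes[0].pods.append(Pod("web-1", "web", 2.0))
restore_ghosts(nodes, "web", cpu=2.0, desired=1, new_node_cpu=4.0)

# A second real pod arrives: node-0 is full, so the ghost is killed for it.
placed = schedule_with_ghosts(nodes, Pod("web-2", "web", 2.0))

# The ghost has to come back; nothing fits, so a new node is added.
added = restore_ghosts(nodes, "web", cpu=2.0, desired=1, new_node_cpu=4.0)
```

In this run the real pod is placed immediately by taking the ghost's slot, and the slow part (bringing up a new node) only affects the replacement ghost pod.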

Thoughts?

hjacobs commented 7 years ago

I like the idea of "ghost pods" to ensure quick autoscaling (or deployment) of certain apps. You are right, the current percentage-based buffer does not ensure that a slot for a critical app is actually available.

ideahitme commented 7 years ago

Yeah, and of course ConfigMaps would be a good way to specify the number of "ghost pods", instead of annotations :)

hjacobs / kube-aws-autoscaler

Idea: improve buffering via faking pods #14