akloss-cibo opened 3 months ago
/area vertical-pod-autoscaler
The VPA cannot scale a Deployment with certain PodDisruptionBudget configurations. A trivial example is a Deployment with a single replica and a PodDisruptionBudget that doesn't allow any disruption. The vpa-updater tries to evict the single Pod for the Deployment, but fails to do so because the eviction would violate the PodDisruptionBudget.
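For illustration, the trivial case above could be reproduced with something like the following (the names `the-app` and `the-pdb` are placeholders, not from the original report):

```shell
# Sketch of the trivial misconfiguration: the matching Deployment runs with
# replicas: 1, and this PDB requires that one replica to always be available,
# so the PDB permits zero disruptions and every eviction attempt fails.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: the-pdb
spec:
  minAvailable: 1        # with replicas: 1, this allows no disruptions
  selector:
    matchLabels:
      app: the-app
EOF
```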
@akloss-cibo, this is a misconfigured PDB.
I assume that you might have hit the following Kubernetes bug: https://github.com/kubernetes/kubernetes/issues/72320. For a long period of time there was a bug in Kubernetes that did not allow eviction of unhealthy Pods; see https://kubernetes.io/docs/tasks/run-application/configure-pdb/#unhealthy-pod-eviction-policy. So you might be hitting this issue. A fix is to set .spec.unhealthyPodEvictionPolicy=AlwaysAllow, if that is appropriate for your workload. See https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/3017-pod-healthy-policy-for-pdb/README.md to properly understand what the issue was and how it is fixed by .spec.unhealthyPodEvictionPolicy=AlwaysAllow.
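If that bug is indeed the cause, the policy could be set with a one-liner along these lines (the PDB name `the_pdb` is a placeholder):

```shell
# Allow eviction of unhealthy Pods even when the PDB budget is exhausted.
# "the_pdb" is a placeholder for your PodDisruptionBudget's name.
kubectl patch pdb the_pdb --type merge \
  -p '{"spec":{"unhealthyPodEvictionPolicy":"AlwaysAllow"}}'
```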
The PDB motivation itself is not enough for such a feature request. However, the feature request seems interesting, and I assume some of the VPA maintainers can shed more light; I believe it was a design decision to use the eviction + mutation approach over modifying the higher-level controller (Deployment, StatefulSet, ...) resources.
No, not hitting https://github.com/kubernetes/kubernetes/issues/72320
https://github.com/prometheus-operator/prometheus-operator/issues/6291 seems to imply that the VPA does understand the controller relationship.
Another challenge we have with the VPA, even when it is working as designed, is that you lose capacity right before the scale-up happens. The worst case: you have two replicas, and the VPA decides to scale up and evicts one; now you are at 50% of your pre-scale-up capacity until the new, larger Pod shows up.
We've been experimenting with an implementation of this feature, with some success:

```shell
last_recommendation="none"
while true; do
  recommendation="$( kubectl get vpa the_vpa -o go-template \
    --template='{{index (index .status.recommendation.containerRecommendations 0) "target"}}' )"
  if test "$recommendation" != "$last_recommendation" ; then
    printf "Changing recommendation from %s to %s\n" "$last_recommendation" "$recommendation"
    kubectl rollout restart deployment the_deployment
    kubectl rollout status deployment the_deployment
    last_recommendation="$recommendation"
  fi
  sleep 60
done
```
Even this very naive implementation seems to work pretty well. It is naive because of the polling, checking only one container's recommendation, and assuming a Deployment rather than the other possible controllers, but it illustrates the idea.
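One of those naivetés is easy to address: instead of looking only at the first container, the loop could watch every container's recommendation. A minimal sketch (the names `the_vpa` and `the_deployment` are placeholders, as above; the `RUN_LOOP` guard is only there so the helper can be exercised without a cluster):

```shell
# Sketch of a less naive variant: watch ALL container recommendations,
# not just the first one. Still polls, and still assumes a Deployment.

# Render every container's name and target as one canonical string; a change
# in any container's recommendation changes the string.
get_recommendations() {
  kubectl get vpa the_vpa -o go-template --template \
    '{{range .status.recommendation.containerRecommendations}}{{.containerName}}={{.target}};{{end}}'
}

# Guarded so get_recommendations can be sourced without starting the loop.
if [ "${RUN_LOOP:-0}" = "1" ]; then
  last=""
  while true; do
    current="$(get_recommendations)"
    if [ -n "$current" ] && [ "$current" != "$last" ]; then
      printf 'Recommendations changed from %s to %s\n' "$last" "$current"
      kubectl rollout restart deployment the_deployment
      kubectl rollout status deployment the_deployment
      last="$current"
    fi
    sleep 60
  done
fi
```

In practice you would run the loop directly; the guard exists purely so the recommendation-gathering helper can be tested in isolation.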
Which component are you using?:
vertical-pod-autoscaler
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
The VPA cannot scale a Deployment with certain PodDisruptionBudget configurations. A trivial example is a Deployment with a single replica and a PodDisruptionBudget that doesn't allow any disruption. The vpa-updater tries to evict the single Pod for the Deployment, but fails to do so because the eviction would violate the PodDisruptionBudget. However, a rollout restart for the Deployment is possible, and would allow a resource configured like this to scale automatically.
Describe the solution you'd like.:
Use rollout restart for deployments when a PDB is preventing an eviction.
Describe any alternative solutions you've considered.:
Additional context.: