kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0

VPA: When targetRef is a Rollout, VerticalPodAutoscalerCheckpoint history is reset during deployment #6730

Open kodmaskinen opened 2 months ago

kodmaskinen commented 2 months ago

Which component are you using?: vertical-pod-autoscaler

What version of the component are you using?: Component version: 1.0.0

What k8s version are you using (kubectl version)?: 1.29.1

kubectl version Output
$ kubectl version
Client Version: v1.29.4
Server Version: v1.29.1-eks-508b6b3

What environment is this in?: EKS

What did you expect to happen?: I expect the VPA to retain the history from earlier versions of the same Rollout.

What happened instead?: VPA deletes the history from the VerticalPodAutoscalerCheckpoint during deployment of a new version using Argo Rollouts. This often means that the memory target is initially set too low, which causes unnecessary OOM situations.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?: This may be related to the issue mentioned in #5598.

voelzmo commented 2 months ago

Yeah, so deleting the VPACheckpoints is 100% related to what I described in #5598:

Hope that explains it a bit.

In general, the way Rollouts are designed seems pretty incompatible with how VPA currently works. I guess that's also one of the reasons why e.g. knative doesn't have VPA support: it is pretty hard to integrate with a process that rolls out new versions by first creating the Pods and only later switching and updating the selector.

voelzmo commented 2 months ago

/remove-kind bug
/kind support

kodmaskinen commented 2 months ago

Thanks for the explanation!

It seems to me like it would work if VPA treated a Rollout more like a Deployment and used the .spec.Selector instead of the .status.Selector. It would, however, need to handle the case where a Rollout references a Deployment in .spec.workloadRef, and in that case get the selector from the .spec.Selector of the Deployment.
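To make the proposal concrete, here is a minimal sketch of the selector lookup I have in mind. It is not VPA code: it works on plain dicts shaped like the manifests, and `resolve_selector` and `get_deployment` are hypothetical names I made up for illustration. It also assumes the referenced Deployment lives in the same namespace as the Rollout.

```python
from typing import Callable, Optional

# Hypothetical lookup: (namespace, name) -> Deployment manifest dict, or None.
DeploymentGetter = Callable[[str, str], Optional[dict]]


def resolve_selector(rollout: dict, get_deployment: DeploymentGetter) -> Optional[dict]:
    """Return the label selector to use for a Rollout target.

    Reads the selector from the spec rather than from the status. If the
    Rollout points at a Deployment via .spec.workloadRef, the Deployment's
    own spec selector is used instead.
    """
    spec = rollout.get("spec", {})
    ref = spec.get("workloadRef")
    if ref and ref.get("kind") == "Deployment":
        # Assumption: the Deployment is in the Rollout's namespace.
        namespace = rollout.get("metadata", {}).get("namespace", "default")
        deployment = get_deployment(namespace, ref["name"])
        if deployment is None:
            return None  # referenced Deployment not found
        return deployment.get("spec", {}).get("selector")
    # Plain Rollout: behave as for a Deployment and use the spec selector.
    return spec.get("selector")


# Example manifests (illustrative values only).
plain_rollout = {
    "metadata": {"namespace": "demo"},
    "spec": {"selector": {"matchLabels": {"app": "web"}}},
}
ref_rollout = {
    "metadata": {"namespace": "demo"},
    "spec": {"workloadRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"}},
}
deployments = {
    ("demo", "web"): {"spec": {"selector": {"matchLabels": {"app": "web-deploy"}}}},
}


def lookup(namespace: str, name: str) -> Optional[dict]:
    return deployments.get((namespace, name))


print(resolve_selector(plain_rollout, lookup))
print(resolve_selector(ref_rollout, lookup))
```

Since the spec selector does not change while a new version is rolled out, a lookup along these lines should keep matching the same workload across versions, which is the behaviour I'm hoping for.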

adrianmoisey commented 16 hours ago

/area vertical-pod-autoscaler