Open josh-ferrell opened 3 months ago
This is a duplicate of https://github.com/kyverno/kyverno/issues/4070
Duplicating PR comment. @chipzoller I read through https://github.com/kyverno/kyverno/issues/4070 and it looks like much of the feedback was prior to splitting the controllers. This HPA is specific to the admission-controller pods and the load coming from the webhook calls. I think it was agreed based on https://github.com/kyverno/kyverno/issues/4070#issuecomment-1171656739 that scaling of the admission-controller is beneficial.
Closed https://github.com/kyverno/kyverno/issues/4070 - we can use this issue for HPA support for the admission controller.
Any plans to close this one? Seems like an easy addition and there is a PR already.
Will be happy if we can have this soon, thanks.
Problem Statement
CRUD operations on Kyverno objects can be impeded if load on the admission-controller pods causes a webhook response to exceed the configured timeout of 10 seconds. The maximum configurable timeout of a ValidatingWebhook is 30 seconds.
Solution Description
Add an optional HorizontalPodAutoscaler for the admission controller to the Helm chart. It would scale the admission-controller based on CPU utilization to keep response times under the webhook timeout.
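For illustration, a minimal sketch of what such an HPA could render to, using the standard `autoscaling/v2` API. The resource name, namespace, target Deployment name, replica bounds, and the 80% CPU target are all assumptions made for this example, not values taken from the chart or the linked PR.

```yaml
# Sketch only: names, namespace, and thresholds below are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kyverno-admission-controller    # assumed HPA name
  namespace: kyverno                    # assumed namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kyverno-admission-controller  # assumed Deployment name
  minReplicas: 3                        # assumed lower bound
  maxReplicas: 10                       # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80        # assumed target, percent of requested CPU
```

CPU utilization targets are computed against the containers' CPU requests, so this only works if the admission-controller pods set `resources.requests.cpu` and a metrics source such as metrics-server is available in the cluster.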
Alternatives
End users manually scale based on alerting or when they anticipate heavy CRUD operations for kyverno objects.
Additional Context
Load-induced failures of webhook calls are most likely rare outside of CI/CD pipelines, where Kyverno objects may be created or updated at scale.
Slack discussion
No response
Research