Open janslow opened 8 months ago
@michaelmdresser any thoughts here?
This is well-reasoned; I wanted to add quantile algorithm support to Continuous Request Right-Sizing from the start but did not have time. The primary proposed solution is the one I endorse, though I would also be okay with the alternative.
Problem Statement
The "Continuous Request Right-Sizing" feature currently uses the `max` algorithm for recommendations, which causes services with high start-up CPU usage to be overprovisioned.
For example, some of our services spike to ~2 cores at start-up, then drop down to ~0.3 cores when stable. This has two negative effects:
Solution Description
Introduce `cpu.request.autoscaling.kubecost.com/algorithm` and `cpu.request.autoscaling.kubecost.com/q` annotations (or similar) to allow the `algorithmCPU` and `qCPU` right-sizing recommendation parameters to be set on a per-workload basis.
It probably makes sense to introduce this for memory as well, for consistency.
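To make the proposal concrete, here is a minimal sketch of how the two annotations could map onto `algorithmCPU` and `qCPU`, and why a quantile helps with start-up spikes. The helper names, the fallback of `0.95` when `q` is omitted, and the nearest-rank quantile are all assumptions for illustration; this is not Cluster Controller code, and the real quantile calculation may differ.

```python
import math

# Annotation keys as proposed in this issue ("or similar", so names may change).
ALGORITHM_ANNOTATION = "cpu.request.autoscaling.kubecost.com/algorithm"
QUANTILE_ANNOTATION = "cpu.request.autoscaling.kubecost.com/q"


def recommendation_params(annotations):
    """Hypothetical helper: map workload annotations to algorithmCPU / qCPU.

    Falls back to the current behaviour (max) when no annotation is set.
    """
    params = {"algorithmCPU": annotations.get(ALGORITHM_ANNOTATION, "max")}
    if params["algorithmCPU"] == "quantile":
        # Assumed fallback of 0.95 when q is omitted; not specified in the issue.
        params["qCPU"] = annotations.get(QUANTILE_ANNOTATION, "0.95")
    return params


def recommend_cpu(samples, params):
    """Toy recommendation: max, or a nearest-rank quantile of the usage samples."""
    if params["algorithmCPU"] == "max":
        return max(samples)
    q = float(params["qCPU"])
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]


# Illustrative data: two start-up samples near 2 cores, then 98 stable samples.
usage = [2.0, 1.9] + [0.3] * 98

default_params = recommendation_params({})
quantile_params = recommendation_params(
    {ALGORITHM_ANNOTATION: "quantile", QUANTILE_ANNOTATION: "0.95"}
)

print(recommend_cpu(usage, default_params))   # 2.0, driven by the start-up spike
print(recommend_cpu(usage, quantile_params))  # 0.3, the stable usage
```

With this made-up sample set, `max` recommends the start-up peak while the 0.95 quantile ignores the brief spike, which is the per-workload behaviour the annotations are meant to unlock.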
Alternatives
Allow arbitrary query parameters to be added to the recommendation API requests (e.g., `request.autoscaling.kubecost.com/extraRecommendationParameters: "algorithmCPU=quantile&qCPU=0.95"`).
This could be useful for allowing the use of alpha/experimental parameters without making them part of the Cluster Controller's API.
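For comparison, the alternative could be sketched roughly as below: the annotation value is parsed as a query string and merged over whatever parameters the controller would otherwise send. The `build_query` helper and the base parameters are hypothetical; only the annotation key and the example value come from this issue.

```python
from urllib.parse import parse_qsl, urlencode

# Annotation key from the alternative proposed in this issue.
EXTRA_PARAMS_ANNOTATION = (
    "request.autoscaling.kubecost.com/extraRecommendationParameters"
)


def build_query(base_params, annotations):
    """Hypothetical helper: merge annotation-supplied parameters over the defaults."""
    merged = dict(base_params)
    extra = annotations.get(EXTRA_PARAMS_ANNOTATION, "")
    if extra:
        # strict_parsing=True raises ValueError on malformed pairs, so a bad
        # annotation is rejected instead of being silently ignored.
        merged.update(parse_qsl(extra, strict_parsing=True))
    return urlencode(merged)


# Assumed base parameters, standing in for whatever the controller sends today.
base = {"algorithmCPU": "max"}
workload_annotations = {EXTRA_PARAMS_ANNOTATION: "algorithmCPU=quantile&qCPU=0.95"}

print(build_query(base, workload_annotations))  # algorithmCPU=quantile&qCPU=0.95
print(build_query(base, {}))                    # algorithmCPU=max
```

The trade-off is visible in the sketch: any parameter passes through without controller changes, but the controller cannot validate values it does not know about.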
Additional Context
No response
Troubleshooting