knative / serving

Kubernetes-based, scale-to-zero, request-driven compute
https://knative.dev/docs/serving/
Apache License 2.0

Autoscaler scales up a ksvc for no good reasons #6743

Closed JRBANCEL closed 4 years ago

JRBANCEL commented 4 years ago

What version of Knative?

v0.12.0-86-gbb1415ea8-dirty. No ConfigMap was tweaked.

Expected Behavior

I am working on reaching 1M QPS. I have a simple Vegeta program that sends requests to the hello world example. Every 30s, the QPS increases by 100. After every 1k increase, the QPS is held for 2 minutes to evaluate stability.

I am expecting the ksvc to scale gradually, one pod at a time, and the latency to stay low. In this case, I am not even expecting scaling, since we are talking about only a few thousand QPS.

Actual Behavior

Randomly, the Autoscaler goes into panic mode and scales up the revision for no good reason, then unpanics and goes back to where it was before.

[Image: QPS Dashboard]

[Image: Autoscaler Dashboard]

With @vagababov, we found in the logs that ObservedStableValue goes up ~2x before the panic, then returns to its previous value. @vagababov suggested it could be double counting caused by the bucketing.
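To illustrate why double counting would show up as a ~2x jump, here is a toy Go sketch of a bucketed windowed average. It is purely illustrative and is not Knative's aggregation code; the `window` type and the `observed` helper are hypothetical, and the `doubleCount` flag simply simulates each sample being added to its bucket twice.

```go
package main

import "fmt"

// window is a toy ring of per-second buckets.
// Hypothetical type for illustration; not Knative's implementation.
type window struct {
	buckets []float64
}

func newWindow(n int) *window {
	return &window{buckets: make([]float64, n)}
}

// record adds a sample to the bucket for second t.
func (w *window) record(t int, v float64) {
	w.buckets[t%len(w.buckets)] += v
}

// average returns the mean over all buckets in the window.
func (w *window) average() float64 {
	sum := 0.0
	for _, b := range w.buckets {
		sum += b
	}
	return sum / float64(len(w.buckets))
}

// observed records a steady load for the given number of seconds and
// returns the windowed average. With doubleCount set, every sample is
// recorded twice, simulating the suspected bug.
func observed(seconds int, load float64, doubleCount bool) float64 {
	w := newWindow(seconds)
	for t := 0; t < seconds; t++ {
		w.record(t, load)
		if doubleCount {
			w.record(t, load) // same sample lands in the bucket again
		}
	}
	return w.average()
}

func main() {
	correct := observed(60, 10, false)
	buggy := observed(60, 10, true)
	fmt.Printf("correct=%.1f buggy=%.1f ratio=%.1f\n", correct, buggy, buggy/correct)
}
```

With an unchanged load, the double-counted window reports twice the real value, which to an autoscaler is indistinguishable from traffic doubling.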

vagababov commented 4 years ago

Since it's my doing, I'll try to fix it. /assign

JRBANCEL commented 4 years ago

@vagababov, the code used to generate the load: https://github.com/JRBANCEL/Experimental/blob/master/KnativeBenchmarking/cmd/vegeta/main.go

vagababov commented 4 years ago

The latest changes should have dealt with it. @JRBANCEL can you verify you no longer see this when running Knative from HEAD?

vagababov commented 4 years ago

/close

knative-prow-robot commented 4 years ago

@vagababov: Closing this issue.
