Hi @braunsonm, I'm not sure I understand why this is a bug, since some information is missing. The conditions for the transition from proxy mode to serve mode are explained in this article: https://github.com/knative/docs/pull/5709. Could you please paste the full log (debug mode) from the autoscaler so we can see whether there is actually a problem?
It's difficult for me to provide a full log in my environment.
I'm not sure what you mean by not being sure it's a bug. If the number of available instances is greater than zero, and the number of requests is less than the burst capacity, shouldn't Knative switch to serve mode, as your docs state?
Instead, what you observe is that Knative thinks it needs to protect the instance by keeping the activator in the request path (burst capacity).
Am I misunderstanding the burst capacity property?
Reading your doc, I think my confusion comes down to two areas:
1) If I'm understanding right, it is not possible for a default-configured service to enter serve mode until 3 instances are running, because the default target is 100 and the default target-burst-capacity is 211. This means EBC = 100 - 0 - 211 when 1 instance is running. This does not explain why my logs were showing ebc=-211, though. This seems very counterintuitive, especially for low-traffic services.
2) I'm a bit confused by this in your docs: "a request needs to stay around for some time in order for concurrency metrics to show enough load; it means you can't get EBC>=0 with a hello-world example. The latter is often confusing for newcomers, as the Knative service never enters serve mode. In the next example we will show the lifecycle of a Knative service that also moves to serve mode and how ebc is calculated in practice."
Why can't you get EBC>=0 with a hello-world example? If anything, wouldn't there be a lot of excess burst capacity available because the number of recorded requests is zero?
It comes down to how statistics are counted over a time window. If you want to see a difference, you need to use the autoscale sample with enough sleep time: https://knative.dev/docs/serving/autoscaling/autoscale-go.
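For example, assuming that sample is deployed, a request along these lines (the host is a placeholder; sleep is one of the sample's documented query parameters, in milliseconds) keeps each request in flight long enough for the concurrency metric to register load, unlike an instant hello-world response:

    curl "http://autoscale-go.default.example.com?sleep=1000"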
It is counterintuitive unless you understand the concepts (I am trying to add that missing part with the blog), and honestly we haven't done a good job in the past of explaining this in detail, but this is how the autoscaler is designed. You need to have enough concurrency for a period of time; if your requests complete too quickly, they do not add much to the metric. @dprotaso correct me if I am wrong here.
The log you have shows that there are no active replicas, plus observedPanicValue=11 and TargetBurstCapacity=200 (the default). Thus EBC = 0*100 - 11 - 200 = -211; see also here. That is my understanding, given that you can't provide more logs.
To answer your question: if you don't set a low TBC value, you need enough pods to handle a burst before the activator stops interfering. That is also explained in the blog, where I set TBC=10. As long as EBC < 0, the activator is needed.
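To make the arithmetic concrete, here is a minimal sketch in Go of the rule as described in this thread; this is not the actual autoscaler code and the identifiers are illustrative. While the computed EBC is negative, the activator stays in the request path.

    // Sketch of the excess-burst-capacity rule discussed above.
    package main

    import "fmt"

    // excessBurstCapacity mirrors the formula from this thread:
    // EBC = readyPods*perPodTarget - observedPanicConcurrency - targetBurstCapacity.
    func excessBurstCapacity(readyPods, perPodTarget, observedPanic, tbc float64) float64 {
        return readyPods*perPodTarget - observedPanic - tbc
    }

    func main() {
        // The case from the log: 0 ready pods, observedPanicValue=11, TBC=200.
        fmt.Println(excessBurstCapacity(0, 100, 11, 200)) // -211 -> proxy mode
        // One ready pod, no in-flight requests, TBC=200.
        fmt.Println(excessBurstCapacity(1, 100, 0, 200)) // -100 -> still proxy mode
        // One ready pod with TBC lowered to 10, as in the blog.
        fmt.Println(excessBurstCapacity(1, 100, 0, 10)) // 90 -> serve mode possible
    }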
Thank you @skonto
I'm not sure why I saw ebc=-211 since I did have an active replica (I would expect -111), but I can see the reasoning behind how the autoscaler was designed.
I think in the majority of low-traffic cases the target burst capacity can be much lower than the current default, so that the service switches to serve mode sooner than at 3 replicas, but we will make that change on our platform instead.
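For reference, here is a sketch of how lowering TBC looks on a Service, using the autoscaling.knative.dev/target-burst-capacity annotation on the revision template; the value 10 matches the blog's example and is illustrative:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: httpbin
    spec:
      template:
        metadata:
          annotations:
            # With TBC=10, one pod at the default target of 100 gives
            # EBC = 100 - 0 - 10 = 90 >= 0, so serve mode is reachable at 1 replica.
            autoscaling.knative.dev/target-burst-capacity: "10"
        spec:
          containers:
            - image: mccutchen/go-httpbin:v2.13.2
              ports:
                - containerPort: 8080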
I look forward to your PR being merged, as it would have answered a lot of these questions for me. I think some of the existing docs are a bit confusing: they leave this out entirely and lead you to believe the service will switch to serve mode once more than zero instances are ready. I'll close this issue since you answered my questions well.
/area autoscale
What version of Knative?
1.13.1
Expected Behavior
On low traffic volume, the SKS should be put into Serve mode, as the KnativeService can handle the volume of traffic.
Actual Behavior
The SKS is never transitioned to Serve mode from Proxy mode because it thinks that all the available burst capacity has been used up. We see the following log:
You can "fix" this issue by changing the services target burst capacity to
0
. This will switch toServe
mode correctly.Observations
Even as a test, setting autoscaling.knative.dev/target-burst-capacity: "999999999" leaves the same error message stating that all the excess burst capacity has been used up. However, if I set autoscaling.knative.dev/target-burst-capacity: "100", or any integer <= 100, the service will enter serve mode. Anything higher will never enter serve mode. Maybe this is linked to the default container concurrency soft limit?
Steps to Reproduce the Problem
1. Deploy httpbin to serving: kn service create httpbin --image mccutchen/go-httpbin:v2.13.2 --port 8080. This creates a ksvc with containerConcurrency set to zero.
2. Observe that the SKS never enters Serve mode.
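For what it's worth, this observation is consistent with the EBC formula discussed in the comments above: with one ready pod at the default per-pod target of 100 and no in-flight requests, EBC = 1*100 - 0 - TBC, so any TBC <= 100 yields EBC >= 0 (serve mode), while anything larger keeps EBC negative until more pods are running.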