Concurrency Setting: Hard limit and Soft limit

knative / serving

Kubernetes-based, scale-to-zero, request-driven compute

Apache License 2.0

Ask your question here:

I'm referencing two documents. One is the official docs: https://knative.dev/docs/serving/autoscaling/concurrency/#soft-versus-hard-concurrency-limits

The other is the comment in the autoscaler ConfigMap: https://github.com/knative/serving/blob/main/config/core/configmaps/autoscaler.yaml

I'm confused about what happens when both a hard limit and a soft limit are set on the same revision.
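
To make the question concrete, here is a minimal sketch of the kind of Service I have in mind. The name, image, and numbers are placeholders I made up for illustration. I'm assuming the soft limit is set through the `autoscaling.knative.dev/target` annotation and the hard limit through `containerConcurrency`, as the concurrency docs describe.

```yaml
# Hypothetical Service: name, image, and values are placeholders.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: concurrency-demo
spec:
  template:
    metadata:
      annotations:
        # Soft limit: the per-revision autoscaling target.
        autoscaling.knative.dev/target: "50"
    spec:
      # Hard limit: enforced upper bound on in-flight requests per replica.
      containerConcurrency: 200
      containers:
        - image: ghcr.io/knative/helloworld-go:latest
```

As I read the two sources, the docs would imply a scaling target of 50 here (the smaller value), while the ConfigMap comment would imply 200 (the container concurrency).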

The official docs say the smaller of the two is used: "If both a soft and a hard limit are specified, the smaller of the two values will be used. This prevents the Autoscaler from having a target value that is not permitted by the hard limit value."

But the comment in the ConfigMap says: "# When revision explicitly specifies container concurrency, that value will be used as a scaling target for autoscaler"

Which one is correct?

Concurrency Setting: Hard limit and Soft limit #15216