Open rhuss opened 4 years ago
Just to add another wrinkle into it... not having the same names as what people would put into the YAML has caused me some mental pain :-) It makes it harder to switch between the two worlds. Thinking/remembering is hard.
I totally agree that we should align where possible, but the concurrency options are a bit 'special', as they are sprinkled all over the YAML (as annotations and different fields). I see the value of the CLI in allowing a user not to have to remember the full annotation.
I think we should use the given name where possible, but deviate where needed. E.g. I would really love to have all concurrency/scaling related parameters together in the same 'option namespace' (i.e. appearing together on the help page, so using the same prefix).
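For context, this is roughly where those settings live in the Service YAML today — a minimal sketch assuming the standard `autoscaling.knative.dev/*` annotations and the `containerConcurrency` field (names and values are illustrative):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                                   # example name
spec:
  template:
    metadata:
      annotations:
        # autoscaling knobs are annotations on the revision template ...
        autoscaling.knative.dev/target: "80"
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "5"
        autoscaling.knative.dev/window: "60s"
    spec:
      # ... while the hard concurrency limit is a regular spec field
      containerConcurrency: 100
      containers:
        - image: gcr.io/knative-samples/helloworld-go   # example image
```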
According to Configuring Autoscaling, we should probably move to an `--autoscale-` prefix and deprecate `--concurrency-` (i.e. still support it for the next releases, but move to a uniform prefix). I think that makes sense if we want to support even more autoscale-related parameters. The per-revision parameters that possibly make sense to support are:
- `--autoscale-class`
- `--autoscale-metric`
- `--autoscale-target` (deprecate `--concurrency-target`)
- `--autoscale-limit` (deprecate `--concurrency-limit`)
- `--autoscale-utilization` (deprecate `--concurrency-utilization`)
- `--autoscale-rps` (requests per second) <--- I think we can skip that as it's not clear how this differs from `--autoscale-target`
- `--autoscale-min` (deprecate `--min-scale`. Does this make sense?)
- `--autoscale-max` (deprecate `--max-scale`. Does this make sense?)
- `--autoscale-window`
- `--autoscale-panic-window`
- `--autoscale-panic-threshold`
For backwards compatibility reasons we could stick to the current main concurrency-related config, then deprecate `--autoscale-window` and `--concurrency-utilization` and introduce an `--autoscale` parameter for configuring less used autoscaling parameters, as in `--autoscale window=10s,utilization=70,panic-window=10`, with the alternative syntax `--autoscale window=10s --autoscale utilization=70 --autoscale panic-window=10` (the same as what we already support for array-typed options like `--env` or `--annotation`). For the sake of consistency we could then also add `min`, `max`, `limit` and `target` as `--autoscale` subconfig keys.
The benefit of this approach: only `--concurrency-utilization` and `--autoscale-window` would be affected. I think I like this suggestion, wdyt?
I'd hold off until they're done with that doc and the resulting PRs. I'm still hoping they'll rename stuff as a result of that work.
In particular, `--autoscale-limit` is not a great alternative to `container-concurrency`. As a user I would guess that `--autoscale-limit` is about the maximum scaling of instances, not the amount of concurrency allowed per instance.
> According to Configuring Autoscaling

@rhuss can I please get access to this doc?
We sorted out the issue. FYI, you need to subscribe to https://groups.google.com/forum/#!forum/knative-dev in order to access the Knative documentation...
FWIW: I would definitely vote against `--autoscale-limit`; it's not even really an autoscaling parameter. It's more of a loadbalancing/enforcement parameter for the revision which is just inferred into `target` by the autoscaler, because that makes sense from its PoV. I'd not prepend that with `autoscale`.
> FWIW: I would definitely vote against `--autoscale-limit`; it's not even really an autoscaling parameter. It's more of a loadbalancing/enforcement parameter for the revision which is just inferred into `target` by the autoscaler, because that makes sense from its PoV. I'd not prepend that with `autoscale`.
@markusthoemmes so you are saying that the "limit" and the "target" do not belong to the same group (as kind of imposed by the syntax `autoscaling.knative.dev/target`), where one is connected to autoscaling and the other is not? If so, then I think the analogy of 'soft' vs. 'hard' limit is a bit misleading, too (limit of what?).
So it should be `--concurrency-limit` and `--autoscale-target`? (or would `--concurrency-target` also be ok?)
Yeah, it's a little involved.
The autoscaler only knows about the target. If no explicit target is set, it is derived from the concurrency limit (aka `containerConcurrency`). I agree, the analogy might be a bit weird, but it explains the behavior from the user's PoV.
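To illustrate the distinction (a sketch, not taken from this thread): the hard limit is a field on the revision spec and is enforced per replica, while the target is an annotation that only guides the autoscaler.

```yaml
spec:
  template:
    metadata:
      annotations:
        # Soft target: the concurrency the autoscaler tries to maintain per replica.
        autoscaling.knative.dev/target: "80"
    spec:
      # Hard limit: the maximum concurrent requests a replica will accept (enforced).
      # If no target annotation is set, the autoscaler derives its target from this value.
      containerConcurrency: 100
```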
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh by adding the comment `/remove-lifecycle stale`.
/remove-lifecycle-stale /reopen
We definitely still need this refactoring. E.g. we have `--autoscale-window` but `--scale-min`. Technically it should probably all be `--autoscale`, but `--scale` is shorter.
Also we are still missing some parameters, like the configuration options for the panic window.
@rhuss: Reopened this issue.
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh by adding the comment `/remove-lifecycle stale`.
/remove-lifecycle stale
@rhuss any timeline for plans to work on this one? I think refactoring these would give a much nicer UX.
:thinking: it seems to be done already.
```
--scale string            Set the Minimum and Maximum number of replicas. You can use this flag to set both to a single value, or set a range with min/max values, or set either min or max values without specifying the other. Example: --scale 5 (scale-min = 5, scale-max = 5) or --scale 1..5 (scale-min = 1, scale-max = 5) or --scale 1.. (scale-min = 1, scale-max = unchanged) or --scale ..5 (scale-min = unchanged, scale-max = 5)
--scale-init int          Initial number of replicas with which a service starts. Can be 0 or a positive integer.
--scale-max int           Maximum number of replicas.
--scale-min int           Minimum number of replicas.
--scale-target int        Recommendation for when to scale up based on the concurrent number of incoming request. Defaults to --concurrency-limit when given.
--scale-utilization int   Percentage of concurrent requests utilization before scaling up. (default 70)
--scale-window string     Duration to look back for making auto-scaling decisions. The service is scaled to zero if no request was received in during that time. (eg: 10s)
```
There's a PR to add `--scale-metric`, but that's the current state of the scaling flags. I can look up in which release it changed, if that helps.
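If I read those flags right, a call like `kn service create hello --image gcr.io/knative-samples/helloworld-go --scale 1..5 --scale-target 80 --scale-window 60s` should end up as roughly the following revision template (a sketch; the exact annotation spelling may differ between Knative versions):

```yaml
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"   # from --scale 1..5
        autoscaling.knative.dev/maxScale: "5"   # from --scale 1..5
        autoscaling.knative.dev/target: "80"    # from --scale-target
        autoscaling.knative.dev/window: "60s"   # from --scale-window
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
```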
Since we have collected all concurrency related options under the option prefix `--concurrency-` and also list them together in `kn service describe`, the autoscale window parameter should also be in this group. So either move everything to `--concurrency-`, or move everything (limit, target, utilization, window) to `--autoscale`.
I'm not sure in which direction to go. Moving to `--concurrency` is easier as most of the options are already `--concurrency` prefixed; on the other hand, "concurrency" is quite tedious to type, so maybe `autoscale-` is a better prefix? But we might even consider a totally different prefix as well?