Open lkishalmi opened 5 months ago
Hello,
Currently, you can do it using scalingModifiers. If you include something like min(your_max_value, max(scalers)), you can limit the max value using the formula that you want to calculate the max value.
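To make the effect of that formula concrete, here is a minimal Python sketch of the arithmetic. It assumes the simplified HPA rule that desired replicas is the composite metric divided by the target, rounded up; the function names and the sample numbers are hypothetical, and real HPA behavior (tolerance, min/max clamping, scaling policies) is ignored.

```python
import math

def composite_metric(scaler_values, your_max_value):
    # Mirrors the formula min(your_max_value, max(scalers)):
    # take the largest scaler value, but never report more than the cap.
    return min(your_max_value, max(scaler_values))

def desired_replicas(metric, target):
    # Simplified HPA arithmetic (assumption): ceil(metric / target).
    return math.ceil(metric / target)

# Hypothetical values: three scalers, a cap of 300, and a target of 10 per replica.
scalers = [120, 480, 90]
capped = composite_metric(scalers, 300)
print(desired_replicas(capped, 10))        # replicas with the cap applied
print(desired_replicas(max(scalers), 10))  # replicas without the cap
```

Capping the metric bounds the replica count only indirectly, at roughly your_max_value divided by the target, which is why the formula has to be chosen together with the target value.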
Yes. Though unfortunately scalingModifiers do not work with traditional cpu or memory triggers.
Interesting point! Let's see what other @kedacore/keda-contributors think.
Also, AFAIK changing the max value is a stronger rule than a utilization-based metric. Usually, when we detect some kind of saturation in the system, we do not really care about scale up/down policies, stabilization windows, etc. The autoscaler should just drop some replicas if they are over the new limit.
Reaching maxReplicaCount is often treated as an alarm-worthy event.
Dynamically changing maxReplicaCount in order to scale out would make the alarm worthless.
Well, those who have workloads where reaching maxReplicaCount could be alarming probably should not use a dynamically changing maxReplicaCount.
We have a bunch of queue-processing workloads working against 2-3 centralized backend systems. It's really hard to determine the correct max replicas per deployment across the whole system. We are working with guesstimates, though we have good indicators of when some central systems start to become saturated. Dropping some less important workloads would serve us well.
@zroubalik @tomkerkhove ?
Reaching maxReplicaCount is often treated as an alarm-worthy event.
Dynamically changing maxReplicaCount in order to scale out would make the alarm worthless.
If you choose to dynamically define them, then I think it's safe to say you opt in for the behavior and should be OK.
I suspect that HPA is not designed to be used in this way.
What I do see is that a lot of code is bypassed when the current replica count exceeds max replicas. That code implements the customizable behaviors, including the default behavior, so bypassing it may result in erratic scaling:
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#default-behavior
https://github.com/kubernetes/kubernetes/blob/HEAD/pkg/controller/podautoscaler/horizontal.go#L822
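As a rough illustration of the bypass described in this comment, the following Python sketch paraphrases the control flow in a heavily simplified form. It is not the actual HPA code: the function name, the return shape, and max_scale_down_step (a stand-in for scale-down policies and stabilization windows) are all assumptions made for the example.

```python
def next_replicas(current, min_replicas, max_replicas,
                  metric_desired, max_scale_down_step):
    # Out-of-bounds case: clamp straight to the limit.
    # No metrics are consulted and no behavior rules are applied.
    if current > max_replicas:
        return max_replicas, "clamped to max, behaviors bypassed"
    if current < min_replicas:
        return min_replicas, "clamped to min, behaviors bypassed"

    # Normal case: follow the metric, bounded by min/max.
    desired = min(max(metric_desired, min_replicas), max_replicas)
    # Hypothetical stand-in for scale-down behavior: limit the step size.
    if desired < current:
        desired = max(desired, current - max_scale_down_step)
    return desired, "metric-driven, behaviors applied"

# Max was lowered from 50 to 20 while 50 replicas are running:
print(next_replicas(50, 1, 20, 45, 2))
# Ordinary scale-down inside the bounds is rate-limited instead:
print(next_replicas(18, 1, 20, 10, 2))
```

In this model, lowering the max drops replicas in a single step, while a metric-driven scale-down of the same workload would proceed gradually.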
Well, the code bypass would be kind of the desired behavior. I do not think anyone would miss the scale-down behavior modifiers in those cases. In our workload we have the following use cases:
I would be a little bit cautious here, for the reasons mentioned above, and also I am not sure if the KEDA operator is the right actor to modify the min/max and HPA settings on the fly. TBH, I am not sure the added complexity would be beneficial; we are always trying to keep KEDA simple, to do the one job correctly.
Hmm. Might need to write another orchestrator that would manage ScaledObjects/HPA min/max setting using triggers...
@lkishalmi yeah, that would be a better direction, IMHO.
Proposal
There are certain resource saturation situations in which scaling out should be restricted.
While these situations can sometimes be incorporated into a formula in scalingModifiers, that could result in a fairly complex one, especially when we already have multiple triggers.
I'd suggest that the maxReplicaCount of the ScaledObject could be an int or a string: if it does not parse as an integer, it would be evaluated as a formula. The trigger definitions could be used as a source for the calculation.
Use-Case
We have several worker processes that read and write data between multiple datastores (Redis, MySQL).
We are processing queues and would like to process them fast enough. However, if all workers are doing jobs everywhere, then some of the systems could get saturated and eventually break.
At the moment we have empirically set maximum values that try to save our backend systems from saturation. Still, that happens from time to time.
We have good metrics/indicators to detect saturation. We have an alerting system in place. When the alert fires, we manually set the maximum number of replicas. Unfortunately, the time between saturation detection and the scale-down is really crucial: we need to act within 2-5 minutes. Sometimes we cannot act that fast.
It would be good to automate these evaluations and actions. We already have our scalers in KEDA, and it controls our HPAs.
The saturation metrics/indicators could be collected with existing KEDA scalers (triggers).
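The int-or-string maxReplicaCount suggested in the proposal could be resolved along the lines of the Python sketch below. This only illustrates the idea and is not how KEDA works today: resolve_max_replicas, the trigger name mysql_load, and the use of Python's eval as a stand-in expression evaluator are all assumptions, and eval is not safe for untrusted input.

```python
def resolve_max_replicas(spec_value, trigger_values):
    """Return the max replica count for an int-or-formula spec value.

    spec_value: an int, or a string holding either an integer or a formula.
    trigger_values: hypothetical mapping of trigger name -> current metric.
    """
    # First try the plain integer form, as the proposal describes.
    try:
        return int(spec_value)
    except (TypeError, ValueError):
        pass

    # Otherwise treat the string as a formula over the trigger values.
    # eval() is used only to keep the sketch short (assumption).
    names = {"min": min, "max": max, **trigger_values}
    result = eval(spec_value, {"__builtins__": {}}, names)
    return max(0, int(result))

# Hypothetical saturation indicator taken from an existing trigger.
formula = "5 if mysql_load > 0.8 else 20"
print(resolve_max_replicas(20, {}))
print(resolve_max_replicas(formula, {"mysql_load": 0.9}))
print(resolve_max_replicas(formula, {"mysql_load": 0.3}))
```

With a formula like this, the maximum would drop as soon as the saturation indicator crosses its threshold, which is the manual step the use case above wants to automate.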
Is this a feature you are interested in implementing yourself?
No
Anything else?
No response