smoother scaling (predictive scaling)

kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes

https://keda.sh

Apache License 2.0


smoother scaling (predictive scaling) #2401

Open aslom opened 2 years ago

aslom commented 2 years ago

Proposal

Currently, KEDA scalers have no easy way to predict scaling targets or to keep the necessary history of measurements.

Use-Case

The Kafka scaler scales the number of Kafka consumers. Each scaling step triggers a Kafka rebalance, because the broker must reassign topic partitions to consumers, and that can take 10 seconds or longer. Events cannot be consumed during a rebalance, which leads to a jarring experience when scaling is repeated, with the replica count going 1 -> 2 -> 4 -> 8 -> 16 -> ... (up to the number of partitions; large topics may have hundreds of partitions).
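
To make the cost of the stepwise pattern concrete, here is a minimal sketch that counts rebalances for the 1 -> 2 -> 4 -> 8 -> 16 sequence versus a single jump to the target. It assumes one rebalance per scaling step and uses the 10-second figure from the use case as a fixed pause; the function names are hypothetical and not part of KEDA.

```python
def doubling_steps(current: int, target: int) -> list[int]:
    """Replica counts visited when each scaling step at most doubles the count."""
    steps = []
    while current < target:
        # max(..., 1) lets the sequence start from zero replicas.
        current = min(max(current * 2, 1), target)
        steps.append(current)
    return steps


def rebalance_downtime(steps: list[int], rebalance_seconds: float) -> float:
    """Total paused time, assuming exactly one rebalance per scaling step."""
    return len(steps) * rebalance_seconds


stepwise = doubling_steps(1, 16)  # four scaling steps: 2, 4, 8, 16
direct = [16]                     # one step straight to the predicted target

print(stepwise, rebalance_downtime(stepwise, 10.0))
print(direct, rebalance_downtime(direct, 10.0))
```

Under these assumptions the stepwise path pauses consumption four times as long as a single predicted jump, which is the gap a prediction mechanism would try to close.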

Anything else?

The best way to explore the issue may be to build a quick prototype for the Kafka scaler and see how generic a prediction/history interface can be.

zroubalik commented 2 years ago

Yeah, we should try to implement this kind of stuff in Metrics Server; this is a relevant issue that needs to be tackled first: https://github.com/kedacore/keda/issues/2282

For reference, adding a link to the HPA docs and its scaling behavior configuration; this could also help with mitigating the problem and configuring the smoothness of the scaling: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior
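
As an illustration of the HPA behavior configuration mentioned above, the sketch below builds a ScaledObject manifest as a Python dict and prints it as JSON (which kubectl accepts alongside YAML). The `advanced.horizontalPodAutoscalerConfig.behavior` path and the `scaleUp` fields follow the KEDA and Kubernetes docs as I understand them; every name and number (deployment, topic, broker address, thresholds) is a made-up placeholder, not a recommendation.

```python
import json

# Hypothetical ScaledObject: all names and values are illustrative assumptions.
scaled_object = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "kafka-consumer-scaler"},
    "spec": {
        "scaleTargetRef": {"name": "kafka-consumer"},  # placeholder Deployment
        "maxReplicaCount": 16,
        "advanced": {
            "horizontalPodAutoscalerConfig": {
                "behavior": {
                    "scaleUp": {
                        # Wait before acting on a higher recommendation,
                        # so short spikes cause fewer scaling steps.
                        "stabilizationWindowSeconds": 60,
                        # Allow at most 4 added pods per 60-second period.
                        "policies": [
                            {"type": "Pods", "value": 4, "periodSeconds": 60}
                        ],
                    }
                }
            }
        },
        "triggers": [
            {
                "type": "kafka",
                "metadata": {
                    "bootstrapServers": "kafka.example.svc:9092",  # placeholder
                    "consumerGroup": "my-group",                   # placeholder
                    "topic": "my-topic",                           # placeholder
                    "lagThreshold": "50",
                },
            }
        ],
    },
}

manifest_json = json.dumps(scaled_object, indent=2)
print(manifest_json)
```

This only rate-limits and delays scaling steps; it does not predict the target, so it reduces how often rebalances happen rather than removing the stepwise pattern.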

VerstraeteBert commented 2 years ago

I would love to see this as well in the future. A solid predictive scaling mechanism could turn KEDA into a very powerful tool, where users don't need to worry about tweaking any of the scaling parameters. For example, a method that compares the number of events coming in per timespan with historical data on the number of events processed per active replica in the same timespan. One can dream, right?

Supplementary use case: serverless platforms wanting to offer fully transparent autoscaling to their users.
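
The method described in this comment can be sketched as a single ratio: incoming events per timespan divided by the events one replica has historically processed in that timespan. This is a minimal illustration under that reading, with hypothetical names and bounds; it is not an existing KEDA interface.

```python
import math


def required_replicas(
    events_in: float,
    events_per_replica: float,
    min_replicas: int = 1,
    max_replicas: int = 100,
) -> int:
    """Replicas needed so that capacity covers the incoming event rate.

    events_in: events arriving per timespan.
    events_per_replica: historical events processed per replica per timespan.
    """
    if events_per_replica <= 0:
        raise ValueError("events_per_replica must be positive")
    needed = math.ceil(events_in / events_per_replica)
    # Clamp to the configured bounds (for Kafka, max would be the partition count).
    return max(min_replicas, min(needed, max_replicas))


print(required_replicas(1200, 100))                     # incoming load needs 12 replicas
print(required_replicas(10_000, 100, max_replicas=32))  # capped at the upper bound
```

Computing the target directly like this is what would allow a single scaling step instead of several, provided the per-replica throughput history is reliable.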

tomkerkhove commented 2 years ago

Relates to #197

daniel-yavorovich commented 2 years ago

@aslom

We had similar thoughts, though not specifically in the Kafka context, and created our scaler based on an AI model. Take a look at how it performs; maybe it will work for you too.

PR: https://github.com/kedacore/keda/pull/2418

tomkerkhove commented 2 years ago

Ok if we close this issue in favor of https://github.com/kedacore/keda/issues/197 @aslom?

aslom commented 2 years ago

@tomkerkhove I would like to keep it open, as I have started a simple non-AI version and am testing the AI version to see side by side how they work for Kafka. My intuition is that the simple version may be good for predictability and/or for use together with the AI version.

zroubalik commented 2 years ago

Yeah, this is a slightly different approach. It would use just a short window of the last few metric values to do the calculation.
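
One way to picture the short-window idea is a fixed-size buffer of recent metric values with a linear extrapolation from them. The sketch below is only an assumption about what such a calculation could look like: the class name, window size, and the choice of a simple first-to-last slope are all hypothetical, and samples are assumed to be evenly spaced in time.

```python
from collections import deque


class ShortWindowPredictor:
    """Keeps the last `window` samples and extrapolates their trend linearly."""

    def __init__(self, window: int = 5):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.samples: deque[float] = deque(maxlen=window)

    def observe(self, value: float) -> None:
        # The oldest sample is dropped automatically once the window is full.
        self.samples.append(value)

    def predict(self, steps_ahead: int = 1) -> float:
        if not self.samples:
            raise ValueError("no samples observed yet")
        if len(self.samples) == 1:
            return float(self.samples[0])
        # Average change per sample interval across the window.
        slope = (self.samples[-1] - self.samples[0]) / (len(self.samples) - 1)
        # A metric such as consumer lag cannot be negative.
        return max(0.0, self.samples[-1] + slope * steps_ahead)


predictor = ShortWindowPredictor(window=3)
for lag in (100, 200, 300, 400):
    predictor.observe(lag)
print(predictor.predict(steps_ahead=2))
```

The predicted value could then be fed into whatever replica calculation the scaler already uses, so the history stays small and the behavior remains easy to reason about.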

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

aslom commented 2 years ago

Still looking into it.

aslom commented 2 years ago

/remove-lifecycle stale