planetscale / vitess-operator

Kubernetes Operator for Vitess
Apache License 2.0
305 stars 75 forks

Autoscaling and policy-driven automations #259

Open christosnc opened 2 years ago

christosnc commented 2 years ago

Hello everyone, 😀

This is a proposal, and a place to discuss the implementation of autoscaling and policy-driven automations for Vitess. The general idea is to be able to provide a list of policies / rules (possibly in the spec) so that certain events / actions take place automatically. This would be very useful for specifying custom autoscaling scenarios, or alerts, for example.

The high-level approach to this could be:

  1. We create an "orchestrator" server that takes metrics from our Vitess clusters.
  2. We create some "policies" on when / how to scale up and/or down (based on metrics and limits). Also, we specify the frequency of the check for each policy.
  3. The server checks each policy at its configured interval and, if the policy applies, runs custom predefined actions against our Vitess clusters.
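As a rough illustration of the three steps above, here is a minimal Python sketch of the check loop. Everything in it is hypothetical: the `Policy` fields, the `run_due_policies` helper, and the `concurrent_queries` metric name are assumptions made for the example, not part of vitess-operator or any existing spec.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Policy:
    """Hypothetical policy shape; these field names are not from the operator."""
    name: str
    interval_seconds: float                # how often the policy is checked
    condition: Callable[[Metrics], bool]   # when the policy applies
    action: Callable[[Metrics], str]       # what to run; returns a description
    last_checked: float = float("-inf")    # never checked yet


def run_due_policies(policies: List[Policy], metrics: Metrics, now: float) -> List[str]:
    """Check every policy whose interval has elapsed and run the actions that apply."""
    performed = []
    for policy in policies:
        if now - policy.last_checked < policy.interval_seconds:
            continue  # this policy is not due yet
        policy.last_checked = now
        if policy.condition(metrics):
            performed.append(policy.action(metrics))
    return performed


# Example: one policy, checked every 30 seconds, that reacts to a metric limit.
scale_up = Policy(
    name="scale-up-vtgate",
    interval_seconds=30,
    condition=lambda m: m["concurrent_queries"] > 100,
    action=lambda m: "scale vtgate up",
)
```

A real server would call something like `run_due_policies` from a timer, with metrics pulled from the clusters, and the actions would talk to the Kubernetes API instead of returning strings.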

To be able to achieve this, we need to be able to specify the following info in the spec for any policy:

All this could be tremendously useful, allowing for custom autoscaling (horizontal and vertical), alerts, reports, integrations, and automated backups.

Please give your thoughts and ideas!

This is a follow-up to a Slack discussion. Please check it out for more info.

matthiasr commented 2 years ago

To what extent can this be done using Kubernetes primitives already? For example, while we haven't done it yet, we have thought about autoscaling vtgate on the number of concurrent queries it is handling (we find this more predictive of load, and more stable for what is mostly a proxy, than CPU usage).

For Kubernetes Deployments, this can be done using Horizontal Pod Autoscaler or KEDA. What would it take to expose the necessary knobs to something like KEDA? How could such a setup be made more accessible (examples? documentation?)?
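To make the concurrent-queries idea concrete, here is a small sketch of the replica calculation that the Horizontal Pod Autoscaler documents: `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The per-pod query counts and the target of 100 are made-up numbers, and the sketch ignores min/max replica bounds, stabilization windows, and unready pods. It also assumes vtgate is exposed as something an autoscaler can scale, which is exactly the open question here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for an average per-pod metric."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no scaling
    return math.ceil(current_replicas * ratio)


# 4 vtgate pods averaging 150 concurrent queries each, target 100 per pod.
print(desired_replicas(4, 150, 100))  # 6
# Load only slightly above target stays within the tolerance.
print(desired_replicas(4, 105, 100))  # 4
```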