The Kubernetes community has been working on autoscaling for a long time. The HorizontalPodAutoscaler is GA, and a CRD and a controller are provided for the VerticalPodAutoscaler. These autoscalers target workload resources such as Deployment and StatefulSet. However, we want to scale a RisingWave instance, which contains four components and multiple groups of workload resources. Therefore, there are two options for implementing auto-scaling:
1. Leverage the pod autoscalers provided by Kubernetes and ignore the replicas defined in the RisingWave spec when the autoscalers are enabled. We need to define the behavior when an autoscaler is enabled or disabled and ensure there are no ambiguities.
2. Define new CRs ourselves and re-implement the triggering and scaling process by operating on the RisingWave resources directly. This gives us more control over the process, e.g., we can have a customized policy for triggering scaling, and there will be no ambiguity, but we would have to redo a lot of work that the community has already done.
Either option makes sense to me. Since we have limited bandwidth, I think we need to decide which one to pursue.
Since HPA is now supported and VPA is still under investigation, I will close this issue. If necessary, VPA support can be tracked in a separate issue.
References: