ovh / public-cloud-roadmap

Agile roadmap for OVHcloud Public Cloud services. Discover the features our product teams are working on, comment and influence our backlog.
https://www.ovhcloud.com/en/public-cloud/

Enhanced autoscaler capabilities (resource requirements aware) #567

Open antonin-a opened 4 months ago

antonin-a commented 4 months ago

As a customer, I would like to use a node autoscaler that creates new nodes that better match resource requirements in terms of CPU or RAM, and to define more advanced scaling policies than the existing Kubernetes Cluster Autoscaler allows. One idea would be to rely on the Karpenter autoscaler (https://karpenter.sh/).

 

Karpenter is an AWS project that was donated to the CNCF/Kubernetes project in Nov. 2023 (https://twitter.com/dims/status/1727388443595137348).

 

Feature | Cluster Autoscaler | Karpenter
-- | -- | --
Resource Management | Based on the resource utilization of existing nodes, Cluster Autoscaler takes a reactive approach to scale nodes. | Based on the current resource requirements of unscheduled pods, Karpenter takes a proactive approach to provisioning nodes.
Node management | Cluster Autoscaler manages nodes based on the resource demands of the present workload, using predefined autoscaling groups. | Karpenter scales, provisions, and manages nodes based on the configuration of custom Provisioners.
Scaling | Cluster Autoscaler is more focused on node-level scaling, which means it can effectively add more nodes to meet any increase in demand. But this also means it may be less effective in downscaling resources. | Karpenter offers more effective and granular scaling functionalities based on specific workload requirements. In other words, it scales according to the actual usage. It also allows users to specify particular scaling policies or rules to match their requirements.
Scheduling | With Cluster Autoscaler, scheduling is more simple as it is designed to scale up or down based on the present requirements of the workload. | Karpenter can effectively schedule workloads based on different factors like availability zones and resource requirements. It can try to optimize for the cheapest pricing by choosing optimized flavors (CPU or RAM) but is unaware of any commitments like RI’s or Savings Plans.

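
To make the "resource requirements aware" idea concrete, here is a minimal Python sketch of choosing the cheapest node flavor that fits the summed requests of pending pods. It is only an illustration, not Karpenter's actual algorithm: the flavor names, sizes and prices are hypothetical, and it ignores system-reserved resources, DaemonSets and spreading pods across several nodes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Flavor:
    name: str            # hypothetical flavor name, not a real OVHcloud flavor
    cpu: float           # vCPUs offered by the node
    ram_gib: float       # memory offered by the node, in GiB
    hourly_price: float  # made-up price, for illustration only


# Hypothetical catalogue mixing general, CPU-optimized and RAM-optimized flavors.
FLAVORS = [
    Flavor("small-general", 2, 8, 0.05),
    Flavor("cpu-optimized", 8, 16, 0.15),
    Flavor("ram-optimized", 4, 32, 0.16),
    Flavor("large-general", 8, 32, 0.20),
]


def cheapest_fitting_flavor(
    pending: List[Tuple[float, float]],
    flavors: List[Flavor] = FLAVORS,
) -> Optional[Flavor]:
    """Return the cheapest flavor able to host all pending pods on one node.

    `pending` is a list of (cpu_request, ram_request_gib) pairs.
    Returns None when no single flavor is large enough.
    """
    need_cpu = sum(cpu for cpu, _ in pending)
    need_ram = sum(ram for _, ram in pending)
    candidates = [f for f in flavors if f.cpu >= need_cpu and f.ram_gib >= need_ram]
    return min(candidates, key=lambda f: f.hourly_price, default=None)
```

With this hypothetical catalogue, two memory-heavy pods such as `[(1, 12), (1, 12)]` select `ram-optimized`, while two CPU-heavy pods such as `[(3, 4), (3, 4)]` select `cpu-optimized`. A fixed node pool, by contrast, would always add the same flavor regardless of the shape of the pending requests.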
pgillet commented 1 month ago

Hello, we have a Kubernetes cluster with multiple node pools, one of which has autoscaling enabled and can scale from 0 to 3 nodes. It works well, but we observe that a node only becomes ready after 4 or 5 minutes, which is too long: we run Knative functions whose execution time is much shorter, in burst/batch scaling scenarios. As I understand it, the cluster autoscaler is triggered as soon as the cluster cannot schedule a new pod, with new-pod-scale-up-delay = 0 seconds.

As far as I understand, no setting makes the "default" Cluster Autoscaler proactive, that is, able to anticipate the need for more resources and provision nodes in advance (as Horizontal Pod Autoscaling does based on resource metrics such as CPU or memory, e.g. creating a new pod replica when you reach 80% CPU).
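
For reference, the metric-based behaviour mentioned above can be sketched in a few lines of Python. This follows the replica formula given in the Kubernetes Horizontal Pod Autoscaler documentation, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, with the default 10% tolerance. The function name and the example numbers are only illustrative, and the sketch leaves out stabilization windows, pod readiness handling and min/max replica bounds.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA replica calculation for a single utilization metric."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band around the target, the HPA leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up.
    return math.ceil(current_replicas * ratio)
```

For example, with a target of 80% CPU, `desired_replicas(2, 90, 80)` asks for a third replica, while `desired_replicas(4, 84, 80)` stays at 4 because the deviation is within the tolerance. The point of the comparison is that this trigger fires on a utilization threshold, before any pod is unschedulable, whereas the Cluster Autoscaler only reacts once a pod is already pending.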

Thus, Karpenter would be the solution to our use case.

Thank you @yomovh for pointing me to the roadmap and Karpenter.

antonin-a commented 1 month ago

Hello @pgillet, thank you for your feedback and upvote! I can confirm that Karpenter is on our roadmap. There is no clear ETA yet, but we will keep you updated here.

clement-igonet commented 1 month ago

Could you please fix these links (there is a stray parenthesis at the end)?

clement-igonet commented 1 month ago

Is Karpenter something that could be installed on any (OVH) k8s cluster? https://karpenter.sh/v0.37/getting-started/getting-started-with-karpenter/