palantir / k8s-spark-scheduler

A Kubernetes Scheduler Extender to provide gang scheduling support for Spark on Kubernetes
Apache License 2.0

Work with Cluster Autoscaler? #158

Open aleclerc-sonrai opened 3 years ago

aleclerc-sonrai commented 3 years ago

Hi, I'm trying to figure out whether k8s-spark-scheduler works in conjunction with the cluster-autoscaler.

My scenario is that I want a node pool that starts small. When I submit Spark jobs, the pool should grow (up to a limit) so the job can be fulfilled, and then cluster-autoscaler should scale it back down.

onursatici commented 3 years ago

Hey @aleclerc-sonrai, sorry for the late reply here. As far as I know, cluster-autoscaler treats pending pods and the number of rescheduled pods as signals for scaling decisions. k8s-spark-scheduler will not prevent the creation of executor pods, so it should work with cluster-autoscaler. One constraint that is specific to k8s-spark-scheduler is instance-groups, which are labels on your nodes that are used to create logical node groups. You might be better off using a single instance group for the nodes that run your Spark applications, to prevent any potential disagreement between cluster-autoscaler and k8s-spark-scheduler.
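To illustrate what "logical node groups from labels" means, here is a minimal sketch in plain Python. It does not call any Kubernetes API, and the label key `example.com/instance-group` as well as the node names are made up for the example; the real key is whatever your k8s-spark-scheduler configuration uses.

```python
from collections import defaultdict

# Hypothetical label key, for illustration only. Use the key from your
# own k8s-spark-scheduler configuration.
INSTANCE_GROUP_LABEL = "example.com/instance-group"


def group_nodes_by_instance_group(nodes):
    """Group node names by the value of the instance-group label.

    `nodes` is a list of dicts with "name" and "labels" keys, standing in
    for real Node objects. Nodes without the label are skipped.
    """
    groups = defaultdict(list)
    for node in nodes:
        group = node["labels"].get(INSTANCE_GROUP_LABEL)
        if group is not None:
            groups[group].append(node["name"])
    return dict(groups)


# Made-up nodes: two in a "spark" group, one in "batch", one unlabeled.
example_nodes = [
    {"name": "node-a", "labels": {INSTANCE_GROUP_LABEL: "spark"}},
    {"name": "node-b", "labels": {INSTANCE_GROUP_LABEL: "spark"}},
    {"name": "node-c", "labels": {INSTANCE_GROUP_LABEL: "batch"}},
    {"name": "node-d", "labels": {}},
]

instance_groups = group_nodes_by_instance_group(example_nodes)
```

With a single instance group for all Spark nodes, the grouping has one entry, which should line up most easily with a single autoscaled node pool.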

Scale-up demand could potentially be picked up faster by an autoscaler if it can process the Demand CRD objects created by k8s-spark-scheduler. In that case, all resource requests for a Spark application are reflected in this one object, so the autoscaler does not have to wait for all executor pods to be created. Waiting for the executor pods can add up to a minute before scale-up demand reaches the autoscaler.
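As a rough sketch of why a single aggregated object helps, the snippet below sums the requests of a driver and its executors up front instead of discovering them pod by pod. The `PodRequest` type, the `build_demand` helper, and the dictionary layout are hypothetical and are not the actual Demand CRD schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodRequest:
    """Resource request of one pod (hypothetical type for illustration)."""
    cpu_millis: int
    memory_mib: int


def build_demand(driver, executor, executor_count):
    """Aggregate the requests of a whole Spark application into one object.

    The returned dict is an illustrative shape only, not the real Demand
    CRD: one unit for the driver, one unit for all executors, plus totals.
    """
    units = [
        {"count": 1, "cpu_millis": driver.cpu_millis,
         "memory_mib": driver.memory_mib},
    ]
    if executor_count > 0:
        units.append(
            {"count": executor_count, "cpu_millis": executor.cpu_millis,
             "memory_mib": executor.memory_mib}
        )
    return {
        "units": units,
        "total_cpu_millis": sum(u["count"] * u["cpu_millis"] for u in units),
        "total_memory_mib": sum(u["count"] * u["memory_mib"] for u in units),
    }


# Made-up application: a 1 CPU / 2 GiB driver and four 2 CPU / 4 GiB executors.
demand = build_demand(
    driver=PodRequest(cpu_millis=1000, memory_mib=2048),
    executor=PodRequest(cpu_millis=2000, memory_mib=4096),
    executor_count=4,
)
```

An autoscaler that reads such an object sees the full 9 CPU / 18 GiB requirement as soon as the application is submitted, rather than after each executor pod shows up as pending.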