awslabs / data-on-eks

DoEKS is a tool to build, deploy and scale Data & ML Platforms on Amazon EKS
https://awslabs.github.io/data-on-eks/
Apache License 2.0
666 stars 227 forks

Binpacking support for DoEKS #614

Closed hitsub2 closed 2 months ago

hitsub2 commented 3 months ago

What is the outcome that you are trying to reach?

For batch jobs such as Spark workloads, we want higher resource utilization and better cost efficiency. Pods should not be spread across nodes, so that underutilized nodes can be scaled in promptly.

By default, the NodeResourcesFit scheduling plugin uses LeastAllocated as its scoring strategy. For long-running workloads that is a good fit, because spreading pods favors high availability. For batch jobs such as Spark workloads, however, it leads to higher cost. Changing the strategy from LeastAllocated to MostAllocated avoids spreading pods across all running nodes, which leads to higher resource utilization and better cost efficiency.
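To make the difference concrete, here is a minimal sketch of how the two strategies rank nodes. It is a simplification, not the scheduler's actual code: it considers CPU only, and the node names, capacities, and the `pick_node` helper are made up for illustration. The scores follow the commonly documented formulas: LeastAllocated rewards free capacity, MostAllocated rewards used capacity, both on a 0 to 100 scale and both counting the incoming pod's request.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_capacity: int   # allocatable CPU in millicores
    cpu_requested: int  # CPU already requested by pods on the node


def least_allocated_score(node: Node, pod_cpu: int) -> int:
    """Higher score for nodes with more free CPU after placing the pod."""
    requested = node.cpu_requested + pod_cpu
    return (node.cpu_capacity - requested) * 100 // node.cpu_capacity


def most_allocated_score(node: Node, pod_cpu: int) -> int:
    """Higher score for nodes with more used CPU after placing the pod."""
    requested = node.cpu_requested + pod_cpu
    return requested * 100 // node.cpu_capacity


def pick_node(nodes, pod_cpu, score_fn):
    """Hypothetical helper: drop nodes that cannot fit the pod, take the best score."""
    feasible = [n for n in nodes if n.cpu_requested + pod_cpu <= n.cpu_capacity]
    return max(feasible, key=lambda n: score_fn(n, pod_cpu))


# Illustrative cluster: one busy node, one nearly empty node.
nodes = [
    Node("node-a", cpu_capacity=4000, cpu_requested=3000),
    Node("node-b", cpu_capacity=4000, cpu_requested=500),
]

spread_choice = pick_node(nodes, 500, least_allocated_score)
packed_choice = pick_node(nodes, 500, most_allocated_score)
print("LeastAllocated picks:", spread_choice.name)
print("MostAllocated picks:", packed_choice.name)
```

With LeastAllocated the pod lands on the nearly empty node, which keeps it busy. With MostAllocated the pod fills the busy node first, so the nearly empty node is more likely to drain and become eligible for scale-in.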

Describe the solution you would like

Support a binpacking strategy.

Describe alternatives you have considered

Using a custom scheduler such as Yunikorn or Volcano. However, that is overly complicated for scenarios where we only need binpacking.

Additional context

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 10 days

github-actions[bot] commented 2 months ago

Issue closed due to inactivity.