ray-project / kuberay

A toolkit to run Ray applications on Kubernetes
Apache License 2.0

[Feature] KubeRay Scalability Benchmarking #2069

Open andrewsykim opened 6 months ago

andrewsykim commented 6 months ago

Description

In last week's KubeRay community meeting, we discussed kicking off work to benchmark the scalability of KubeRay and Ray along several dimensions.

The end result should be something like:

  1. A simple tool that creates Kubernetes clusters and RayClusters and runs benchmarking tests against them
  2. Published benchmark results based on those test runs
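
To make item 1 more concrete, here is a minimal sketch of one piece such a tool might contain: generating RayCluster manifests in bulk and summarizing whatever measurements a test run collects. Everything in it is an assumption for illustration, not an agreed design: the helper names `make_raycluster_manifest` and `summarize`, the `rayproject/ray:2.9.0` image tag, the `bench-` name prefix, and the choice of summary statistics are all hypothetical.

```python
import json
import statistics


def make_raycluster_manifest(name, num_workers, image="rayproject/ray:2.9.0"):
    """Build a minimal RayCluster manifest as a plain dict (hypothetical helper)."""
    container = {"image": image}
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayCluster",
        "metadata": {"name": name},
        "spec": {
            "headGroupSpec": {
                "rayStartParams": {},
                "template": {"spec": {"containers": [dict(container, name="ray-head")]}},
            },
            "workerGroupSpecs": [
                {
                    "groupName": "workers",
                    # Fixed-size worker group: no autoscaling in this sketch.
                    "replicas": num_workers,
                    "minReplicas": num_workers,
                    "maxReplicas": num_workers,
                    "rayStartParams": {},
                    "template": {"spec": {"containers": [dict(container, name="ray-worker")]}},
                }
            ],
        },
    }


def summarize(samples):
    """Summarize a list of numeric measurements collected by a test run."""
    ordered = sorted(samples)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    manifests = [make_raycluster_manifest(f"bench-{i}", num_workers=2) for i in range(3)]
    # Emit a v1 List as JSON; the intent is that it can be piped to `kubectl apply -f -`.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Generating manifests as plain dicts keeps the sketch free of third-party dependencies; a real tool would more likely use a Kubernetes client library and watch cluster status directly.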

As a bonus step, it would be great to set up periodic runs of the scalability tests to catch possible performance regressions.
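
For the periodic runs, one simple way to flag regressions is to compare each run against a stored baseline. The sketch below is only an illustration under stated assumptions: the function name `find_regressions`, the 10% default tolerance, and the convention that a lower value is better for every metric are all hypothetical, and the metric names in the usage example are placeholders.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current value exceeds the baseline by more than `tolerance`.

    Assumes lower is better for every metric (e.g. latencies, memory usage).
    Metrics missing from `current` or with a non-positive baseline are skipped.
    """
    regressions = {}
    for metric, base in baseline.items():
        value = current.get(metric)
        if value is None or base <= 0:
            continue
        change = (value - base) / base  # relative change versus baseline
        if change > tolerance:
            regressions[metric] = round(change, 3)
    return regressions


if __name__ == "__main__":
    # Placeholder metric names and numbers, not real benchmark results.
    baseline = {"metric_a": 10.0, "metric_b": 5.0}
    current = {"metric_a": 12.0, "metric_b": 5.2}
    print(find_regressions(baseline, current))
```

A fixed relative threshold is the simplest option; noisy benchmarks may need several runs per data point or a statistical test instead.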

As a starting point I would like to propose the following metrics to measure:

andrewsykim commented 6 months ago

For reference, some work has already been done in this area, but it primarily focuses on memory scalability: https://docs.ray.io/en/latest/cluster/kubernetes/benchmarks/memory-scalability-benchmark.html#kuberay-mem-scalability

kevin85421 commented 5 months ago

cc @morhidi

andrewsykim commented 5 months ago

This week is Google Cloud NEXT, but @kevin85421, @morhidi, and I plan to meet sometime next week to kick off this work.

If you have any ideas or feedback on what areas of scalability you would like us to test, please leave a note in this issue.