ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0
34.12k stars 5.79k forks source link

[Docs] Ray + Slurm Container Orchestration #47875

Open destin-v opened 1 month ago

destin-v commented 1 month ago

Description

As a Reinforcement Learning (RL) researcher I enjoy using Ray for various projects. However, Ray had limited community support for the Slurm architecture (which my university uses). So I built my own container orchestration solution for Ray on Slurm.

I recently published a paper in IEEE describing a method to perform container orchestration of Ray on Slurm. Benchmarks of RLLib and Mujoco environments are provided here. The official website is found here.

I would like to get the website and paper linked to the Ray official documentation for Slurm. I think it will be helpful to other researchers who want to utilize Ray on Slurm.

Link

No response

jjyao commented 1 month ago

@destin-v feel free to open a doc PR and tag me.