thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.
https://tianshou.org
MIT License
7.48k stars 1.09k forks

how to run RL using multi-nodes in cluster #1133

Open HYB777 opened 2 weeks ago

HYB777 commented 2 weeks ago

How do I use RayVecEnv in a cluster? I want to run my RL code with multi-node training. I'm new to Ray; are there any demo scripts?

MischaPanch commented 2 weeks ago

Hi @HYB777. This is a Ray configuration issue: as long as you set up Ray on a multi-node cluster, call ray.init appropriately, and use RayVecEnv, things should work out.

That being said, I haven't personally tested this on a multi-node cluster yet.
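A minimal sketch of the steps described above, untested on a real multi-node cluster. It assumes a Ray cluster is already running (for example, started with `ray start --head` on one node and `ray start --address=<head-ip>:<port>` on the others) and that ray, tianshou, and gymnasium are installed on every node. The helper names `make_env`, `make_env_fns`, and `build_ray_vec_env`, as well as the CartPole-v1 task, are illustrative and not part of tianshou.

```python
from functools import partial
from typing import Any, Callable


def make_env(task: str) -> Any:
    """Create a single environment instance for the given task id."""
    # Imported lazily so this module can be imported without gymnasium installed.
    import gymnasium as gym

    return gym.make(task)


def make_env_fns(task: str, num_envs: int) -> list[Callable[[], Any]]:
    """Return one zero-argument environment factory per vectorized env slot."""
    return [partial(make_env, task) for _ in range(num_envs)]


def build_ray_vec_env(task: str = "CartPole-v1", num_envs: int = 8) -> Any:
    """Connect to an existing Ray cluster and build a RayVecEnv on top of it."""
    import ray
    from tianshou.env import RayVecEnv

    # address="auto" attaches to a cluster that is already running
    # instead of starting a new local Ray instance.
    ray.init(address="auto", ignore_reinit_error=True)
    return RayVecEnv(make_env_fns(task, num_envs))


if __name__ == "__main__":
    envs = build_ray_vec_env()
    print(f"created {len(envs)} environments")
    envs.close()
```

The resulting vectorized environment can then be passed to a collector in the same way as DummyVectorEnv or SubprocVectorEnv. How Ray places the environment actors across nodes depends on the cluster's resource configuration, which is a Ray-side setting rather than a tianshou one.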

Since we're not Ray developers, I think this question is outside the scope of support from the tianshou team. However, if you encounter tianshou-specific issues on the cluster, feel free to let us know!

Ray has a large community and a lot of documentation, so I suggest you start there. If you want to contribute a multi-node running example, I'm happy to review a PR.