TL;DR: No plans to support MPI, and no easy way to run across multiple nodes.
1) There are no plans to support MPI in new algorithms; in fact, MPI support has been dropped entirely (at least for now) in the next iteration of stable-baselines. It is not a high priority given its complexity and the relatively small number of use cases.
2) The closest available solution is `SubprocVecEnv`, which parallelizes environment sample gathering across processes, but not across nodes. Such a VecEnv could be written for multi-node runs where environments live on different nodes, but the environment would have to be very slow for the communication overhead to pay off. See the sketch below for how the single-node version looks today.
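For context, this is roughly how `SubprocVecEnv` is used with PPO2 on a single node (a minimal sketch; `CartPole-v1`, the eight workers, and the timestep budget are just placeholders):

```python
import gym

from stable_baselines import PPO2
from stable_baselines.common.vec_env import SubprocVecEnv

def make_env(rank):
    # Return a thunk so each subprocess builds (and seeds) its own env copy
    def _init():
        env = gym.make('CartPole-v1')
        env.seed(rank)
        return env
    return _init

if __name__ == '__main__':  # guard needed because SubprocVecEnv spawns worker processes
    env = SubprocVecEnv([make_env(i) for i in range(8)])  # 8 envs, 8 processes, 1 node
    model = PPO2('MlpPolicy', env, verbose=1)
    model.learn(total_timesteps=100000)
```

A multi-node variant would need the worker processes behind the pipes to live on other machines, which is where the extra complexity (and latency) comes in.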
Considering the work required to get this running and the likely small gains, I recommend instead using the multiple nodes for parallel runs of the same experiment with different random seeds, as averaging over several seeds is a crucial part of reliable RL results.
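In practice that just means launching an independent copy of the training script on each node, each with its own seed. A sketch, assuming a hypothetical `run_seed.py` with placeholder env and hyperparameters:

```python
# run_seed.py -- launch one copy per node, e.g. `python run_seed.py 0`, `python run_seed.py 1`, ...
import sys

import gym

from stable_baselines import PPO2
from stable_baselines.common import set_global_seeds

seed = int(sys.argv[1])        # each node passes a different seed
set_global_seeds(seed)         # seeds the python, numpy and tensorflow RNGs
env = gym.make('CartPole-v1')
env.seed(seed)

model = PPO2('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=100000)
model.save('ppo2_seed_{}'.format(seed))  # one result per seed, to aggregate afterwards
```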
I would also add that you can always use PPO1 if you want to use MPI ;)
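For reference, PPO1 averages gradients across MPI workers internally, so an ordinary training script (placeholder env and timestep budget below) just needs to be launched under `mpirun`:

```python
# train_mpi.py -- run under MPI, e.g. `mpirun -np 16 python train_mpi.py`
import gym

from stable_baselines import PPO1

env = gym.make('CartPole-v1')             # each MPI worker builds its own env
model = PPO1('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=100000)       # gradients are averaged across workers via MPI
```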
Understood, thanks!
I'm looking to scale training of my PPO2 algorithms to multiple nodes of an HPC cluster. Currently, I'm constrained to the CPUs of a single node.
It looks like MPI would allow me to distribute the training to multiple nodes, but the documentation says PPO2 doesn't support MPI.
However, this thread makes it sound like OpenAI Baselines PPO2 now supports MPI:
My questions are: