Closed vbaddam closed 1 year ago
Thanks, @vbaddam. MARL is an exciting research field that we would love to get into. May I ask which algorithms you are thinking of contributing?
Two related papers/projects recently caught my attention https://arxiv.org/abs/2209.10485 and https://github.com/oxwhirl/smacv2.
I'm thinking of starting with MADDPG, since it is one of the first MARL algorithms that came out. We can then extend it to MATD3 and MASAC, which are fairly direct multi-agent adaptations of single-agent off-policy algorithms and address some of MADDPG's shortcomings. Along with that, we can look at implementing MAPPO (the main results are published here: https://arxiv.org/pdf/2103.01955.pdf).
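For context, a rough sketch of what that extension looks like (placeholder layer sizes and names, not CleanRL code): each agent keeps a decentralized actor that only sees its local observation, while the critic is centralized and scores the joint observation-action of all agents.

```python
# Hypothetical MADDPG actor/critic interfaces (shapes and names are illustrative).
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Decentralized actor: acts from the agent's local observation only."""

    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim), nn.Tanh(),
        )

    def forward(self, obs):
        return self.net(obs)


class CentralizedCritic(nn.Module):
    """Centralized critic: sees all agents' observations and actions."""

    def __init__(self, total_obs_dim: int, total_act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(total_obs_dim + total_act_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, all_obs, all_actions):
        # all_obs / all_actions are the concatenated per-agent tensors.
        return self.net(torch.cat([all_obs, all_actions], dim=-1))
```

MATD3 would keep this same structure and add twin critics plus target policy smoothing, which is why it is a natural second step after MADDPG.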
Thanks for sharing the paper. It looks like a useful resource.
Oh cool. What is the simulation environment for MADDPG? Is it MuJoCo?
I will start with MultiAgent MuJoCo (https://github.com/schroederdewitt/multiagent_mujoco), since it should make the difference between the single-agent and multi-agent settings easy to see.
@vbaddam @vwxyzjn I can give a hand with implementing MARL algos. I have a working MAPPO + MAMuJoCo implementation in torch. I think the big difference from single-agent code is the design around parameter sharing and how agents and their data are handled. Please let me know if we should discuss this elsewhere.
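To illustrate the parameter-sharing point, one common pattern is to stack the per-agent observations into a batch and run them through a single shared policy (agent ids and shapes below are made up for illustration, not taken from any existing implementation):

```python
# Illustrative parameter-sharing pattern (agent ids and dims are hypothetical).
import torch
import torch.nn as nn

shared_policy = nn.Sequential(nn.Linear(17, 64), nn.ReLU(), nn.Linear(64, 6))

# Per-agent observations as returned by a dict-based multi-agent API.
obs = {
    "agent_0": torch.randn(17),
    "agent_1": torch.randn(17),
}

# Stack agents along a leading dimension so one forward pass serves all agents,
# then unpack the result back into a per-agent dict for env.step().
agent_ids = sorted(obs)
stacked = torch.stack([obs[a] for a in agent_ids])   # (n_agents, obs_dim)
actions = shared_policy(stacked)                     # (n_agents, act_dim)
actions = {a: actions[i] for i, a in enumerate(agent_ids)}
```

With heterogeneous agents (different observation/action shapes) you would instead pad the inputs or keep separate networks per agent, which is exactly the kind of design decision that differs from single-agent code.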
Thanks a lot @vbaddam and @51616. This is the perfect place to discuss. MADDPG in MultiAgent MuJoCo sounds great. Hope its installation won't cause too many issues (e.g., dependency conflicts). One quick suggestion: maybe implement it in JAX, since DDPG with JAX is a lot faster and parameter sharing is more intuitive in JAX. That said, feel free to pick your tech stack.
Yes, sure, @51616, it would be great to have you on board. Should we set up a meeting to discuss the structure, so we can be on the same page?
@vwxyzjn I think using JAX is a good suggestion. However, I'm still catching up with JAX. Maybe I can implement it in PyTorch first and extend it in later iterations.
@vbaddam Sounds great to me! Please hit me up after the holiday.
Hello everyone! I just wanted to jump into the conversation. @Kallinteris-Andreas has done amazing work refactoring the MultiAgent MuJoCo environments into the Gymnasium-Robotics repo. They use the PettingZoo API, and the documentation can be found here: https://robotics.farama.org/envs/MaMuJoCo/ma_half_cheetah/.
We are actively maintaining this repo, and it would be great to see benchmarks of these environments with new CleanRL MARL algos.
Also, we are waiting to benchmark these environments before making a new release, so for the time being they have to be installed from source.
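For anyone who wants to try them, here is a rough usage sketch following the PettingZoo Parallel API. The constructor and its arguments below are an assumption based on the linked docs, so please check them against the source install:

```python
# Rough PettingZoo Parallel API loop; the MaMuJoCo constructor and its
# arguments are assumptions and may differ from the released package.
from gymnasium_robotics import mamujoco_v0

env = mamujoco_v0.parallel_env(scenario="HalfCheetah", agent_conf="2x3")
observations, infos = env.reset(seed=0)

while env.agents:
    # Replace the random policy with the MARL agent being benchmarked.
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)

env.close()
```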
Hello everyone! I have been working on MARL with JAX and PyTorch and can be of help. Let me know what you guys have planned out.
@vbaddam Could you make a roadmap + progress tracker for this?
Here is the checklist and progress tracker. I will add a cleaner roadmap once we finish Stage 1.
Hi, I'm also interested in seeing this implemented! I have recently been trying to adapt SAC from CleanRL to the multi-agent setting: https://github.com/ffelten/MASAC.
It is still a WIP, but early results suggest it is learning something. I would love to hear feedback and/or tips on this kind of adaptation.
Cheers,
Hello! I am working on a MARL project with a couple of other people, so I'd be interested in the roadmap and progress.
Looks like https://github.com/kinalmehta/marl-jax came up. Closing this for now
Contribution to MARL
I would like to contribute to the CleanRL repo by extending RL algorithms to multi-agent systems (i.e., MARL). I have discussed this with @vwxyzjn, and he suggested starting an issue here. If anyone is interested in contributing to MARL, please respond here. Going forward, we can lay out the roadmap and share responsibilities.
Thank you.