Open Acciorocketships opened 1 year ago
Right, so the situation is this: in this branch (https://github.com/matteobettini/rl/tree/mappo_ippo) I implemented the MARL examples in VMAS.
I am currently reworking them, since we have decided that agent-specific keys will sit at a deeper nesting level in tensordicts. While this has been simpler for PPO-type losses, it has brought to light many other bugs related to using nested keys in torchrl (#1279, #1278, #1273, #1269, #1268), which we will solve as soon as possible.
However, I have created a tag (https://github.com/matteobettini/rl/tree/torchrl_paper) for the last working version of all scripts, made for the paper submission. If you want to use MADDPG, I suggest checking out that tag. You will also have to check out the commit in the tensordict repository that is closest in date to the commit referenced by the tag.
Thank you, please let me know when it works on the current version!
Also, I think that script might need to be updated with the fixes from this thread: https://github.com/pytorch/rl/issues/1181
Yep, we'll take that into account
MADDPG on that branch now works. What specifically should we add from https://github.com/pytorch/rl/issues/1181? @smorad @Acciorocketships, do you have any hints, since you have used DDPG a lot?
You definitely need target networks. I'd also suggest prioritized experience replay and a large replay buffer (and use `extend` rather than `add` to put things in the buffer).
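To make the two suggestions above concrete, here is a minimal, framework-free sketch. It is not the torchrl API: `ToyReplayBuffer` and `soft_update` are hypothetical stand-ins written only for illustration, sampling is uniform rather than prioritized, and the `tau` value is arbitrary. The assumption behind the `extend` advice is that a vectorized simulator such as VMAS returns a whole batch of transitions per step, so each transition should land in its own buffer slot rather than the batch being stored as one item.

```python
import random
from collections import deque


class ToyReplayBuffer:
    """Fixed-capacity FIFO buffer (hypothetical stand-in, not a torchrl class)."""

    def __init__(self, capacity):
        # A deque with maxlen evicts the oldest items once capacity is reached.
        self._storage = deque(maxlen=capacity)

    def add(self, transition):
        """Store exactly one item, whatever it is."""
        self._storage.append(transition)

    def extend(self, transitions):
        """Store every element of a batch as a separate item."""
        self._storage.extend(transitions)

    def sample(self, batch_size):
        """Uniform sampling without replacement (no prioritization here)."""
        return random.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    for name, value in online_params.items():
        target_params[name] = (1.0 - tau) * target_params[name] + tau * value


# Assumed shape of the data: one vectorized step yields several transitions.
batch = [
    {"obs": i, "action": 0.0, "reward": 1.0, "next_obs": i + 1}
    for i in range(8)
]

buffer = ToyReplayBuffer(capacity=1000)
buffer.extend(batch)          # 8 separate transitions are stored
minibatch = buffer.sample(4)  # draw a training minibatch

wrong = ToyReplayBuffer(capacity=1000)
wrong.add(batch)              # the whole batch is stored as a single item

# Target parameters track the online parameters slowly instead of being copied.
online = {"w": 1.0, "b": -2.0}
target = {"w": 0.0, "b": 0.0}
soft_update(target, online, tau=0.5)
```

In this toy version, calling `add` on a batch leaves a single entry in the buffer, which is the kind of mistake the `extend` advice guards against. In practice `tau` is usually much smaller than 0.5, so the target networks change slowly.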
MADDPG is now working in PR https://github.com/pytorch/rl/pull/1027. We aim to merge that PR soon.
Currently, there is a working multi-agent PPO implementation here: https://github.com/matteobettini/rl/blob/mappo_ippo/examples/multiagent/mappo_ippo.py
and a working single-agent DDPG implementation here: https://github.com/pytorch/rl/blob/ddpg_example/ddpg_example.py
However, there does not seem to be a working multi-agent DDPG implementation (the multi-agent DDPG example in the same repo as the first link runs into an error). Would it be possible to provide a multi-agent DDPG example script? I am specifically interested in using it with VMAS.
cc @matteobettini