Hi,
I'm wondering if it would be possible to share the weights of the learned policies (MAPPO and QMIX). This would be very helpful for the adversarial MARL community: it would let us focus on prototyping adversarial algorithms (in the black-box setting) rather than spending a lot of time training policies from scratch.
Thanks.