Hi,
I'm wondering if it would be possible to share the weights of the learned policies (MAPPO and QMIX). This would be very helpful for the adversarial MARL community: it would let us focus on prototyping adversarial algorithms (in the black-box setting) rather than spending a lot of time training policies from scratch.
Thanks.