Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight)
The framework is inherited from PyMARL. UPDeT is written in PyTorch and uses SMAC (the StarCraft Multi-Agent Challenge) as its environment.
Install the dependencies:
pip install -r requirements.txt
Then set up StarCraft II and SMAC:
bash install_sc2.sh
This will download SC2 into the 3rdparty/ folder and copy the maps necessary to run the experiments.
Before training your own transformer-based multi-agent model, there are a few things to note:

- Currently, this repository supports marine-based battle scenarios, e.g. 3m, 8m, and 5m_vs_6m.
- To run experiments on other scenarios, modify the Transformer Parameters block in src/config/default.yaml and revise the _build_input_transformer function in basic_controller.py.
- To train the model with other algorithms, set the agent parameter in the Agent Parameters block in src/config/default.yaml.

To train on the 5m_vs_6m scenario with VDN, run:
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=5m_vs_6m
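The core job of _build_input_transformer is to turn each agent's flat observation vector into one fixed-width token per entity, so the transformer can attend over units rather than over raw features. The sketch below illustrates that idea in plain Python; the feature layout, dimensions, and function names are illustrative assumptions, not the repository's actual code.

```python
def pad(features, token_dim):
    """Zero-pad one entity's feature slice to the shared token width."""
    assert len(features) <= token_dim
    return features + [0.0] * (token_dim - len(features))

def build_entity_tokens(obs, own_dim, n_enemies, enemy_dim,
                        n_allies, ally_dim, token_dim):
    """Split a flat observation into per-entity tokens for the transformer.

    Assumed (hypothetical) flat layout: [own | enemies | allies]; the real
    SMAC observation ordering differs, but the splitting idea is the same.
    """
    i = 0
    own = obs[i:i + own_dim]
    i += own_dim
    tokens = [pad(own, token_dim)]            # token 0: the agent itself
    for _ in range(n_enemies):                # one token per enemy
        tokens.append(pad(obs[i:i + enemy_dim], token_dim))
        i += enemy_dim
    for _ in range(n_allies):                 # one token per ally
        tokens.append(pad(obs[i:i + ally_dim], token_dim))
        i += ally_dim
    return tokens  # (1 + n_enemies + n_allies) tokens, each token_dim wide
```

Because the model only ever sees a sequence of same-width tokens, changing the number of allies or enemies changes the sequence length, not the architecture.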
All results will be stored in the Results/ folder.
UPDeT surpasses the GRU baseline on the hard 5m_vs_6m map.
It can also zero-shot generalize to different tasks, e.g. 7m-5m-3m transfer learning.
Note: only UPDeT can be deployed to other scenarios without changing the model's architecture.
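The reason the same weights can move between 7m, 5m, and 3m is policy decoupling: each attack action's value is read from the matching enemy's token through a head shared across enemies, so the action space grows or shrinks with the entity count while the parameters stay fixed. A hypothetical plain-Python sketch (linear heads stand in for the real output layers):

```python
def dot(w, v):
    """Inner product of a weight vector with a token."""
    return sum(a * b for a, b in zip(w, v))

def decoupled_q_values(tokens, w_moves, w_attack):
    """tokens[0] is the agent's own token; tokens[1:] hold one token per enemy.

    Fixed actions (move/stop) are scored from the self token; each
    attack-enemy-i action is scored from enemy i's token with a single
    shared weight vector, so more enemies simply yield more attack Qs.
    """
    q_move = [dot(w, tokens[0]) for w in w_moves]       # fixed-size part
    q_attack = [dot(w_attack, t) for t in tokens[1:]]   # one Q per enemy
    return q_move + q_attack
```

With the same w_moves and w_attack, a 3-enemy map yields 3 attack Q-values and a 5-enemy map yields 5, which is what lets the model be deployed to new scenarios without architectural changes.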
For more details, please refer to the UPDeT paper.
@article{hu2021updet,
title={UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers},
author={Hu, Siyi and Zhu, Fengda and Chang, Xiaojun and Liang, Xiaodan},
journal={arXiv preprint arXiv:2101.08001},
year={2021}
}
The MIT License