This codebase contains implementations of the following methods: FOP, DMAC, and DFOP (selected via --config). For example, to train on the StarCraft II map 2c_vs_64zg:
CUBLAS_WORKSPACE_CONFIG=:16:8 python3 src/main.py --config=fop/dmac/dfop --env-config=sc2 with env_args.map_name=2c_vs_64zg seed=1
Setting the environment variable CUBLAS_WORKSPACE_CONFIG is recommended to enforce deterministic behavior of the RNN.
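For context, below is a minimal sketch of how such determinism is typically enforced on the PyTorch side; the helper name and exact settings are illustrative assumptions, not taken from src/main.py.

```python
# A minimal sketch (not the project's actual setup) of enforcing deterministic
# PyTorch/RNN behavior; the helper name `set_deterministic` is hypothetical.
import os
import random

import numpy as np
import torch


def set_deterministic(seed: int = 1) -> None:
    # cuBLAS only guarantees reproducible results with a fixed workspace size,
    # so CUBLAS_WORKSPACE_CONFIG must be set before the first cuBLAS call
    # (hence passing it on the command line above).
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":16:8")

    # Seed the RNGs a training run may touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

    # Restrict PyTorch to deterministic kernels and disable cuDNN autotuning,
    # which can otherwise select different algorithms between runs.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False
```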
If you use this code, please cite our papers.
Tianhao Zhang, Yueheng Li, Chen Wang, Guangming Xie, and Zongqing Lu. FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning. ICML'21.
@inproceedings{zhang2021fop,
title={FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning},
author={Zhang, Tianhao and Li, Yueheng and Wang, Chen and Xie, Guangming and Lu, Zongqing},
booktitle={International Conference on Machine Learning (ICML)},
pages={12491--12500},
year={2021},
organization={PMLR}
}
Kefan Su and Zongqing Lu. Divergence-Regularized Multi-Agent Actor-Critic. ICML'22.
@inproceedings{su2022divergence,
title={Divergence-Regularized Multi-Agent Actor-Critic},
author={Su, Kefan and Lu, Zongqing},
booktitle={International Conference on Machine Learning (ICML)},
pages={20580--20603},
year={2022},
organization={PMLR}
}
Jiangxing Wang, Deheng Ye, and Zongqing Lu. More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization. ICLR'23.
@inproceedings{wang2023more,
title={More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization},
author={Wang, Jiangxing and Ye, Deheng and Lu, Zongqing},
booktitle={International Conference on Learning Representations (ICLR)},
year={2023}
}