FOP-DMAC-MACPF

Note

This codebase contains implementations of the following methods: FOP, DMAC, and MACPF. The corresponding papers are listed under Citation.

Installation

How to run

CUBLAS_WORKSPACE_CONFIG=:16:8 python3 src/main.py --config=fop/dmac/dfop --env-config=sc2 with env_args.map_name=2c_vs_64zg seed=1

Setting the environment variable CUBLAS_WORKSPACE_CONFIG, as in the command above, is recommended to enforce deterministic behavior of the RNN.
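
To repeat a run over several seeds, the command above can be assembled programmatically. The following is only a minimal sketch, not part of this codebase: `build_run` is a hypothetical helper, the seed list is illustrative, and the `--config` value is copied verbatim from the command above, so replace it with the config name you intend to run.

```python
import os
import shlex


def build_run(config, map_name, seed, workspace=":16:8"):
    """Return (env, argv) for one training run, mirroring the README command.

    Hypothetical helper for illustration; it only assembles the command
    and does not launch anything.
    """
    env = dict(os.environ)
    # Same cuBLAS workspace setting as in the README command.
    env["CUBLAS_WORKSPACE_CONFIG"] = workspace
    argv = [
        "python3", "src/main.py",
        f"--config={config}",
        "--env-config=sc2",
        "with",
        f"env_args.map_name={map_name}",
        f"seed={seed}",
    ]
    return env, argv


if __name__ == "__main__":
    # Seeds 1..3 are an arbitrary, illustrative choice.
    for seed in (1, 2, 3):
        env, argv = build_run("fop/dmac/dfop", "2c_vs_64zg", seed)
        prefix = f"CUBLAS_WORKSPACE_CONFIG={env['CUBLAS_WORKSPACE_CONFIG']}"
        print(prefix, shlex.join(argv))
        # To actually launch the run from the repository root, one could use:
        # subprocess.run(argv, env=env, check=True)
```

Passing the variable through `env` rather than exporting it globally keeps the setting scoped to the training process.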

Citation

If you use this code, please cite our papers.

Tianhao Zhang, Yueheng Li, Chen Wang, Guangming Xie and Zongqing Lu. FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning. ICML'21.

@inproceedings{zhang2021fop,
    title={{FOP}: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning},
    author={Zhang, Tianhao and Li, Yueheng and Wang, Chen and Xie, Guangming and Lu, Zongqing},
    booktitle={International Conference on Machine Learning (ICML)},
    pages={12491--12500},
    year={2021},
    organization={PMLR}
}

Kefan Su and Zongqing Lu. Divergence-Regularized Multi-Agent Actor-Critic. ICML'22.

@inproceedings{su2022divergence,
    title={Divergence-Regularized Multi-Agent Actor-Critic},
    author={Su, Kefan and Lu, Zongqing},
    booktitle={International Conference on Machine Learning (ICML)},
    pages={20580--20603},
    year={2022},
    organization={PMLR}
}

Jiangxing Wang, Deheng Ye and Zongqing Lu. More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization. ICLR'23.

@inproceedings{wang2023more,
    title={More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization},
    author={Wang, Jiangxing and Ye, Deheng and Lu, Zongqing},
    booktitle={International Conference on Learning Representations (ICLR)},
    year={2023}
}