PKU-RL / I2C


Installation

Environment options

Core training parameters

Training for prior network

Checkpointing

Training procedure

I2C can be learned end-to-end or in a two-phase manner. This code implements the end-to-end manner, which may take more training time than the two-phase manner.

For Cooperative Navigation, python3 train.py --scenario 'cn' --prior-training-percentile 60 --lr 1e-2

For Predator Prey, python3 train.py --scenario 'pp' --prior-training-percentile 40 --lr 1e-3
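The `--prior-training-percentile` flag suggests that the prior network's training labels come from a percentile threshold over a per-pair relevance measure (in the paper, the causal effect of one agent's observation on another's policy, measured as a KL divergence). The following is a minimal sketch under that assumption; the function name and the specific inputs are illustrative, not the repo's actual API:

```python
import numpy as np

def prior_labels(causal_effects, percentile=60):
    """Label agent pairs for communication by thresholding causal effects.

    causal_effects: 1-D array of KL-divergence-style relevance scores,
        one per (agent, other-agent) pair in the batch (illustrative input).
    percentile: pairs at or above this percentile are labeled 1
        ("communicate"), the rest 0.
    """
    threshold = np.percentile(causal_effects, percentile)
    return (causal_effects >= threshold).astype(np.float32)

# Example: five hypothetical pair scores.
effects = np.array([0.02, 0.5, 0.1, 0.8, 0.05])
labels = prior_labels(effects, percentile=60)
# Only the pairs above the 60th-percentile score are labeled 1.
```

Under this reading, a higher `--prior-training-percentile` (e.g. 60 for Cooperative Navigation vs. 40 for Predator Prey) makes the learned prior more selective about when agents request communication.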

Citations

If you use this code, please cite our paper.

Ziluo Ding, Tiejun Huang, and Zongqing Lu. Learning Individually Inferred Communication for Multi-Agent Cooperation. NeurIPS'20.

@inproceedings{ding2020learning,
  title={Learning Individually Inferred Communication for Multi-Agent Cooperation},
  author={Ding, Ziluo and Huang, Tiejun and Lu, Zongqing},
  booktitle={NeurIPS},
  year={2020}
}

Acknowledgements

This code is developed based on the MADDPG source code by Ryan Lowe.