xuehy / pytorch-maddpg

A pytorch implementation of MADDPG (multi-agent deep deterministic policy gradient)

#+TITLE: An implementation of MADDPG

#+AUTHOR: xuehy

#+EMAIL: hyxue@outlook.com

#+STARTUP: content

This is a pytorch implementation of [[https://arxiv.org/abs/1706.02275][multi-agent deep deterministic policy gradient algorithm]].
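
For orientation, the core update of the linked paper can be written as follows. This restates the paper's deterministic policy gradient for agent $i$, not anything specific to this repository's code. The symbols follow the paper: $\mu_i$ is the policy of agent $i$ with parameters $\theta_i$, $o_i$ is its observation, $\mathbf{x}$ is the joint state information, $Q_i^{\mu}$ is the centralized critic, and $\mathcal{D}$ is the replay buffer.

\begin{equation}
\nabla_{\theta_i} J(\mu_i) = \mathbb{E}_{\mathbf{x}, a \sim \mathcal{D}} \left[ \nabla_{\theta_i} \mu_i(a_i \mid o_i) \, \nabla_{a_i} Q_i^{\mu}(\mathbf{x}, a_1, \ldots, a_N) \Big|_{a_i = \mu_i(o_i)} \right]
\end{equation}

Each critic sees the actions of all $N$ agents during training, while each actor acts on its own observation only.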

The experimental environment is a modified version of Waterworld based on [[https://github.com/sisl/MADRL][MADRL]].

The main features (different from MADRL) of the modified Waterworld environment are:

If scene rendering is enabled, it is recommended to install =opencv= through [[https://github.com/conda-forge/opencv-feedstock][conda-forge]].

** two agents, cooperation = 2

The two agents need to cooperate to capture the food, which gives a reward of 10.
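
The cooperation rule can be illustrated with a minimal sketch. This is not the environment's actual code: the helper name =food_rewards=, the 2D tuple positions, the =touch_radius= parameter, and the choice to give the reward to every touching agent are assumptions made for illustration. Only the reward value of 10 and the =cooperation= threshold come from the description above.

```python
import math

FOOD_REWARD = 10.0  # reward value stated in the text


def food_rewards(agent_positions, food_position, cooperation=2, touch_radius=0.1):
    """Hypothetical helper: return one reward per agent for a single food item.

    The food counts as captured only if at least `cooperation` agents are
    within `touch_radius` of it at the same time (assumed rule).
    """
    # Which agents are currently touching the food?
    touching = [
        math.dist(pos, food_position) <= touch_radius for pos in agent_positions
    ]
    # Capture requires enough simultaneous touches.
    captured = sum(touching) >= cooperation
    # Assumption: every touching agent receives the full reward on capture.
    return [FOOD_REWARD if (captured and t) else 0.0 for t in touching]
```

With =cooperation=2=, a single agent touching the food earns nothing. With =cooperation=1=, the same call rewards that agent alone, which corresponds to the one-agent setting.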

[[PNG/demo.gif]]

[[PNG/3.png]]

the average

[[PNG/4.png]]

** one agent, cooperation = 1

[[PNG/newplot.png]]