openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
https://arxiv.org/pdf/1706.02275.pdf
MIT License

The result is not that ideal like the paper showed #23

Open Jarvis-K opened 5 years ago

Jarvis-K commented 5 years ago

I just ran maddpg in simple_speaker_listener several times, but none of the runs reached the -20 average reward reported in the paper. Is there anything I should modify to get a better or more stable result?

4rzael commented 5 years ago

Looks like you're not the only one having trouble reproducing some results: #12

BolunDai0216 commented 5 years ago

I am getting -60 rewards. Is that normal when just running the code without any alterations?

KK666-AI commented 4 years ago

Also, with scenario=simple_speaker_listener, this code does not converge to the result reported in Fig. 4. Does anyone know what the problem is?
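
When comparing against a figure from the paper, it may help to report the mean and spread of the final reward over several runs rather than a single number. Below is a minimal sketch, assuming each run's per-interval average rewards have already been loaded into a plain Python list. `summarize_runs` and `last_n` are hypothetical names used for illustration and are not part of the maddpg code, and the numbers are placeholders, not measured results.

```python
from statistics import mean, stdev

def summarize_runs(runs, last_n=3):
    """Return (mean, std) of each run's average reward over its last `last_n` entries."""
    # Average the tail of each run so one noisy interval does not dominate.
    finals = [mean(run[-last_n:]) for run in runs]
    # Sample standard deviation needs at least two runs.
    spread = stdev(finals) if len(finals) > 1 else 0.0
    return mean(finals), spread

# Placeholder reward curves for illustration only (not real results).
toy_runs = [
    [-100.0, -80.0, -60.0, -60.0, -60.0],
    [-100.0, -90.0, -70.0, -70.0, -70.0],
]

avg, spread = summarize_runs(toy_runs)
print(f"final reward: {avg:.1f} +/- {spread:.1f} over {len(toy_runs)} runs")
```

Posting a summary like this alongside the scenario name and number of training episodes would make it easier to tell whether runs are unstable or consistently short of the reported value.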