openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
https://arxiv.org/pdf/1706.02275.pdf
MIT License
1.6k stars, 484 forks

How does MADDPG update the actor? #16

Closed: newbieyxy closed this issue 6 years ago

newbieyxy commented 6 years ago

Excuse me, I have a question about a detail in maddpg.py. In the function update(), when training the p network, why are the observations and actions taken from the replay buffer, instead of using the observations together with the corresponding actions computed by the actor, i.e., obs_n and act(*obs_n)?
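To make the question concrete, here is a minimal sketch of the two alternatives being compared. It is not the repository's code: the toy linear actors, the helper names (`act`, `concat_rows`, `critic_input_from_buffer`, `critic_input_from_actors`), and all sizes are hypothetical and chosen only for illustration. It shows only how the input to a centralized critic would differ between the two options, not which one maddpg.py actually uses.

```python
import math
import random

random.seed(0)
n_agents, batch, obs_dim, act_dim = 3, 4, 5, 2

# Hypothetical toy actors: one obs_dim x act_dim weight matrix per agent
# (stand-ins for the p networks, not the repository's models).
actor_weights = [
    [[random.gauss(0, 1) for _ in range(act_dim)] for _ in range(obs_dim)]
    for _ in range(n_agents)
]

def act(agent, obs_batch):
    """Toy deterministic policy: tanh of a linear map of the agent's own observation."""
    w = actor_weights[agent]
    return [
        [math.tanh(sum(o[k] * w[k][j] for k in range(obs_dim))) for j in range(act_dim)]
        for o in obs_batch
    ]

# A sampled minibatch: per-agent observations and the actions stored at collection time.
obs_n = [
    [[random.gauss(0, 1) for _ in range(obs_dim)] for _ in range(batch)]
    for _ in range(n_agents)
]
buffer_act_n = [
    [[random.uniform(-1, 1) for _ in range(act_dim)] for _ in range(batch)]
    for _ in range(n_agents)
]

def concat_rows(parts):
    """Concatenate per-agent batches along the feature axis, row by row."""
    return [sum((part[b] for part in parts), []) for b in range(batch)]

def critic_input_from_buffer(obs_n, act_n):
    """Option A: stored observations together with stored actions."""
    return concat_rows(obs_n + act_n)

def critic_input_from_actors(obs_n):
    """Option B: stored observations, every action recomputed by the current actors."""
    fresh_act_n = [act(i, obs) for i, obs in enumerate(obs_n)]
    return concat_rows(obs_n + fresh_act_n)

q_in_a = critic_input_from_buffer(obs_n, buffer_act_n)
q_in_b = critic_input_from_actors(obs_n)
```

In both cases the observation part of the critic input is identical; only the action part differs, which is the part the question is about.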