openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
https://arxiv.org/pdf/1706.02275.pdf
MIT License

Question about the way how to update actor #65

Open choasLC opened 2 years ago

choasLC commented 2 years ago

I don't quite follow how you update the actor. From my understanding, the chain rule is required to compute the gradient with respect to the actor's parameters, right? But I did not see it applied anywhere in your training code. I may be wrong, but if you could give me a hint, that would be wonderful!

chengdusunny commented 2 years ago

Hello, I am not a contributor to this project, but maybe I can help with this problem. The actor network and the Q network are updated together by the call loss = agent.update(trainers, train_step) in train.py. It is very convenient to use, since the authors have done this for us; all we need to do is call the function.
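On the original chain-rule question: the actor loss is built as J(θ) = -Q(s, μ_θ(s)), so when an autodiff framework (TensorFlow, in this repo's case) differentiates that loss, the chain rule through the critic is applied automatically; no explicit chain-rule code is needed. Here is a minimal, self-contained sketch (a toy scalar actor and critic, not the repo's actual networks) that writes that chain rule out by hand and checks it against a finite-difference estimate:

```python
# Toy example of the deterministic actor gradient. The actor loss is
# J(theta) = -Q(s, mu_theta(s)); its gradient composes dQ/da with
# dmu/dtheta -- exactly the chain-rule step that autodiff performs
# for you inside agent.update(). All functions below are hypothetical
# illustrations, not code from this repository.

def mu(theta, s):
    # toy deterministic actor: a = theta * s
    return theta * s

def Q(s, a):
    # toy critic, peaked at a = 2*s
    return -(a - 2.0 * s) ** 2

def actor_loss(theta, s):
    # J(theta) = -Q(s, mu_theta(s))
    return -Q(s, mu(theta, s))

def actor_grad(theta, s):
    # chain rule by hand: dJ/dtheta = -dQ/da * dmu/dtheta
    a = mu(theta, s)
    dQ_da = -2.0 * (a - 2.0 * s)
    dmu_dtheta = s
    return -dQ_da * dmu_dtheta

theta, s, eps = 0.5, 1.5, 1e-6
numeric = (actor_loss(theta + eps, s) - actor_loss(theta - eps, s)) / (2 * eps)
print(abs(actor_grad(theta, s) - numeric) < 1e-4)  # → True
```

The analytic gradient matches the numerical one, which is the same composition the framework carries out when the actor's action is fed into the critic inside the loss graph.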