oxwhirl / pymarl

Python Multi-Agent Reinforcement Learning framework
Apache License 2.0
1.89k stars, 386 forks

What's the meaning of return_mean and return_std? / What's the function of rnn agent? #10

Closed: TimeBreaker closed this issue 5 years ago

TimeBreaker commented 5 years ago

Hi, thanks for this repo! I have been reading the source code of pymarl and I have a few questions.

In the output of the program, there are a few parameters like return_mean. I understand most of them, but I have trouble understanding return_mean and return_std. What is the meaning of return? (I guess it may be the computed value function.) And how do you calculate return, along with return_mean and return_std?

The other question is: why do we use an RNN agent? When I searched for the word rnn in this repo, I didn't find code showing how the RNN is used in training agents. And when we use an algorithm like qmix, does the system still use the RNN agent, or does it use a QMIX agent (for example, one that overrides the RNN agent)?

Thanks again for this repo!

tabzraz commented 5 years ago

return_mean is the total reward for an episode, averaged across the different environments that were run at that time. So "return" here means the sum of rewards collected in an episode, not a value-function estimate. return_std is then just the standard deviation of those episode returns across the same environments.
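As a rough illustration of the description above, here is a minimal sketch of how these two statistics could be computed from per-environment reward sequences. The function names and the sample rewards are hypothetical and not taken from pymarl; the sketch also assumes an undiscounted sum and a population standard deviation, which may differ from what the logger actually does.

```python
import statistics


def episode_return(rewards):
    """Return of one episode: the (undiscounted) sum of its rewards."""
    return sum(rewards)


def summarize_returns(reward_sequences):
    """Compute the mean and standard deviation of episode returns.

    reward_sequences holds one list of per-step rewards per environment.
    Assumption: population standard deviation (divide by N, not N - 1).
    """
    returns = [episode_return(rewards) for rewards in reward_sequences]
    return statistics.fmean(returns), statistics.pstdev(returns)


# Hypothetical rewards from three environments run in parallel.
rewards_per_env = [
    [0.0, 1.0, 2.0],       # episode return 3.0
    [1.0, 1.0],            # episode return 2.0
    [0.0, 0.0, 0.0, 4.0],  # episode return 4.0
]

return_mean, return_std = summarize_returns(rewards_per_env)
print(f"return_mean={return_mean:.3f} return_std={return_std:.3f}")
```

Episodes may have different lengths; only the per-episode totals enter the statistics.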

The agents are RNN agents by default; they are located here.

All the training code for a Q-Learning agent is located here.

QMIX uses whatever agents you specify (default is an RNN agent).
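To make the last point concrete, here is a toy sketch of selecting the agent class by a config key, with a recurrent agent as the default. Everything in it (AGENT_REGISTRY, build_agent, the two agent classes, the config keys) is hypothetical and only illustrates the idea that the learning algorithm and the agent architecture are chosen independently; it is not pymarl's actual code.

```python
class RecurrentAgent:
    """Hypothetical recurrent agent: carries a hidden state between steps."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.hidden = 0.0

    def step(self, obs):
        # Toy recurrence: the hidden state blends past and current observations.
        self.hidden = 0.5 * self.hidden + 0.5 * obs
        return [self.hidden * (a + 1) for a in range(self.n_actions)]


class FeedForwardAgent:
    """Hypothetical memoryless agent: output depends only on the current observation."""

    def __init__(self, n_actions):
        self.n_actions = n_actions

    def step(self, obs):
        return [obs * (a + 1) for a in range(self.n_actions)]


# Hypothetical registry mapping a config name to an agent class.
AGENT_REGISTRY = {"rnn": RecurrentAgent, "ff": FeedForwardAgent}


def build_agent(config):
    """Build the agent named in the config, falling back to the recurrent one."""
    agent_name = config.get("agent", "rnn")
    return AGENT_REGISTRY[agent_name](config["n_actions"])


default_agent = build_agent({"n_actions": 2})               # no agent specified
custom_agent = build_agent({"agent": "ff", "n_actions": 2})  # explicit choice

print(type(default_agent).__name__, type(custom_agent).__name__)
```

In this sketch a learner such as QMIX would only ever call `step` on whatever agent it was given, so swapping the agent does not require changing the learner.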