KuoZhong opened 4 years ago
We use maddpg_o/experiments/compete.py for evaluation. As for your result, my guess is that you did not add the --no-wheel option to remove the reward shaping.
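For reference, the evaluation run with shaping removed might look like the following. This is only a sketch: everything except the script path and the --no-wheel flag mentioned above is a hypothetical placeholder, and the actual arguments should be checked against the repository's README.

```shell
# Sketch: evaluate with compete.py, disabling the reward shaping.
# All flags other than --no-wheel are illustrative assumptions.
python maddpg_o/experiments/compete.py --no-wheel
```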
Thank you for your advice. With the --no-wheel option, I get comparable average returns for the first two stages of the adversary battle game. But at the last stage, where there are 32 agents, I get 144.1929, which is still not consistent with your result.
Could you please tell me the reason for this? Should I also add the --no-wheel option during the selection stage?
Thank you.
This work inspires me a lot. I tried running the code and found that the empirical results it produces are not consistent with the last column of D2 on the adversary battle game. In my runs, I get 18.39 for 4-4 and 52.21 for 8-8, which is a large performance gap. Since I have not modified the code, the problem may lie in the evaluation settings or something else. I will list the points that may be relevant below:
Thank you so much.