What QMIX code are you running?
Thanks for the fast response. I am using this code: https://github.com/oxwhirl/smac/blob/master/smac/examples/rllib/run_qmix.py
Have you tried using oxwhirl/pymarl? That's the codebase the results of our paper are based on. The RLlib examples here are provided by our friends from Berkeley, who integrated QMIX with their RLlib library. Therefore, there are likely to be differences in results.
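For reference, the pymarl README documents an invocation along the lines of `python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=2s3z` for running QMIX on a SMAC map (the map name here is just an example); see the oxwhirl/pymarl repository for the exact, up-to-date command.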
Wow, that looks good. I haven't tried it yet, but I will. Are there any known issues with the QMIX implementation in RLlib? @richardliaw
There are some notable differences. For example, the RLlib QMIX code doesn't use the global state that SMAC provides and instead uses only per-agent observations.
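To illustrate the distinction being discussed, here is a minimal sketch (not from the thread) of how SMAC exposes both per-agent observations and a centralised global state; the mixing network in the original QMIX setup conditions on the latter, whereas the comment above says the RLlib example only uses the former. The map name is just an example.

```python
from smac.env import StarCraft2Env

# Minimal sketch: inspect what SMAC provides.
env = StarCraft2Env(map_name="3m")  # example map
env.reset()

obs = env.get_obs()      # list of per-agent observation vectors (decentralised)
state = env.get_state()  # single global state vector (used by the QMIX mixer in pymarl)

print("number of agents:", len(obs))
print("per-agent obs size:", env.get_obs_size())
print("global state size:", env.get_state_size())

env.close()
```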
Thanks, that's a big difference. I think that is likely why QMIX from RLlib fails to learn a good policy.
No problem. Let us know should you face any issues.
I ran the example QMIX code and it seems that it cannot learn a good policy; the reward is close to that of a random policy.