A Question about the hyperparameters and MARL

eugenevinitsky / sequential_social_dilemma_games

Repo for reproduction of sequential social dilemmas

MIT License

384 stars 134 forks

A Question about the hyperparameters and MARL #148

Closed by yurunsheng1 5 years ago

yurunsheng1 commented 5 years ago

Hi, I am wondering whether the Inequity Aversion method and the Social Influence method need an extensive hyperparameter sweep to reach a good score.

Also, have you tried any centralized multi-agent RL methods (such as VDN or MADDPG) in the social dilemma environments? Do they work well?

Thank you!

joonleesky commented 5 years ago

Actually, I've tried implementing the Inequity Aversion method with the environments provided by eugenevinitsky, and it works reasonably well without an extensive hyperparameter sweep. However, the default hyperparameter settings provided in train_baseline.py were not good enough.

I have not tried the other methods yet.
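For context, here is a minimal sketch of the reward shaping behind the Inequity Aversion method, following the formulation in Hughes et al. (2018): each agent's extrinsic reward is reduced by penalties for disadvantageous and advantageous inequity, computed over temporally smoothed rewards. This is not code from this repo or from train_baseline.py; the function name `inequity_averse_rewards` and the values of `alpha`, `beta`, and `decay` are hypothetical placeholders chosen for illustration, and they are exactly the kind of hyperparameters one would likely need to tune.

```python
from typing import List, Tuple


def inequity_averse_rewards(
    rewards: List[float],
    smoothed: List[float],
    alpha: float = 5.0,    # weight on disadvantageous inequity (illustrative value)
    beta: float = 0.05,    # weight on advantageous inequity (illustrative value)
    decay: float = 0.975,  # smoothing factor for past rewards (illustrative value)
) -> Tuple[List[float], List[float]]:
    """Return (shaped_rewards, updated_smoothed_rewards) for one time step."""
    n = len(rewards)
    if n < 2:
        raise ValueError("inequity aversion needs at least two agents")

    # Temporally smoothed rewards: e_j <- decay * e_j + r_j
    new_smoothed = [decay * e + r for e, r in zip(smoothed, rewards)]

    shaped = []
    for i in range(n):
        # Penalty for other agents doing better than agent i.
        disadvantageous = sum(
            max(new_smoothed[j] - new_smoothed[i], 0.0) for j in range(n) if j != i
        )
        # Penalty for agent i doing better than other agents.
        advantageous = sum(
            max(new_smoothed[i] - new_smoothed[j], 0.0) for j in range(n) if j != i
        )
        shaped.append(
            rewards[i]
            - alpha / (n - 1) * disadvantageous
            - beta / (n - 1) * advantageous
        )
    return shaped, new_smoothed


# Two agents; only agent 0 receives a reward on this step.
shaped, smoothed = inequity_averse_rewards(
    [1.0, 0.0], [0.0, 0.0], alpha=1.0, beta=0.5, decay=0.9
)
```

In this example agent 0 is penalized for being ahead and agent 1 for being behind, so both shaped rewards fall below the raw rewards. Each agent computes its own shaped reward, so the method remains decentralized at training time apart from observing the other agents' rewards.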

eugenevinitsky commented 5 years ago

Thanks for your input @joonleesky! As for us, we have not tried other methods for these environments; we primarily implemented them to test out the causal reward, and that is fully decentralized whereas MADDPG and VDN are not. Personally, I would love to see the results of using those algorithms on these envs!

yurunsheng1 commented 5 years ago

Thank you so much! @joonleesky @eugenevinitsky