Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License

Same action in multi-agent environment #32

Closed lyp741 closed 5 years ago

lyp741 commented 5 years ago

Hello, and thank you for your contribution! I am a student, and I have recently been running a multi-agent program with your code, but I have run into a problem. I'm using Unity3D to simulate multi-robot experiments and send the observations (a camera image and readings from several sensors) to the Python script. When I feed the states to the network, the output actions are all the same. For example, we have 9 agents, and each agent can choose among 8 actions, {0,1,2,3,4,5,6,7}. When we feed the states to the network, every agent outputs the same action at each time step, such as {1,1,1,1,1,1,1,1,1} or {2,2,2,2,2,2,2,2,2}. Do you have any idea what might cause this?

Kaixhin commented 5 years ago

It's hard to tell exactly what you want to do, but one way to use this for multi-agent RL would be to run a separate instance for each agent and have the environment coordinate them. This code has only been set up to run with one agent per instance, so it's possible that it doesn't properly deal with what you're trying to do (though I'd also inspect the data to make sure you're getting separate operations). Even if you have set up the code correctly, it might simply have learned to do this, but that does seem a bit unlikely.
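As a rough way to do the data inspection mentioned above, something like the following sketch could help. It is not part of this repository: `check_batch` is a hypothetical helper, and it assumes you can collect the per-agent observations and the per-agent Q-values (one row per agent, one column per action) as arrays before taking the argmax.

```python
import numpy as np


def check_batch(observations, q_values, atol=1e-6):
    """Report whether per-agent inputs and outputs are actually distinct.

    observations: array-like of shape (num_agents, ...), one entry per agent
    q_values:     array-like of shape (num_agents, num_actions)
    Both names and shapes are assumptions made for this sketch.
    """
    obs = np.asarray(observations, dtype=np.float64)
    obs = obs.reshape(obs.shape[0], -1)  # flatten each agent's observation
    q = np.asarray(q_values, dtype=np.float64)

    # True if every agent's observation matches agent 0's within tolerance
    identical_obs = bool(np.all(np.abs(obs - obs[0]) <= atol))
    # True if every agent's Q-values match agent 0's within tolerance
    identical_q = bool(np.all(np.abs(q - q[0]) <= atol))

    # Greedy action for each agent
    actions = q.argmax(axis=1)

    return {
        "identical_observations": identical_obs,
        "identical_q_values": identical_q,
        "actions": actions.tolist(),
    }
```

If `identical_observations` comes back `True`, the problem is probably on the Unity side or in how the states are batched, since every agent is being fed the same input. If the observations differ but `identical_q_values` is `True`, the network is ignoring its input, which points at preprocessing or training rather than at the environment.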