Why ‘u_action_space = spaces.Discrete(world.dim_p * 2 + 1)’？

openai / maddpg

Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

https://arxiv.org/pdf/1706.02275.pdf

MIT License


Why ‘u_action_space = spaces.Discrete(world.dim_p * 2 + 1)’？ #66

Open ScorpioPeng opened 2 years ago

ScorpioPeng commented 2 years ago

Hi, I don't understand 'u_action_space = spaces.Discrete(world.dim_p * 2 + 1)'. I know that action[0] is the communication, but why does dim_p need to be multiplied by 2?

benalcazardiego commented 2 years ago

I second this question

benalcazardiego commented 2 years ago

I think they reduce movement to small steps in simple directions, so the number of moves is 2 times the dimension: one step in the negative direction and one in the positive direction along each axis. In a 2D world you can move left, right, up, or down; in a 3D world it would be ±x, ±y, ±z. I assume the extra action (the + 1) is to stay still.
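To make that counting concrete, here is a minimal sketch of how dim_p * 2 + 1 discrete actions could map onto movement directions. The helper names (num_discrete_actions, discrete_to_direction) are hypothetical, and the index ordering (0 for no movement, then negative and positive steps per axis) is an assumption for illustration, not something confirmed in this thread.

```python
def num_discrete_actions(dim_p):
    """Two directions (negative, positive) per axis, plus one no-move action."""
    return dim_p * 2 + 1


def discrete_to_direction(action_index, dim_p):
    """Map a discrete action index to a unit step along one axis.

    Assumed ordering (hypothetical): 0 = stay still,
    1 = -axis0, 2 = +axis0, 3 = -axis1, 4 = +axis1, and so on.
    """
    if not 0 <= action_index < num_discrete_actions(dim_p):
        raise ValueError("action index out of range")
    direction = [0.0] * dim_p
    if action_index == 0:
        return direction  # the extra "+ 1" action: no movement
    axis = (action_index - 1) // 2
    sign = -1.0 if (action_index - 1) % 2 == 0 else 1.0
    direction[axis] = sign
    return direction


if __name__ == "__main__":
    # In a 2D world (dim_p = 2) this lists 5 actions.
    for a in range(num_discrete_actions(2)):
        print(a, discrete_to_direction(a, 2))
```

With dim_p = 2 this gives 5 actions (stay, left, right, down, up under the assumed ordering), and with dim_p = 3 it gives 7.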