shariqiqbal2810 / maddpg-pytorch

PyTorch Implementation of MADDPG (Lowe et al. 2017)
MIT License

Multi-discrete action spaces not working #26

Closed tessavdheiden closed 4 years ago

tessavdheiden commented 4 years ago

Hi!

I was looking into simple_reference.py, which has multiple (2) discrete action spaces (communication + movement). Unfortunately, the algorithm doesn't work with it.

Specifically this line breaks.

Best, Tessa

tessavdheiden commented 4 years ago

Hi,

If you want to fix it, maybe do this:

    else:
        discrete_action = True
        get_shape = lambda x: sum(x.high) + x.num_discrete_space

in maddpg.py, line

and: [acsp.shape[0] if isinstance(acsp, Box) else acsp.n if isinstance(acsp, Discrete) else sum(acsp.high) + acsp.num_discrete_space for acsp in env.action_space])

in main.py, line

tessavdheiden commented 4 years ago

My answer does not work, because here is how the environment converts this into an action.

tessavdheiden commented 4 years ago

Hi!

I am sorry for the spam. I figured out that, given how the environment handles the action, my suggestion DOES work.

So, finally, my suggestion (a bit more elegant): get_shape = lambda x: sum(x.high - x.low + 1)

and in main.py: sum(acsp.high - acsp.low + 1)
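To make the arithmetic concrete, here is a minimal sketch of the size calculation for each kind of action space. The `Box`, `Discrete` and `MultiDiscrete` classes below are hypothetical stand-ins (not the real gym or environment classes) so that the snippet runs on its own, and `action_dim` is an illustrative helper name, not a function from this repository. The example bounds (5 movement choices, 10 communication choices) are assumed for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Box:
    """Hypothetical stand-in for a continuous space; only `shape` is used."""
    shape: Tuple[int, ...]


@dataclass
class Discrete:
    """Hypothetical stand-in for a single discrete space with `n` choices."""
    n: int


@dataclass
class MultiDiscrete:
    """Hypothetical stand-in: one inclusive [low, high] range per sub-space."""
    low: List[int]
    high: List[int]


def action_dim(space) -> int:
    """Length of the flat action vector the policy has to output."""
    if isinstance(space, Box):
        return space.shape[0]
    if isinstance(space, Discrete):
        return space.n
    if isinstance(space, MultiDiscrete):
        # Each sub-space contributes (high - low + 1) entries.
        return sum(h - l + 1 for l, h in zip(space.low, space.high))
    raise TypeError(f"unsupported action space: {type(space).__name__}")


# Assumed example: movement in 0..4 and communication in 0..9.
spaces = [Box(shape=(2,)), Discrete(n=5), MultiDiscrete(low=[0, 0], high=[4, 9])]
dims = [action_dim(s) for s in spaces]
print(dims)
```

When every `low` is 0, this gives the same number as the earlier `sum(x.high) + x.num_discrete_space` suggestion, since each sub-space then adds `high + 1` entries.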

MADDPG does not allow an agent to speak and execute a movement action at the same time. The output is a single one-hot vector, and the environment converts this into either a movement or a message. This is not necessarily a problem, though, as it still works.
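The following sketch illustrates why a single one-hot output selects either a movement or a message, but not both. It assumes the environment slices the flat action vector into consecutive segments of size `high - low + 1`, one per sub-space; `split_action` is a hypothetical helper written for this illustration, not the environment's actual code, and the 5 + 10 layout is an assumed example.

```python
from typing import List, Sequence


def split_action(flat: Sequence[float], low: Sequence[int],
                 high: Sequence[int]) -> List[List[float]]:
    """Slice a flat action vector into one segment per discrete sub-space."""
    segments = []
    start = 0
    for l, h in zip(low, high):
        size = h - l + 1
        segments.append(list(flat[start:start + size]))
        start += size
    if start != len(flat):
        raise ValueError("flat action length does not match the space bounds")
    return segments


# Assumed layout: 5 movement entries followed by 10 communication entries.
flat = [0.0] * 15
flat[7] = 1.0  # the single "hot" entry falls inside the communication segment

movement, message = split_action(flat, low=[0, 0], high=[4, 9])
print(movement)  # all zeros: no movement is selected
print(message)   # one-hot at index 2: a message is selected
```

Under these assumptions, allowing both at once would need one categorical choice per segment rather than a single one-hot over the concatenated vector.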