Gumbel softmax implementation error

Hi there,

Thanks a lot for your comment! For consistency, this implementation follows the convention used by most open-source implementations, such as:

the official Gumbel-Softmax implementation [1][2]
the official MADDPG implementation [3]
the PyTorch MADDPG implementation [4]
the official SQDDPG implementation [5]

In practice, with the network weights properly initialized, there is unlikely to be any material difference. To be precise, though, we should apply log_softmax to the network outputs, as you suggested.
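For illustration, here is a minimal NumPy sketch of Gumbel-Softmax sampling with log_softmax applied to the network outputs first. It is not the code from this repository or from the implementations listed above; the function names, the `normalize` flag, and the `eps` constant are assumptions made for this example.

```python
import numpy as np


def log_softmax(x, axis=-1):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    x = x - np.max(x, axis=axis, keepdims=True)
    return x - np.log(np.sum(np.exp(x), axis=axis, keepdims=True))


def sample_gumbel(shape, rng, eps=1e-20):
    # Gumbel(0, 1) noise via inverse transform sampling; eps avoids log(0).
    u = rng.uniform(0.0, 1.0, size=shape)
    return -np.log(-np.log(u + eps) + eps)


def gumbel_softmax(net_out, tau, rng, normalize=True):
    # normalize=True applies log_softmax to the network outputs first;
    # normalize=False feeds the raw outputs in directly (hypothetical flag).
    log_probs = log_softmax(net_out) if normalize else net_out
    y = (log_probs + sample_gumbel(net_out.shape, rng)) / tau
    # Stable softmax over the last axis.
    y = y - np.max(y, axis=-1, keepdims=True)
    e = np.exp(y)
    return e / np.sum(e, axis=-1, keepdims=True)


if __name__ == "__main__":
    net_out = np.random.default_rng(0).normal(size=(2, 4))
    sample = gumbel_softmax(net_out, tau=1.0, rng=np.random.default_rng(1))
    print(sample.sum(axis=-1))  # each row sums to 1 up to rounding
```

In this sketch, log_softmax only shifts each row by a constant, and the final softmax cancels that shift, so both settings of `normalize` return the same sample for the same noise when `net_out` holds raw unnormalized scores.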

[1] https://github.com/ericjang/gumbel-softmax/blob/master/Categorical%20VAE.ipynb, block 4
[2] https://github.com/ericjang/gumbel-softmax/blob/master/gumbel_softmax_vae_v2.ipynb, block 5
[3] https://github.com/openai/maddpg/blob/master/maddpg/trainer/maddpg.py#L45
[4] https://github.com/shariqiqbal2810/maddpg-pytorch/blob/master/algorithms/maddpg.py#L143
[5] https://github.com/hsvgbkhgbv/SQDDPG/blob/master/models/maddpg.py#L104
