agakshat / maddpg

Implementation of Multi-Agent Deep Deterministic Policy Gradients

What should probs be in Line 42 of actorcritic.py #1

Closed namidairo777 closed 6 years ago

namidairo777 commented 6 years ago

out = tf.contrib.distributions.RelaxedOneHotCategorical(0.1,probs=out)

agakshat commented 6 years ago

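Hi @namidairo777, this repo is still a work in progress; I hope to have it running soon. About the RelaxedOneHotCategorical distribution: probs is supposed to be the prior probability of each class, i.e. the class probabilities of the categorical distribution that the Gumbel-Softmax estimator relaxes. The first argument (0.1 here) is the temperature.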
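To make the role of probs concrete, here is a minimal pure-Python sketch of how a Gumbel-Softmax sample can be drawn from class probabilities and a temperature. It is not the TensorFlow implementation; the function name relaxed_one_hot_sample and the example probabilities are made up for illustration.

```python
import math
import random


def relaxed_one_hot_sample(probs, temperature, rng=random):
    """Draw one Gumbel-Softmax sample (hypothetical helper, not a TF API).

    probs: strictly positive class probabilities (the "prior" of each class).
    temperature: positive float; smaller values give samples closer to one-hot.
    """
    logits = []
    for p in probs:
        # Clamp u away from 0 and 1 so both logarithms are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        logits.append((math.log(p) + gumbel) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed example: three actions with prior probabilities 0.7, 0.2, 0.1.
    print(relaxed_one_hot_sample([0.7, 0.2, 0.1], temperature=0.1, rng=rng))
```

Each sample is a vector on the probability simplex, and its largest entry falls on class i with probability probs[i]. Note that in the line quoted above, the call constructs a distribution object; drawing an actual relaxed one-hot vector from it would additionally need its sample() method.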

namidairo777 commented 6 years ago

Sorry for the late reply! Thank you so much for answering me. I got a lot of inspiration for implementing MADDPG from your repo!

agakshat commented 6 years ago

Glad to hear it. I'm closing this issue for now.