-
Hi @HEmile, I tested the example in examples/vae/discrete_vae.py, but found that Gumbel-Softmax performs much better than REBAR, RELAX, and REINFORCE (test loss after 10 epochs: 98 for Gumbel, 165 for…
-
I read that you apply bivariate Gumbel sampling in your paper and use a generalized form of Gumbel-Softmax.
Gumbel-Softmax takes logits (log-probabilities) as input, while you directly use learned st…
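For what it's worth, here is a minimal NumPy sketch (not the repo's actual code; all names are illustrative) of why parameters learned as probabilities need a `log` before being fed to a Gumbel-Softmax sampler as logits:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """One relaxed categorical sample from unnormalized log-probabilities."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via the inverse-CDF trick.
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    y = y - y.max()  # numerical stabilization; softmax is shift-invariant
    e = np.exp(y)
    return e / e.sum()

# Parameters learned as probabilities must be mapped to log space first:
probs = np.array([0.7, 0.2, 0.1])
sample = gumbel_softmax_sample(np.log(probs), tau=0.5, rng=np.random.default_rng(0))
```

Passing the raw probabilities directly as logits would silently sample from a different (much flatter) distribution.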
-
```python
class TPS(M.Module):
    def __init__(self, variant='dTPS'):
        ...

    def forward(self, reserved, pruned, now_reserved_policy, now_pruned_policy):
        ...
        B, N, _ = reserve…
```
-
`logits_py` should be the log of the current `logits_py`, or we can simply pass it as `probs` rather than `logits` to the corresponding distribution.
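A quick self-contained check (plain NumPy, illustrative names) of why the two parameterizations are interchangeable:

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

p = np.array([0.5, 0.3, 0.2])
# softmax(log(p)) recovers p exactly, which is why `probs=p` and
# `logits=np.log(p)` parameterize the same categorical distribution.
recovered = softmax(np.log(p))
```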
-
Even after a longer run, the agents don't learn.
According to PressurePlate, the reward is in [-0.9, 0] if the agent is in the same room as its assigned plate, and in [-1, ..., -N] otherwise.
I tri…
-
Following the #548 discussion, and while we wait for discrete latent variables, it would be nice to have a Gumbel-Softmax categorical approximation as featured in Pyro. I didn't realize this was the name gi…
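A minimal NumPy sketch of the relaxed one-hot categorical (Concrete) sampler being requested, just to illustrate the temperature behavior; this is not Pyro's implementation:

```python
import numpy as np

def relaxed_one_hot(logits, tau, rng):
    """Sample from the Concrete / relaxed one-hot categorical distribution."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    y = y - y.max()
    e = np.exp(y)
    return e / e.sum()

logits = np.log(np.array([0.6, 0.3, 0.1]))
# Same Gumbel noise at two temperatures: low tau pushes the sample toward
# a one-hot vector, high tau flattens it toward the uniform distribution.
hot = relaxed_one_hot(logits, tau=0.01, rng=np.random.default_rng(0))
soft = relaxed_one_hot(logits, tau=5.0, rng=np.random.default_rng(0))
```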
-
What TensorFlow version is needed to run the MNIST example? Thank you!
-
Thanks for your documentation and transparency here.
Quick question: in `sample_from_softmax(logits, disallow=None)`, you return:
`tf.one_hot(tf.argmax(tf.nn.softmax(logits + gumbel_noise), -1, …`
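For context on that line: softmax is strictly increasing elementwise, so the inner `tf.nn.softmax` cannot change the result of the `argmax`. A small NumPy check of that claim (illustrative only, not the project's code):

```python
import numpy as np

rng = np.random.default_rng(42)
logits = rng.normal(size=(4, 10))
gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# softmax preserves the ordering within each row,
# so it never changes which index is largest:
with_softmax = np.argmax(softmax(logits + gumbel), axis=-1)
without_softmax = np.argmax(logits + gumbel, axis=-1)
```

So the softmax there appears redundant for sampling, though it may be kept for readability or reuse of the probabilities elsewhere.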
-
Hi Shariq,
In your code you update the value function with actions computed by:
1) [gumbel_softmax](https://github.com/shariqiqbal2810/maddpg-pytorch/blob/40388d7c18e4662cf23c826d97e209df9003d86c/…
-
### 📚 The doc issue
[Gumbel-Softmax documentation](https://pytorch.org/docs/stable/generated/torch.nn.functional.gumbel_softmax.html) states that the ``logits`` argument should be unnormalized. Howev…
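A small NumPy check (not the PyTorch implementation; names are illustrative) of why normalized and unnormalized logits give identical Gumbel-Softmax samples: the two differ only by an additive constant, which softmax ignores.

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(1)
raw = np.array([2.0, 0.5, -1.0])          # unnormalized scores
norm = raw - np.log(np.exp(raw).sum())    # log-softmax: normalized log-probs
g = -np.log(-np.log(rng.uniform(size=3))) # shared Gumbel noise

# Shifting logits by a constant shifts (logits + g) / tau by a constant,
# and softmax is shift-invariant, so the relaxed samples coincide exactly:
tau = 0.5
same = np.allclose(softmax((raw + g) / tau), softmax((norm + g) / tau))
```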