-
Are there any code examples for the application of this kind of VAE to semi-supervised classification, as described in section 4.3 of the Gumbel-Softmax paper?
I understand the technical details in t…
-
Are there any demos that can illustrate the process of using MATD3 / MADDPG to process discrete actions?
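Not a full MADDPG demo, but one common way to get discrete actions out of a continuous-action actor is to treat the actor outputs as logits and draw a Gumbel-Softmax sample. Below is a minimal, dependency-free sketch of just that sampling step. The function name `gumbel_softmax_sample` and its parameters are hypothetical and not taken from any particular MATD3 / MADDPG codebase.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return (soft, hard): a relaxed one-hot sample and its argmax one-hot.

    Hypothetical helper for illustration only; `rng` is anything with a
    .random() method returning a float in [0, 1).
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Hard one-hot action: the index with the largest relaxed probability.
    idx = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == idx else 0.0 for i in range(len(soft))]
    return soft, hard


if __name__ == "__main__":
    soft, hard = gumbel_softmax_sample([1.0, 0.5, -1.0], temperature=0.5)
    print("relaxed sample:", soft)
    print("discrete action (one-hot):", hard)
```

In an autodiff framework, the usual straight-through variant sends `hard` to the environment and lets gradients flow through `soft`. That part is not shown here, since it depends on the framework in use.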
-
Hello, this is brilliant work. I want to use the binary Gumbel-Softmax for my work, but I have run into some problems.
I used the soft mask for the first layer only (just apply the generated mask to the fe…
-
`
class TPS(M.Module):
    def __init__(self, variant='dTPS'):
        ...
    def forward(self, reserved, pruned, now_reserved_policy, now_pruned_policy):
        ...
        B, N, _ = reserve…
-
`StatsBase` has several options for [Weight Vectors](https://juliastats.github.io/StatsBase.jl/stable/weights/#Weight-Vectors-1). In many applications, we prefer to work in terms of (up-to-additive-co…
-
Even after a bigger run, agents don't learn:
According to pressureplate, the reward is in [-0.9, 0] if the agent is in the same room as its assigned plate, and in [-1, ..., -N] otherwise.
I tri…
-
This is the result from
`vae = DiscreteVAE(
image_size = 128,
num_layers = 2, # number of downsamples - ex. 256 / (2 ** 3) = (32 x 32 feature map)
num_tokens = 8192, # nu…
-
Hello, I found that the `Phi-3.5-MoE` model uses the `sparsemixer()` function to select the top-k experts and compute the weights, but I couldn't find this function's implementation in the code. Could you give me …
-
Hi @thomasverelst
Congrats, nice work! I have two questions out of curiosity:
1) Forward pass: Why did you choose to sample from the Bernoulli distribution instead of the Gumbel-softmax? To my …
-
Did anyone reproduce the results by training the model using this code?
It looks like there are a few bugs and discrepancies between the code and the paper. At least, in the paper, there is an argmax ops…