karpathy / deep-vector-quantization

VQVAEs, GumbelSoftmaxes and friends
MIT License

About the input of F.gumbel_softmax #7

Open ZikangZhou opened 3 years ago

ZikangZhou commented 3 years ago

From my understanding, the input of F.gumbel_softmax (i.e., the logits parameter) should be the $\log$ of a discrete distribution. However, I didn't see any softmax or log_softmax before the gumbel_softmax. It seems you're treating the output of self.proj as log-probabilities with a range of $(-\infty, \infty)$, which would imply that the probabilities of the discrete distribution lie in $(0, \infty)$.

I'm curious why you don't use softmax to normalize things into (0, 1) and make them sum to 1. Does the mathematics still make sense without normalizing?
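For reference, here is a minimal sketch of the pattern I'm referring to (the layer sizes and variable names are illustrative, not the repo's exact code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative only: a linear projection produces unconstrained scores in
# (-inf, inf), which are passed straight to F.gumbel_softmax without any
# softmax/log_softmax in between.
proj = nn.Linear(64, 512)           # hidden dim -> codebook size (made-up sizes)
logits = proj(torch.randn(8, 64))   # shape (batch, 512), unnormalized
one_hot = F.gumbel_softmax(logits, tau=1.0, hard=True)  # (batch, 512), one-hot rows
```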

function2-llx commented 1 year ago

@ZikangZhou Hi, although this is an old thread, I want to share my thoughts here since there's still an open issue pointing to it. I hope this doesn't cause any inconvenience.

According to the documentation of F.gumbel_softmax, the logits parameter represents "unnormalized log probabilities". The term "unnormalized" here likely means that the logits have not been uniformly shifted so that their exponentials sum to 1 (e.g., by subtracting the logsumexp of all components). This normalization step doesn't affect the result of softmax, because any uniform shift cancels out in the softmax calculation by definition: if we add $a$ to logits $(x, y)$, the softmax result is unchanged, $\frac{e^{x+a}}{e^{x+a}+e^{y+a}} = \frac{e^x}{e^x+e^y}$. Therefore, I believe the authors' usage of F.gumbel_softmax is actually appropriate.
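To make this concrete, here's a small check (illustrative code, not from the repo) showing that log_softmax is just a uniform per-row shift of the logits, and that softmax, with or without Gumbel noise added, is unaffected by that shift:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)

# "Normalizing" with log_softmax subtracts logsumexp, a per-row constant shift.
normalized = F.log_softmax(logits, dim=-1)
shift = torch.logsumexp(logits, dim=-1, keepdim=True)
print(torch.allclose(normalized, logits - shift))           # True

# Softmax is invariant to that shift, so both give the same distribution.
print(torch.allclose(F.softmax(logits, dim=-1),
                     F.softmax(normalized, dim=-1)))         # True

# Gumbel-softmax adds noise and then applies softmax, so the same cancellation
# holds there too (using the same noise sample for both inputs).
gumbel = -torch.log(-torch.log(torch.rand_like(logits)))
print(torch.allclose(F.softmax(logits + gumbel, dim=-1),
                     F.softmax(normalized + gumbel, dim=-1)))  # True
```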

You may also check PyTorch's implementation of F.gumbel_softmax to confirm.
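For illustration, the soft case essentially amounts to adding Gumbel noise to the logits and applying a temperature-scaled softmax. Below is a simplified sketch (not PyTorch's exact internals, which also handle the hard/straight-through case), which makes clear why an unnormalized input is fine:

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sketch(logits, tau=1.0):
    """Simplified sketch of the soft Gumbel-softmax (not PyTorch's exact code)."""
    # Sample standard Gumbel noise: equivalent to -log(-log(U)) with U ~ Uniform(0, 1).
    gumbels = -torch.empty_like(logits).exponential_().log()
    # Add the noise and apply a temperature-scaled softmax.
    # Any constant added uniformly to `logits` cancels in the softmax,
    # which is why unnormalized logits are acceptable as input.
    return F.softmax((logits + gumbels) / tau, dim=-1)

logits = torch.randn(2, 5)
print(gumbel_softmax_sketch(logits).sum(dim=-1))  # each row sums to 1
```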