Question about the Gumbel Softmax code.

icandle / CAMixerSR

CAMixerSR: Only Details Need More “Attention” (CVPR 2024)

https://arxiv.org/abs/2402.19289

Apache License 2.0


Question about the Gumbel Softmax code. #27

Closed csguoh closed 2 weeks ago

csguoh commented 2 months ago

Hi, authors.

Sorry to bother you. I tried your code and found that the one-hot vector is generated by the pipeline linear --> softmax --> F.gumbel_softmax. In the DynamicViT implementation, however, the one-hot vector is generated by linear --> log-softmax --> F.gumbel_softmax. Is there a meaningful difference between the two, and does the choice affect performance?

Thx.

icandle commented 2 months ago

We actually tried both implementations. Softmax performs slightly better than log-softmax in super-resolution (SR) tasks.
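
For readers comparing the two pipelines, here is a minimal sketch of how they differ in the distribution the hard (one-hot) sample is drawn from. It relies on the standard Gumbel-max property that adding Gumbel noise to logits `z` and taking the argmax selects class `i` with probability `softmax(z)[i]`, and on `F.gumbel_softmax` treating its input as unnormalized log-probabilities. The scores in `raw_scores` are hypothetical, and the sketch is plain Python rather than code from either repository. It does not explain the result reported above; it only shows that the two variants are not equivalent samplers.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


# Hypothetical raw scores from the linear layer for a two-way decision.
raw_scores = [2.0, -1.0]

# Variant A: linear --> log-softmax --> Gumbel sampling.
# log-softmax only subtracts a constant, so the sampled distribution
# is softmax(raw_scores) itself.
via_log_softmax = softmax(log_softmax(raw_scores))

# Variant B: linear --> softmax --> Gumbel sampling.
# Probabilities in [0, 1] are reused as logits, so the sampled
# distribution is softmax(softmax(raw_scores)), which is flatter.
via_softmax = softmax(softmax(raw_scores))

print("sampling probabilities, log-softmax variant:", via_log_softmax)
print("sampling probabilities, softmax variant:    ", via_softmax)
```

Under these assumptions, the softmax variant keeps the same preferred class but samples it less confidently, since its logits can differ by at most 1. Whether that extra randomness helps or hurts is an empirical question for the task at hand.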