The-Learning-And-Vision-Atelier-LAVA / SMSR

[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference

question about gumbel softmax #9

Open XiaoyuShi97 opened 3 years ago

XiaoyuShi97 commented 3 years ago

Hi, nice work! I am a bit confused about Gumbel softmax. Your paper says that Gumbel softmax is used during training. Could it be replaced by plain softmax (i.e. torch.softmax)? Could you please explain this design choice in more detail? Thx!

LongguangWang commented 3 years ago

Hi @btwbtm, thanks for your interest in our work. Softmax is also used in several network quantization and pruning methods to soften one-hot distributions. In my opinion, softmax may also work in our SMSR, but I have not tried it. In our experiments, Gumbel softmax is adopted because it is theoretically identical to a one-hot distribution, while softmax is not.
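To make the contrast above concrete, here is a minimal sketch comparing plain softmax with Gumbel softmax. It uses PyTorch's built-in `torch.nn.functional.gumbel_softmax`, not the function defined in this repository, and the logits and the temperature `tau` are made-up values chosen only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical logits for a 2-way choice at 4 positions (illustration only).
logits = torch.tensor([[2.0, 0.5],
                       [0.1, 0.3],
                       [-1.0, 1.0],
                       [0.0, 0.0]])

# Plain softmax is deterministic: the same logits always give the same
# soft probabilities, which are generally far from one-hot.
soft = torch.softmax(logits, dim=-1)

# Gumbel softmax adds Gumbel noise to the logits before the softmax, so each
# call draws a new sample. With a small temperature the samples tend to be
# close to one-hot vectors.
tau = 0.1  # assumed value, not taken from the SMSR training setup
sample_a = F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)
sample_b = F.gumbel_softmax(logits, tau=tau, hard=False, dim=-1)

# hard=True returns one-hot vectors in the forward pass while gradients are
# taken through the soft sample (straight-through estimator).
hard = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)

print("softmax:\n", soft)
print("gumbel softmax sample A:\n", sample_a)
print("gumbel softmax sample B:\n", sample_b)
print("gumbel softmax (hard):\n", hard)
```

Running it a few times without the fixed seed shows the practical difference: the softmax output never changes, while the Gumbel softmax samples vary between calls and concentrate on a single class as `tau` is lowered.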

wangqiim commented 2 years ago

> Hi @btwbtm, thanks for your interest in our work. Softmax is also used in several network quantization and pruning methods to soften one-hot distributions. In my opinion, softmax may also work in our SMSR, but I have not tried it. In our experiments, Gumbel softmax is adopted because it is theoretically identical to a one-hot distribution, while softmax is not.

https://github.com/The-Learning-And-Vision-Atelier-LAVA/SMSR/blob/daac49c9a107778c95e11a16fd5b4a8b45513678/model/smsr.py#L12-L21


I found that the implementation of Gumbel softmax in your code differs from the one in the original paper ("Categorical Reparameterization with Gumbel-Softmax"). Why did you modify it? Which one is better?
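For reference, the relaxation defined in the cited paper can be written as below. The symbols follow the paper's notation (class probabilities \pi_i, temperature \tau, k classes) and are not taken from `smsr.py`, so this is only the baseline to compare the linked lines against.

```latex
g_i = -\log\bigl(-\log u_i\bigr), \qquad u_i \sim \mathrm{Uniform}(0, 1)

y_i = \frac{\exp\bigl((\log \pi_i + g_i) / \tau\bigr)}
           {\sum_{j=1}^{k} \exp\bigl((\log \pi_j + g_j) / \tau\bigr)},
\qquad i = 1, \dots, k
```

As \tau approaches 0, the samples y approach one-hot vectors drawn from the categorical distribution with probabilities \pi.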