yaringal / ConcreteDropout

Code for Concrete Dropout as presented in https://arxiv.org/abs/1705.07832
MIT License

Why are weights upscaled by 1/(1-p) after concrete dropout? #16

Open heechanlee opened 3 years ago

heechanlee commented 3 years ago

https://github.com/yaringal/ConcreteDropout/issues/3#issuecomment-352718724

I can't find in the paper why the weights are upscaled by 1/(1-p) after concrete dropout. Can anyone explain why?

heechanlee commented 3 years ago

https://github.com/yaringal/ConcreteDropout/issues/1#issuecomment-337313695

From the comment above, I guess it is to make the mean of the weights after dropout equal to W. Is there a reference for why we should do that?
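
To make the guess concrete, here is a minimal sketch of the mean-preserving argument. It assumes a plain Bernoulli mask rather than the relaxed (concrete) mask used in this repo, so it only illustrates the idea; the function names `inverted_dropout` and `mean_after_dropout` and the values in `W` are hypothetical. If each weight is kept with probability 1-p, the mask has mean 1-p, so dividing the kept weights by 1-p should bring the average back to W.

```python
import random


def inverted_dropout(weights, p, rng):
    """Zero each weight with probability p; rescale survivors by 1/(1-p).

    Assumption: a hard Bernoulli mask, not the concrete relaxation.
    """
    keep = 1.0 - p
    return [w / keep if rng.random() < keep else 0.0 for w in weights]


def mean_after_dropout(weights, p, n_samples=20000, seed=0):
    """Monte Carlo estimate of the per-weight mean after rescaled dropout."""
    rng = random.Random(seed)
    sums = [0.0] * len(weights)
    for _ in range(n_samples):
        for i, w in enumerate(inverted_dropout(weights, p, rng)):
            sums[i] += w
    return [s / n_samples for s in sums]


# Hypothetical weights, chosen only for illustration.
W = [0.5, -1.0, 2.0]
means = mean_after_dropout(W, p=0.2)

for w, m in zip(W, means):
    print(f"original weight {w:+.2f}, estimated mean after dropout {m:+.4f}")
```

Without the division by 1-p, the same estimate would come out close to (1-p) * W, which is the mismatch the rescaling appears to be correcting for. With the concrete relaxation the mask is continuous, so I would expect the match to be only approximate.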