yaringal / ConcreteDropout

Code for Concrete Dropout as presented in https://arxiv.org/abs/1705.07832
MIT License

Inconsistency of your code and paper. #1

Open zzd1992 opened 7 years ago

zzd1992 commented 7 years ago

I read your paper, Concrete Dropout, and found an inconsistency between your code and the paper. According to Eq. (3) of the paper, the regularizer on the kernel matrix should be proportional to 1-p, but in your code it is inversely proportional to 1-p:

kernel_regularizer = self.weight_regularizer * K.sum(K.square(weight)) / (1. - self.p)

I am not sure whether I misunderstand your paper or code.

yaringal commented 6 years ago

That's because we reparametrise Wz (with z~Bern(p)^K) as Wz/(1-p) so that it has mean W. K.square(weight) then picks up an extra factor of 1/(1-p)^2, which cancels against the 1-p and leaves 1/(1-p).
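
The cancellation can be checked numerically. The snippet below is only a minimal sketch, assuming the matrix regularized in Eq. (3) is the rescaled one, M = W/(1-p), where W plays the role of `weight` in the code. The names `M`, `paper_form`, `code_form`, and `sum_of_squares`, and all numeric values, are made up for illustration and are not taken from the repository.

```python
import math

# Hypothetical values, chosen only for illustration.
W = [[0.5, -1.2, 0.3], [2.0, 0.1, -0.7]]  # stands in for the layer's `weight`
p = 0.2                                   # stands in for `self.p`
weight_regularizer = 1e-3                 # stands in for `self.weight_regularizer`


def sum_of_squares(matrix):
    """Sum of squared entries, the plain-Python analogue of K.sum(K.square(...))."""
    return sum(v * v for row in matrix for v in row)


# Assumption: the paper's regularizer acts on the rescaled matrix M = W / (1 - p).
M = [[v / (1.0 - p) for v in row] for row in W]

# Form proportional to (1 - p), written in terms of M.
paper_form = weight_regularizer * (1.0 - p) * sum_of_squares(M)

# Form used in the code, written in terms of W.
code_form = weight_regularizer * sum_of_squares(W) / (1.0 - p)

forms_match = math.isclose(paper_form, code_form, rel_tol=1e-12)
print(forms_match)
```

Under that assumption the two expressions are the same quantity written in different variables: (1-p) * sum(M^2) = (1-p) * sum(W^2) / (1-p)^2 = sum(W^2) / (1-p).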

XinDongol commented 6 years ago

Could you please give some information on how to derive Eq. (3) in this paper?

Pran97 commented 5 years ago

Could you please give some information on how to derive Eq. (3) in this paper?

@XinDongol It's Proposition 1 in Appendix 1 of Dropout as a Bayesian Approximation (Gal's previous paper).