danielkelshaw / ConcreteDropout

PyTorch implementation of 'Concrete Dropout'
https://arxiv.org/abs/1705.07832
MIT License

Weight regulariser and dropout regulariser #14

Open axel971 opened 3 years ago

axel971 commented 3 years ago

Hi! I read Yarin Gal's paper, but I did not understand how the weight regulariser and dropout regulariser are initialised. The authors provide a formula, but it is not very clear (e.g. what does "prior length scale" mean, and what value should be assigned to it?). Could you explain how you found the values used to initialise the weight regulariser and the dropout regulariser?

danielkelshaw commented 3 years ago

Hi!

It's been a little while since I looked at this, but from what I can tell these factors are discussed in section 4.4 of the paper. The length scale is covered in more detail in appendix D, and the way it is determined differs between the experiments.

In the UCI case, it seems the length scale was chosen based on validation data, and for MNIST a grid search was carried out - the results of which are shown in Figure 11.
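
To make the relationship between the length scale and the two regularisers concrete, here is a minimal sketch. It assumes the relations I recall from the docstring of the authors' reference Keras implementation: weight regulariser = l^2 / (tau * N) and dropout regulariser = 2 / (tau * N), where l is the prior length scale, tau is the model precision and N is the number of training points, with the factor of 2 dropped for cross-entropy losses. Please check these against section 4.4 of the paper before relying on them. The function name, its arguments and the default values are hypothetical and are not part of this repository.

```python
def concrete_dropout_regularisers(n_train, length_scale=1e-4, tau=1.0,
                                  euclidean_loss=True):
    """Return (weight_regulariser, dropout_regulariser).

    ASSUMPTION: the formulas below follow my reading of the reference
    implementation's docstring, not anything defined in this repository:
        weight_regulariser  = l**2 / (tau * N)
        dropout_regulariser = 2 / (tau * N)   (factor 1 for cross-entropy)
    """
    if n_train <= 0:
        raise ValueError("n_train must be a positive number of training points")
    if tau <= 0:
        raise ValueError("tau (model precision) must be positive")

    # A larger prior length scale means stronger regularisation of the weights.
    weight_regulariser = length_scale ** 2 / (tau * n_train)

    # Assumed convention: factor 2 for Euclidean (regression) loss, 1 otherwise.
    factor = 2.0 if euclidean_loss else 1.0
    dropout_regulariser = factor / (tau * n_train)

    return weight_regulariser, dropout_regulariser


# Hypothetical example: 1000 training points, length scale 1e-2.
wr, dr = concrete_dropout_regularisers(1000, length_scale=1e-2)
print(wr, dr)
```

Under these assumptions both values shrink as the dataset grows, and their ratio depends only on the length scale (l^2 / 2 in the regression case), so in practice the length scale is the one quantity left to tune, for example on validation data or by grid search as described above.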