Open axel971 opened 3 years ago
Hi!
It's been a little while since I looked at this, but from what I can tell these factors are discussed in section 4.4 of the paper. The length scale is covered in more detail in appendix D of the paper, and the way it was determined differs between the experiments.
In the UCI case, it seems the length scale was chosen based on validation data, and for MNIST a grid search was carried out, the results of which are shown in Figure 11.
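In case it helps, here is a rough sketch of how I read the formula. I'm assuming it is the one usually quoted for Concrete Dropout: weight regulariser = l^2 / (tau * N) and dropout regulariser = 2 / (tau * N), where l is the prior length scale, tau is the model precision (inverse observation noise) and N is the number of training points. The function name and the example values below are made up for illustration, they are not the values used in the paper.

```python
import math


def concrete_dropout_regularisers(n_train, length_scale, tau=1.0):
    """Return (weight_regulariser, dropout_regulariser).

    Assumed formulas (check them against the paper before relying on this):
        weight_regulariser  = length_scale**2 / (tau * n_train)
        dropout_regulariser = 2 / (tau * n_train)
    """
    if n_train <= 0:
        raise ValueError("n_train must be positive")
    if tau <= 0:
        raise ValueError("tau must be positive")
    weight_regulariser = length_scale ** 2 / (tau * n_train)
    dropout_regulariser = 2.0 / (tau * n_train)
    return weight_regulariser, dropout_regulariser


# Hypothetical example: 60000 training points, length scale 1e-2, tau = 1.
wr, dr = concrete_dropout_regularisers(n_train=60000, length_scale=1e-2)
print(f"weight regulariser:  {wr:.3e}")
print(f"dropout regulariser: {dr:.3e}")
```

So the only free choices are the length scale (and tau, if the model does not learn its own noise), which is why it ends up being picked on validation data or by grid search.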
Hi! I read Yarin Gal's paper and I did not understand how the weight regulariser and dropout regulariser are initialized. The author provides a formula, but it is not very clear (e.g. what does "prior length scale" mean, and which value should be assigned to it?). Could you explain how you found the values used to initialize the weight regulariser and the dropout regulariser?