pythonometrist opened this issue 3 years ago
Hello @pythonometrist, and thank you so much for using BLiTZ.
So, the .eval() method relates mostly to dropout (and batch norm, I think) on PyTorch's built-in layers: it stops their training-time randomness and keeps evaluation data from leaking into the network's statistics.
BLiTZ's freeze and unfreeze methods relate to enabling and disabling the reparameterization on the Bayesian layers. When we freeze a layer, it uses only the mean of the trainable weight distribution as the layer's weights; when we unfreeze it, random sampling is enabled again.
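To make that concrete, here is a minimal, purely illustrative sketch (not BLiTZ's actual implementation) of what a frozen versus unfrozen forward pass means for a single Bayesian linear layer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBayesianLinear(nn.Module):
    """Toy illustration of freeze/unfreeze -- not BLiTZ's real code."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Trainable posterior parameters: mean (mu) and rho, where softplus(rho) = std.
        self.weight_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.weight_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.bias_mu = nn.Parameter(torch.zeros(out_features))
        self.bias_rho = nn.Parameter(torch.full((out_features,), -3.0))
        self.frozen = False

    def forward(self, x):
        if self.frozen:
            # Frozen: deterministic forward pass that uses only the posterior means.
            return F.linear(x, self.weight_mu, self.bias_mu)
        # Unfrozen: reparameterization trick, weights are sampled on every forward pass.
        weight_sigma = F.softplus(self.weight_rho)
        bias_sigma = F.softplus(self.bias_rho)
        weight = self.weight_mu + weight_sigma * torch.randn_like(weight_sigma)
        bias = self.bias_mu + bias_sigma * torch.randn_like(bias_sigma)
        return F.linear(x, weight, bias)
```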
Controlling this behavior is especially important when we want to give our surrogate distribution a sensible starting point (its Bayesian priors), by training the frozen network on the data before enabling randomness and uncertainty.
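A rough sketch of that two-stage workflow is below. The `freeze_()` / `unfreeze_()` method names are assumptions based on this discussion, and `sample_elbo` follows BLiTZ's README example; adjust to whatever your BLiTZ version actually exposes:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from blitz.modules import BayesianLinear
from blitz.utils import variational_estimator

@variational_estimator
class Regressor(nn.Module):
    def __init__(self, in_dim):
        super().__init__()
        self.blinear = BayesianLinear(in_dim, 1)

    def forward(self, x):
        return self.blinear(x)

# Tiny synthetic dataset just to make the sketch runnable.
X = torch.randn(128, 10)
y = X.sum(dim=1, keepdim=True)
train_loader = DataLoader(TensorDataset(X, y), batch_size=32)

model = Regressor(in_dim=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.MSELoss()

# Stage 1: train frozen -- deterministic forward passes using only the means,
# so the surrogate distribution starts from a sensible point.
model.freeze_()          # assumed helper name
for x_batch, y_batch in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(x_batch), y_batch)
    loss.backward()
    optimizer.step()

# Stage 2: unfreeze and do the usual Bayesian (ELBO) training with sampled weights.
model.unfreeze_()        # assumed counterpart of freeze_()
for x_batch, y_batch in train_loader:
    optimizer.zero_grad()
    loss = model.sample_elbo(inputs=x_batch, labels=y_batch,
                             criterion=criterion, sample_nbr=3)
    loss.backward()
    optimizer.step()
```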
Hope I've helped.
Thanks for an excellent and well thought out framework.
It looks like, from a model evaluation perspective, freeze/unfreeze have no role to play. To predict on a held-out set, it is enough to use torch.no_grad() and model.eval().
After all, none of the trainable parameters, including those of the posterior distribution (rho and mu), are affected by freeze/unfreeze (unless we call model.eval()). Is that correct?
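In other words, the evaluation pattern I have in mind for the held-out set is just something like this (continuing the toy example above; the test tensor is hypothetical):

```python
X_test = torch.randn(16, 10)   # hypothetical held-out inputs

model.eval()                   # standard PyTorch eval mode
with torch.no_grad():          # no gradients needed for prediction
    preds = model(X_test)
```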