Closed: ShellingFord221 closed this issue 4 years ago.
Hello again! In the BBB method, you sample the weights `no_samples` times and average the loss during training (`bbp_homo.ipynb`, `def fit(self, x, y, no_samples)`), but in the MC Dropout method you don't sample and just take a single loss as the training loss (`mc_dropout_heteroscedastic.ipynb`, `def fit(self, x, y)`). At test time, however, both methods draw samples. As far as I can tell, the only difference between BBB and MC Dropout is that the approximate posterior is assumed to be Gaussian in BBB and Bernoulli in MC Dropout, so why don't you sample in the MC Dropout method during training? Thanks!
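To make the contrast concrete, here is a minimal sketch of the two training losses as I understand them (illustrative only, not the notebooks' actual code; `loss_fn` is a placeholder for the respective likelihood term):

```python
import torch

def bbb_training_loss(model, x, y, loss_fn, no_samples):
    # BBB: each forward pass draws fresh weights from the variational
    # posterior; the MC loss is the average over the sampled losses.
    losses = [loss_fn(model(x), y) for _ in range(no_samples)]
    return torch.stack(losses).mean()

def mc_dropout_training_loss(model, x, y, loss_fn):
    # MC Dropout as currently implemented: one stochastic forward pass
    # with dropout active, i.e. implicitly no_samples = 1.
    return loss_fn(model(x), y)
```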
Yes, and thank you for bringing this up.
In MC Dropout we apply dropout on every `.forward()` call, and we compute an MC estimate of the loss using a single MC sample (i.e. one forward pass). You are right that, to be equivalent to BBB, MC Dropout should support a variable number of samples, with the MC loss being the average of the sampled losses. Instead, we implicitly (and probably not very transparently) assume `no_samples=1` for MC Dropout.
For reference, we did not observe a significant change in the quality of our solutions when increasing the number of samples up to ~10. We will modify MC Dropout to be more general as soon as we get the time.
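For concreteness, a rough sketch of what a more general MC Dropout `fit` could look like (a sketch only, not the actual fix; the model, the optimizer handling, and the MSE loss standing in for the heteroscedastic likelihood are all placeholders):

```python
import torch

class MCDropoutRegressor(torch.nn.Module):
    # Placeholder network; dropout makes every forward pass stochastic.
    def __init__(self, in_dim=1, hidden=50, p=0.5):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(in_dim, hidden),
            torch.nn.ReLU(),
            torch.nn.Dropout(p),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def fit(model, optimizer, x, y, no_samples=1):
    # Average the loss over no_samples stochastic forward passes;
    # no_samples=1 recovers the current MC Dropout behaviour.
    model.train()  # keeps dropout active during the forward passes
    optimizer.zero_grad()
    loss = torch.stack(
        [torch.nn.functional.mse_loss(model(x), y) for _ in range(no_samples)]
    ).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```

e.g. `fit(model, torch.optim.Adam(model.parameters(), lr=1e-3), x, y, no_samples=10)`.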