Closed: ShellingFord221 closed this issue 4 years ago.
Hello again! In the BBB method, you sample the weights `no_samples` times and average the loss during training (`bbp_homo.ipynb`, `def fit(self, x, y, no_samples)`), but in the MC Dropout method you don't sample and just take a single loss as the training loss (`mc_dropout_heteroscedastic.ipynb`, `def fit(self, x, y)`). At test time, however, both methods draw samples. As far as I can tell, the only difference between BBB and MC Dropout is that the approximate posterior is assumed to be Gaussian in BBB and Bernoulli in MC Dropout, so why don't you sample in the MC Dropout method during training? Thanks!
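To make the contrast concrete, here is a minimal sketch of the two training losses as I understand them (illustrative only, not the notebooks' actual code; `loss_fn` is a placeholder for the respective likelihood term):

```python
import torch

def bbb_training_loss(model, x, y, loss_fn, no_samples):
    # BBB: each forward pass draws fresh weights from the variational
    # posterior; the MC loss is the average over the sampled losses.
    losses = [loss_fn(model(x), y) for _ in range(no_samples)]
    return torch.stack(losses).mean()

def mc_dropout_training_loss(model, x, y, loss_fn):
    # MC Dropout as currently implemented: one stochastic forward pass
    # with dropout active, i.e. implicitly no_samples = 1.
    return loss_fn(model(x), y)
```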
Yes, and thank you for bringing this up.
In MC Dropout we apply dropout on every `.forward()` call, and we compute an MC estimate of the loss using a single MC sample (i.e. one forward pass). You are right that, to be equivalent to BBB, MC Dropout should support a variable number of samples, with the MC loss being the average of the sampled losses. Instead, we implicitly (and probably not very transparently) assume `no_samples=1` for MC Dropout.
For reference, we did not observe a significant change in the quality of our solutions when increasing the number of samples up to ~10. We will modify MC Dropout to be more general as soon as we get the time.
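For concreteness, a rough sketch of what a more general MC Dropout `fit` could look like (a sketch only, not the actual fix; the model, the optimizer handling, and the MSE loss standing in for the heteroscedastic likelihood are all placeholders):

```python
import torch

class MCDropoutRegressor(torch.nn.Module):
    # Placeholder network; dropout makes every forward pass stochastic.
    def __init__(self, in_dim=1, hidden=50, p=0.5):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(in_dim, hidden),
            torch.nn.ReLU(),
            torch.nn.Dropout(p),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def fit(model, optimizer, x, y, no_samples=1):
    # Average the loss over no_samples stochastic forward passes;
    # no_samples=1 recovers the current MC Dropout behaviour.
    model.train()  # keeps dropout active during the forward passes
    optimizer.zero_grad()
    loss = torch.stack(
        [torch.nn.functional.mse_loss(model(x), y) for _ in range(no_samples)]
    ).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```

e.g. `fit(model, torch.optim.Adam(model.parameters(), lr=1e-3), x, y, no_samples=10)`.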