pmorerio / dl-uncertainty

"What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?", NIPS 2017 (unofficial code).
MIT License

about the reconstruction error of combined model #3

Open ShellingFord221 opened 4 years ago

ShellingFord221 commented 4 years ago

Hi, in your combined model, the reconstruction error at test time is computed as tf.square(self.rec_images2 - self.images) (line 93). I wonder why you don't use the output of the epistemic-uncertainty modeling (i.e. self.rec_images) to compute the reconstruction error at test time. Thanks!

pmorerio commented 4 years ago

self.rec_images is actually a concatenation of all 20 sampled predictions (line 74), so you should use their average (modify line 75 by replacing the _ with a suitable variable).
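A standalone sketch of what that change could look like (placeholder names and shapes, TF2-style eager code; not the repo's actual variables):

```python
import tensorflow as tf

# Standalone sketch: stack the T stochastic forward passes, average them, and
# compute the reconstruction error against that average instead of against the
# single no-dropout output. `images` and `samples` are random stand-ins for
# self.images and the T dropout samples concatenated in self.rec_images.
T = 20
images = tf.random.normal([8, 28, 28, 1])
samples = tf.random.normal([T, 8, 28, 28, 1])

mean, var = tf.nn.moments(samples, axes=[0])   # MC mean and (epistemic) variance over the T samples
rec_error = tf.square(mean - images)           # reconstruction error w.r.t. the averaged prediction
```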

pmorerio commented 4 years ago

I wonder why you don't use the output of the epistemic-uncertainty modeling

There is no fundamental reason. I just wanted to compare the variance with the reconstruction error, and I thought that subtracting the ground-truth images from the single no-dropout output was more reasonable.

ShellingFord221 commented 4 years ago

As for the output of the model, in epistemic-uncertainty/model.py it seems that, at test time, you compute the variance of self.test_trials outputs as the epistemic uncertainty, and then get the output again as rec_images by disabling dropout. I think the average of the outputs with dropout should be rec_images when computing the test error. (Besides, I also think that during training the average of the outputs should enter the loss function, i.e. sampling should take place in both training and testing when dropout is viewed as a variational Bayesian approximation rather than as regularization.)

pmorerio commented 4 years ago

I think the average of the outputs with dropout should be rec_images when computing the test error.

In principle, disabling dropout at test time should act as averaging models, since dropout can be interpreted as a model-averaging technique: you train different sampled networks with dropout and take the full network at test time, which is similar to taking the expectation under the Bernoulli distribution (in fact, in most implementations the activations are divided by the keep probability during training).

I also think that during training the average of the outputs should enter the loss function, i.e. sampling should take place in both training and testing

I am not so sure about this. Averaging the outputs inside the loss function would act as if you disabled dropout during training.
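As a toy check of the model-averaging view above (a NumPy sketch, not from the repo): with inverted dropout a unit's activation a becomes a * m / keep_prob with m ~ Bernoulli(keep_prob), so the average over many sampled masks recovers the plain activation that the full network uses at test time.

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.5
a = rng.normal(size=1000)                                # some pre-dropout activations
masks = rng.binomial(1, keep_prob, size=(20000, 1000))   # 20000 sampled dropout masks

sampled = a * masks / keep_prob      # inverted-dropout activations, one row per mask
mc_average = sampled.mean(axis=0)    # Monte Carlo average over the sampled masks

# The MC average matches the full-network (no-dropout) activations up to sampling noise.
print(np.abs(mc_average - a).max())
```

Through stacked nonlinearities the equivalence only holds approximately, which is why MC-dropout averaging and the rescaled full network can still give somewhat different results.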

ShellingFord221 commented 4 years ago

Sorry, I'm confused. In What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, it says that

Dropout variational inference ... is done by training a model with dropout before every weight layer, and by also performing dropout at test time to sample from the approximate posterior (stochastic forward passes, referred to as Monte Carlo dropout)

And according to Monte Carlo theory, \int p(y | x, w) p(w | data) dw \approx 1/N \sum_{n = 1}^N p(y | x, w_n), where the w_n are samples from p(w | data). So why do you take the full network at test time? I'm really confused...
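In regression terms (a restatement using the paper's notation: f^w for the network and q(w) for the dropout approximate posterior), the same argument gives

E[y | x] \approx 1/T \sum_{t = 1}^T f^{\hat{w}_t}(x), with \hat{w}_t sampled from q(w),

i.e. the predictive mean is the average of the outputs of T networks sampled with dropout.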

pmorerio commented 4 years ago

Hi, I am doing dropout at test time and collecting 20 samples. With those samples I compute variance at line 75. I agree with you that I should also check the mean of the posterior here, something like

self.mean, self.var = tf.nn.moments(self.rec_images, axes=[0])

However, this is the summary for self.mean, and it is very bad. [image: summary plot for self.mean]

Instead, taking the full network at test time is equivalent to taking the expectation, under the Bernoulli distribution, over all the possible nets sampled with dropout, and the result makes more sense. [image: summary plot for the full-network output]

ShellingFord221 commented 4 years ago

emmm... let's look at 'should we use dropout and averaging at test time or not' from another angle. In the combined model (also at test time), you disable dropout and take tf.exp(self.log_var2) as the aleatoric uncertainty. But in the What Uncertainties paper, Eq. 9 gives the combined model's uncertainty, and its third term, 1/T \sum_{t = 1}^T \hat{\sigma}_t^2, is the aleatoric uncertainty. The variance is averaged over T samples, which I think shows that we should use dropout at test time to compute (at least) the variance of the data. If I'm wrong, please correct me! Thanks!
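A minimal sketch of that decomposition (Eq. 9) over T stochastic forward passes, with random stand-ins for the sampled predictions and log-variances (placeholder shapes, not the repo's code):

```python
import tensorflow as tf

# Stand-ins for T MC-dropout forward passes of the combined model:
# preds[t] plays the role of the predicted mean y_hat_t,
# log_vars[t] the role of the predicted log-variance.
T = 20
preds = tf.random.normal([T, 8, 28, 28, 1])
log_vars = tf.random.normal([T, 8, 28, 28, 1])

mean_pred, epistemic = tf.nn.moments(preds, axes=[0])   # first two terms of Eq. 9 (variance of the sampled means)
aleatoric = tf.reduce_mean(tf.exp(log_vars), axis=0)    # third term: predicted variances averaged over T samples
total_uncertainty = epistemic + aleatoric
```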

pmorerio commented 4 years ago

Yes, this is quite convincing. If you want to modify the code accordingly and open a pull request, I'll be glad to merge it. At the moment I am too busy to do it myself.

ShellingFord221 commented 4 years ago

Yeah, sure!