yookoon / density_uncertainty_layers


Computation of Predictive Uncertainty in Regression Setting #1

Open nilsleh opened 2 weeks ago

nilsleh commented 2 weeks ago

@yookoon Thank you for the interesting work and the code repository. We are trying to support your proposed method in Lightning-UQ-Box, our uncertainty quantification library for deep learning, and have a couple of questions regarding the regression framework.

Taking the regression case on the UCI datasets as an example, the MLP module has a logvar parameter that is a single value for the entire model. During the training and test phases, the loss is computed with this homoscedastic logvar parameter here. Additionally, for sampling methods like BNNs or your proposed Density framework, N samples are taken in this loop; however, you only average the model's mean predictions, and the sampling of the weights has no effect on the predictive uncertainty you are computing, since that uncertainty is just a single learned parameter for the entire model. This seems counterintuitive, as the point of BNNs is to model epistemic uncertainty, which should influence the overall predictive uncertainty I get from the model.
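For concreteness, here is a minimal sketch of the evaluation pattern I am describing (hypothetical names `model`, `x_test`, and `logvar`; this is not the repository's exact code):

```python
import torch

# Sketch of the evaluation loop described above: the model is run N times
# with sampled weights, but only the mean predictions are averaged, and the
# reported predictive variance comes solely from the single learned
# homoscedastic logvar parameter.
N = 20
with torch.no_grad():
    means = torch.stack([model(x_test) for _ in range(N)])  # (N, batch, 1)
    pred_mean = means.mean(dim=0)
    # Constant for every input: the weight samples have no effect here.
    pred_var = model.logvar.exp().expand_as(pred_mean)
```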

In Figure 1 of your paper, you show results on a toy regression dataset with input-dependent uncertainty; however, as far as I can tell, the repository does not contain the code to generate this figure. In your paper you state, based on equations 7-9, "Consequently, predictive uncertainty will be high for test inputs that are improbable in the training density and low for those that are more probable, providing intuitive and reliable predictive uncertainty." However, I fail to see how that can be the case when you are using a single logvar parameter as your predictive uncertainty. I was therefore wondering whether you could help me understand the notion of predictive uncertainty used in the regression case. Thanks in advance!

yookoon commented 1 week ago

Hi Nils,

Thank you for your interest in the paper. The density uncertainty layer is similar to approximate BNN methods like Variational Dropout and Rank-1 BNNs. These methods do not sample the weights of the neural network but instead inject noise into the activations of the layers. One can show that this corresponds to modeling uncertainty over the weights and then marginalizing them out when computing the layer activations; for example, the predictive distribution of Bayesian linear regression is obtained by marginalizing out the weight posterior.
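To make the activation-space sampling concrete, here is a minimal sketch of the local reparameterization trick for a linear layer with a factorized Gaussian weight posterior q(W) = N(mu, sigma^2) (illustrative only; the density uncertainty layer defines its noise differently):

```python
import torch
import torch.nn.functional as F

# For h = x @ W^T with independent W_ij ~ N(mu_ij, sigma_ij^2), the
# pre-activations are Gaussian: h ~ N(x @ mu^T, x^2 @ (sigma^2)^T).
# We can therefore sample h directly without ever sampling W.
def sample_preactivation(x, w_mu, w_logvar):
    act_mu = F.linear(x, w_mu)                    # mean of h
    act_var = F.linear(x.pow(2), w_logvar.exp())  # variance of h
    eps = torch.randn_like(act_mu)
    return act_mu + act_var.sqrt() * eps          # one sample of h
```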

Similarly, density uncertainty layers inject noise into the layer activations, so when you run the model multiple times, the predicted means will differ across runs. We can then use the variance of these predictions as a measure of epistemic uncertainty. As you mentioned, we didn't incorporate uncertainty over the output variance parameter, but we expect the effect to be minimal since it is just a single parameter.
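In code, the predictive uncertainty estimate would look something like this (a sketch with hypothetical names `model`, `x_test`, and `logvar`, not the repository's exact implementation):

```python
import torch

# Estimate predictive uncertainty from T stochastic forward passes.
# The variance of the predicted means captures epistemic uncertainty;
# the learned output variance exp(logvar) is the homoscedastic
# aleatoric part.
T = 20
with torch.no_grad():
    means = torch.stack([model(x_test) for _ in range(T)])
    pred_mean = means.mean(dim=0)       # predictive mean
    epistemic_var = means.var(dim=0)    # epistemic uncertainty
    total_var = epistemic_var + model.logvar.exp()  # total predictive variance
```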

Hope this answers your question, and let me know if you have anything else.