Closed ShellingFord221 closed 4 years ago
Both standard deviation / variance and entropy are ways of measuring uncertainty. Monte Carlo estimation of the predictive distribution of a BNN that parametrizes a Gaussian induces a Gaussian mixture (GMM) in output space, and the entropy of a GMM has no closed form. Thus, standard deviation is used instead.
If your BNN parametrizes a Gaussian predictive distribution, you can indeed give ~95% confidence intervals as mean ± 2·std. The aleatoric/epistemic uncertainty decomposition is just a tool for understanding where the uncertainty comes from: noise in the data or uncertainty in the model.
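The std-based decomposition for a Gaussian-head regression BNN can be sketched as follows (a minimal numpy sketch; `mu_samples` and `sigma_samples` are synthetic stand-ins for the outputs of T stochastic forward passes, not the repo's actual variables):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for T stochastic forward passes on one input:
# each pass yields a predicted mean mu_t and a noise std sigma_t.
T = 1000
mu_samples = rng.normal(loc=1.0, scale=0.3, size=T)  # spread of means = epistemic
sigma_samples = np.full(T, 0.5)                      # homoscedastic noise std

aleatoric = np.sqrt(np.mean(sigma_samples ** 2))     # sqrt of E[sigma_t^2]
epistemic = np.std(mu_samples)                       # std of the predicted means
total = np.sqrt(aleatoric ** 2 + epistemic ** 2)     # std of the mixture

# ~95% predictive interval from the total std
mean_pred = mu_samples.mean()
lower, upper = mean_pred - 2 * total, mean_pred + 2 * total
```

The quadrature sum is just the law of total variance: Var[y] = E[sigma²] (aleatoric) + Var[mu] (epistemic), so summing the squared stds recovers the variance of the full mixture.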
Hi, in bbp_homoscedastic.ipynb, you calculate aleatoric uncertainty as the sigma of the Gaussian, epistemic uncertainty as the standard deviation of the model's outputs, and the total uncertainty as (aleatoric² + epistemic²)**0.5. But according to the decomposition of predictive uncertainty, aleatoric uncertainty is the expected entropy of the model's predictions, and epistemic uncertainty is the difference between the total entropy and the aleatoric entropy. I wonder whether these two ways are actually the same? In a classification task, I can easily calculate aleatoric and epistemic uncertainty with entropy, but I don't know how to calculate them your way.
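For reference, the entropy-based decomposition I mean for classification looks like this (a minimal numpy sketch; the MC softmax outputs in `probs` are made up for illustration):

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along the given axis."""
    return -np.sum(p * np.log(p + eps), axis=axis)

# Made-up MC softmax outputs: T=3 forward passes, C=2 classes, one input.
probs = np.array([[0.9, 0.1],
                  [0.6, 0.4],
                  [0.8, 0.2]])

mean_p = probs.mean(axis=0)            # posterior predictive distribution
total_u = entropy(mean_p)              # predictive entropy H[E_w p(y|x,w)]
aleatoric_u = entropy(probs).mean()    # expected entropy E_w H[p(y|x,w)]
epistemic_u = total_u - aleatoric_u    # mutual information I(y; w | x)
```

Here the decomposition is additive (total = aleatoric + epistemic) rather than a quadrature sum, and epistemic_u is non-negative because entropy is concave.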
Besides, I also have a question about the meaning of these uncertainties. We often say a BNN is more robust because it can give the uncertainty (or confidence) of its output, but aleatoric and epistemic uncertainty are clearly different things from a probabilistic perspective. For example, if measurements are drawn from a Gaussian, we can say that the interval from -2 sigma to 2 sigma is a 95% confidence interval. But a BNN can't give a confidence statement like that directly (for example, in a classification task). So rather than only using aleatoric and epistemic uncertainties in the loss function to make the model perform better, can we use them to attach a confidence to the output, to make the predictions more acceptable in areas like medical diagnosis? (For example, after a BNN's prediction, I could be 95% confident that the patient is healthy (label 0) and 5% confident that he has lung cancer (label 1), rather than getting only one result from a point-estimate neural network.) Thanks!
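For the medical example, I imagine the posterior predictive mean over MC samples would give exactly that kind of class-probability statement (a sketch with made-up softmax samples, not output from the repo):

```python
import numpy as np

# Made-up MC softmax samples for one patient over [healthy, lung cancer]:
probs = np.array([[0.97, 0.03],
                  [0.94, 0.06],
                  [0.95, 0.05]])

pred = probs.mean(axis=0)          # posterior predictive p(y | x)
label = int(pred.argmax())
confidence = pred[label]
mc_spread = probs[:, label].std()  # how much the MC samples disagree

print(f"label={label}, p={confidence:.3f} (+/- {mc_spread:.3f} across samples)")
```

Reporting the spread alongside the probability distinguishes "confidently 95% healthy" from "95% on average but with high model disagreement".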