Deriving the decomposition into aleatoric and epistemic uncertainty

btlorch commented 4 years ago

Dear Mr. Kwon,

We enjoyed reading your papers on decomposing predictive variance into aleatoric and epistemic uncertainty in classification settings without the need for an extra output layer. Thank you also for sharing your code online.

After reading your derivation of the predictive variance decomposition in Appendix A of your paper "Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation", we had difficulty understanding the step from the second-to-last line to the last one. I'm pasting the two lines here:

[image: decomposition]

Can you please clarify why the outer product (denoted as $\otimes 2$) can be moved outside of the difference? I think sum and outer products cannot be interchanged as demonstrated by the example below, but I might be missing some trick or assumption here.

[image: outer_product_example]
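In case the image does not load, the same point can be checked with a few lines of plain Python. This is only a minimal sketch: the vectors `a` and `b` are arbitrary illustrative values, not the ones from the screenshot, and the helper names are made up for this example.

```python
def outer(u, v):
    """Outer product u v^T for vectors stored as lists."""
    return [[ui * vj for vj in v] for ui in u]

# Arbitrary illustrative vectors (assumption, not taken from the paper).
a = [1.0, 2.0]
b = [3.0, 5.0]

# Difference of outer products: a a^T - b b^T
aaT, bbT = outer(a, a), outer(b, b)
diff_of_outer = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(aaT, bbT)]

# Outer product of the difference: (a - b)(a - b)^T
d = [x - y for x, y in zip(a, b)]
outer_of_diff = outer(d, d)

print(diff_of_outer)  # [[-8.0, -13.0], [-13.0, -21.0]]
print(outer_of_diff)  # [[4.0, 6.0], [6.0, 9.0]]
```

The two matrices differ, so for fixed vectors the difference and the outer product do not commute.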

Thank you for your help!

ykwon0407 commented 4 years ago

Hello @btlorch. Thank you for your interest in our paper. There are no flaws in your derivation; the last equation follows from integrating over $p(\omega | \mathcal{D})$. I elaborate on this below.

Let $a = \mathbb{E}_{p(y^* | x^*, \omega)}(y^*)$ and $b = \mathbb{E}_{p(y^* | x^*, \mathcal{D})}(y^*)$. Then $[a-b]^{\otimes 2} = aa^{T} + bb^{T} - ab^{T} - ba^{T}$ and $\int a \, p(\omega | \mathcal{D}) \, d\omega = b$ give $\int [a-b]^{\otimes 2} p(\omega | \mathcal{D}) \, d\omega = \int aa^{T} p(\omega | \mathcal{D}) \, d\omega + bb^{T} - bb^{T} - bb^{T} = \int [aa^{T} - bb^{T}] \, p(\omega | \mathcal{D}) \, d\omega$. (Please note that $b$ does not depend on $\omega$, so it is constant under the integral with respect to $p(\omega | \mathcal{D})$.) I hope this helps you.
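The identity can also be checked numerically. The sketch below is not code from the UQ_BNN repository; it assumes a toy discrete posterior over three values of $\omega$ with made-up probabilities and made-up vectors $a_k$, and uses exact fractions so that both sides can be compared with `==`.

```python
from fractions import Fraction as F

def outer(u, v):
    """Outer product u v^T for vectors stored as lists."""
    return [[ui * vj for vj in v] for ui in u]

def expect(mats, weights):
    """Weighted sum of equally sized matrices: sum_k w_k * M_k."""
    n, m = len(mats[0]), len(mats[0][0])
    return [[sum(w * M[i][j] for M, w in zip(mats, weights)) for j in range(m)]
            for i in range(n)]

# Assumed toy posterior p(omega | D): three values, probabilities sum to 1.
weights = [F(1, 2), F(1, 3), F(1, 6)]
# a_k = E[y* | x*, omega_k]: made-up class-probability vectors.
a_samples = [[F(9, 10), F(1, 10)], [F(3, 5), F(2, 5)], [F(1, 5), F(4, 5)]]

# b = E[y* | x*, D] = sum_k w_k a_k, which does not depend on omega.
b = [sum(w * a[i] for a, w in zip(a_samples, weights)) for i in range(2)]

# Left-hand side: E_omega[(a - b)(a - b)^T]
centered = [[ai - bi for ai, bi in zip(a, b)] for a in a_samples]
lhs = expect([outer(c, c) for c in centered], weights)

# Right-hand side: E_omega[a a^T] - b b^T
second_moment = expect([outer(a, a) for a in a_samples], weights)
bbT = outer(b, b)
rhs = [[s - t for s, t in zip(rs, rt)] for rs, rt in zip(second_moment, bbT)]

print(lhs == rhs)  # True
```

For any single $\omega_k$ the matrices $(a_k - b)(a_k - b)^{T}$ and $a_k a_k^{T} - bb^{T}$ still differ; the equality holds only after averaging over the posterior.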

btlorch commented 4 years ago

Thanks a lot for clarifying!

ykwon0407 / UQ_BNN, issue #9