google / neural-tangents

Fast and Easy Infinite Neural Networks in Python
https://iclr.cc/virtual_2020/poster_SklD9yrFPS.html
Apache License 2.0

How to obtain aleatoric uncertainty? #188

Open bangxiangyong opened 1 year ago

bangxiangyong commented 1 year ago

I am aware that the default inference implemented is based on the mean squared error (MSE) loss. Is there an implemented example, or a way, to obtain aleatoric uncertainty instead (either homoscedastic or heteroscedastic), i.e. learning to output the variance of an isotropic Gaussian distribution?

romanngg commented 1 year ago

NNGP and NTK give you the full posterior distribution over the test set (mean and full, non-isotropic covariance); check out these functions:

https://neural-tangents.readthedocs.io/en/latest/_autosummary/neural_tangents.predict.gp_inference.html#neural_tangents.predict.gp_inference

https://neural-tangents.readthedocs.io/en/latest/_autosummary/neural_tangents.predict.gradient_descent_mse_ensemble.html#neural_tangents.predict.gradient_descent_mse_ensemble

These correspond to Equations 13–16 in https://arxiv.org/pdf/1902.06720.pdf
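
For reference, here is a minimal sketch of calling the second function on toy data (the architecture, widths, and `diag_reg` value are arbitrary placeholders):

```python
import jax.numpy as jnp
from jax import random
import neural_tangents as nt
from neural_tangents import stax

# Toy data (placeholders).
x_train = random.normal(random.PRNGKey(0), (20, 2))
y_train = jnp.sin(x_train[:, :1])
x_test = random.normal(random.PRNGKey(1), (5, 2))

# Any stax architecture works; this one is arbitrary.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(), stax.Dense(1))

predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)

# Posterior mean and full test-test covariance at t = infinity.
mean, cov = predict_fn(x_test=x_test, get='nngp', compute_cov=True)
std = jnp.sqrt(jnp.diag(cov))  # per-point predictive standard deviation
```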

We also use these to plot uncertainties on the outputs in the cookbook https://colab.sandbox.google.com/github/google/neural-tangents/blob/main/notebooks/neural_tangents_cookbook.ipynb

With some math, you can also derive uncertainties on the MSE loss from the uncertainties on the outputs, as we do in Figure 1 of https://arxiv.org/pdf/1912.02803.pdf
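
To spell out one step of that math: if the posterior at a test point is Gaussian with mean `mu` and variance `sigma^2`, then `E[(f(x) - y)^2] = (mu - y)^2 + sigma^2`, so the output uncertainty gives the expected MSE directly. Continuing the sketch above (`y_test` is an assumed array of test targets):

```python
# Expected MSE under the posterior from the sketch above.
expected_mse = jnp.mean((mean - y_test) ** 2 + jnp.diag(cov)[:, None])
```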

Lmk if this helps!

bangxiangyong commented 1 year ago

Thanks for the reply!

I may be limited by my knowledge, but wouldn't an ensembled MSE loss only capture epistemic uncertainty, i.e. uncertainty about the possible models, rather than uncertainty within the data?

I have some knowledge of Bayesian neural networks (BNNs) and am trying to draw the parallel to NNGP/NTK inference when it comes to estimating aleatoric uncertainty; I was expecting something along the lines of training under the negative log-likelihood (NLL) of a Gaussian instead of MSE. To estimate aleatoric uncertainty, the BNN architecture has dual outputs in the heteroscedastic setup (one for the mean and one for the variance of the Gaussian), whereas in the homoscedastic setup a free parameter is used to estimate the Gaussian variance. I guess one approach would be modifying the loss function of the NNGP to a Gaussian NLL (instead of MSE); however, I fail to find an example that does so (see the sketch after the references below). For references on estimating aleatoric uncertainty, I am referring to setups such as the ones below:

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252108 (Equation 12)

https://proceedings.neurips.cc/paper_files/paper/2017/file/2650d6089a6d640c5e85b2b88265dc2b-Paper.pdf (Equation 5)
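
To make the setup concrete, here is a minimal sketch of the dual-output heteroscedastic loss I mean (Equation 5 of the second reference); `apply_fn` and `params` are hypothetical stand-ins for any network with two output heads:

```python
import jax.numpy as jnp

def gaussian_nll(params, apply_fn, x, y):
    # apply_fn returns (batch, 2): one head for the mean, one for log-variance.
    out = apply_fn(params, x)
    mu, log_var = out[:, :1], out[:, 1:]
    # Gaussian negative log-likelihood (up to a constant); the log-variance
    # parameterization keeps the predicted variance positive.
    return jnp.mean(0.5 * (log_var + (y - mu) ** 2 / jnp.exp(log_var)))
```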

I hope I am making some sense!