Open theogf opened 4 years ago
A friend referred me to a similar Gaussian method used for deep learning: A Simple Baseline for Bayesian Uncertainty in Deep Learning, which definitely has a lot of traction in the Bayesian deep learning community. I don't know whether we could try to compare with it.
The TL;DR is:
Here is a result on an XOR dataset with a 3-layer network with (3/2/1) nodes (20 parameters), using 50 particles:
Black borders separate the different layers and their weights/biases.
One additional observation (on a different dataset where the data is noisier and more mixed, but it probably holds in general). This is what the histogram of each weight looks like. If I am not mistaken, if the samples were jointly Gaussian, they should also be Gaussian along each dimension?
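The question above can be checked numerically: every marginal of a multivariate Gaussian is itself Gaussian, so the per-weight skewness and excess kurtosis of the particles should both be close to zero. The following is a minimal sketch in Python with NumPy, not the code used in this thread; `marginal_moments` is a hypothetical helper, and the two sample sets are synthetic stand-ins for the particles.

```python
import numpy as np

def marginal_moments(particles):
    """Per-dimension skewness and excess kurtosis.

    particles: array of shape (n_particles, n_params).
    Both statistics are approximately 0 for Gaussian marginals.
    """
    z = (particles - particles.mean(axis=0)) / particles.std(axis=0)
    skew = (z ** 3).mean(axis=0)
    excess_kurt = (z ** 4).mean(axis=0) - 3.0
    return skew, excess_kurt

rng = np.random.default_rng(0)
dim = 20  # same number of parameters as the small network above

# Synthetic stand-in 1: correlated, jointly Gaussian samples.
A = rng.normal(size=(dim, dim))
cov = A @ A.T + dim * np.eye(dim)
gauss = rng.multivariate_normal(np.zeros(dim), cov, size=5000)

# Synthetic stand-in 2: a bimodal marginal (two modes at -2 and +2).
signs = rng.choice([-1.0, 1.0], size=(5000, 1))
bimodal = 2.0 * signs + 0.5 * rng.normal(size=(5000, 1))

g_skew, g_kurt = marginal_moments(gauss)
b_skew, b_kurt = marginal_moments(bimodal)
print("Gaussian: max |skew| =", np.abs(g_skew).max(),
      " max |excess kurtosis| =", np.abs(g_kurt).max())
print("Bimodal:  excess kurtosis =", b_kurt[0])
```

With only 10 to 50 particles these moment estimates are very noisy, so a histogram that looks non-Gaussian on a single weight is weak evidence on its own.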
Rejoice! Mean-field is here! "Prediction" evaluates the network at the mean of the particles, while "MC Mean/Var" averages the predictions made by each individual particle. Here I used only 10 particles.
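To make the difference between the two prediction schemes concrete, here is a minimal sketch in Python with NumPy. It is not the code from this thread: the `tanh` hidden activations, the sigmoid output, the flat parameter layout, and the randomly drawn particles are all assumptions, and `n_params` and `forward` are hypothetical helpers. Only the layer sizes (2 XOR inputs, then 3/2/1 nodes, 20 parameters) and the particle count come from the comments above.

```python
import numpy as np

LAYER_SIZES = [2, 3, 2, 1]  # 2 XOR inputs, then the (3/2/1) layers

def n_params(sizes):
    """Total number of weights and biases of a dense network."""
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))

def forward(theta, X, sizes=LAYER_SIZES):
    """Evaluate the network for one flat parameter vector theta.

    Assumed layout: for each layer, the weights followed by the biases.
    """
    h, offset = X, 0
    n_layers = len(sizes) - 1
    for k, (i, o) in enumerate(zip(sizes[:-1], sizes[1:])):
        W = theta[offset:offset + i * o].reshape(i, o)
        offset += i * o
        b = theta[offset:offset + o]
        offset += o
        h = h @ W + b
        if k < n_layers - 1:
            h = np.tanh(h)  # assumed hidden activation
    return 1.0 / (1.0 + np.exp(-h[:, 0]))  # assumed sigmoid output

rng = np.random.default_rng(0)
# Placeholder particles; in practice these come from the inference run.
particles = rng.normal(size=(10, n_params(LAYER_SIZES)))
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

# "Prediction": a single forward pass at the mean of the particles.
pred_at_mean = forward(particles.mean(axis=0), X)

# "MC Mean/Var": one forward pass per particle, then average.
per_particle = np.stack([forward(p, X) for p in particles])
mc_mean = per_particle.mean(axis=0)
mc_var = per_particle.var(axis=0)
```

Because the network is nonlinear, `pred_at_mean` and `mc_mean` generally differ, and only the per-particle version yields a predictive variance.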
Work with multi-layer Bayesian neural networks and compare the results with more classical methods (ADVI).