Closed caxelrud closed 4 years ago

(New Feature) Do you think Bayesian online learning with BNNs will perform well for this code implementation? I mean, periodically retraining the BNN on small amounts of new data and using the posterior as the prior.
If so, do you have any recommendations or ideas on the best way to implement it (e.g., reusing the full network from the previous iteration)?
Hi,
The idea is sound. As you say, you can successfully update your posterior by using your previous posterior as a prior. I think the method you are referring to is the one described here: https://arxiv.org/abs/1710.10628. However, when doing MFVI with Gaussian approximate distributions, you lose a lot of information about the posterior in your approximations. A poor approximate posterior will likely not preserve much information when used as a prior, since there may be a strong mismatch between the likelihood function of the new data and the prior. As discussed here (https://arxiv.org/pdf/1902.06494.pdf), this can lead to poor results.
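The "sound in principle" part is easiest to see in a toy conjugate case, where updating batch by batch with the previous posterior as the prior is exactly equivalent to training on all the data at once; MFVI loses this exactness because the factorized Gaussian is only an approximation. A minimal, self-contained illustration (a toy example, not from the codebase):

```python
import torch

# Toy posterior-as-prior recursion for a single Gaussian "weight" (the mean
# of the data) with known observation noise. Because the model is conjugate,
# the batch-by-batch updates below give exactly the same posterior as a
# single update on all the data. With a BNN and MFVI, each round is only
# approximate, which is where the mismatch described above comes in.
mu, var = 0.0, 1.0          # prior N(mu, var)
noise_var = 0.25            # known observation noise variance

for batch in torch.randn(5, 20):   # five small batches of 20 points each
    n = batch.numel()
    # Conjugate Gaussian update: precisions add, means combine precision-weighted.
    post_var = 1.0 / (1.0 / var + n / noise_var)
    post_mu = post_var * (mu / var + batch.sum().item() / noise_var)
    mu, var = post_mu, post_var    # the posterior becomes the next prior

print(mu, var)
```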
With regard to implementation, you would have to define a per-weight prior, as right now I think we use a single prior for the whole network (see the sketch below). Other than that, I believe it is pretty straightforward given the existing codebase. Let me know if you run into any trouble when implementing it and I'll be happy to take a look!
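For reference, a rough sketch of what a per-weight Gaussian prior could look like, assuming the posterior is a factorized Gaussian parameterized by mean and log-variance tensors. All names here (`kl_diag_gaussians`, `PerWeightGaussianPrior`) are illustrative and not part of the existing codebase:

```python
import torch

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    # KL(q || p) between factorized Gaussians, summed over all weights.
    return 0.5 * torch.sum(
        logvar_p - logvar_q
        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
        - 1.0
    )

class PerWeightGaussianPrior:
    """One (mu, logvar) pair per network weight, instead of a single shared prior."""
    def __init__(self, mu, logvar):
        self.mu, self.logvar = mu, logvar

    @classmethod
    def from_posterior(cls, mu_q, logvar_q):
        # Freeze the fitted posterior so it can serve as the next round's prior.
        return cls(mu_q.detach().clone(), logvar_q.detach().clone())
```

The KL term in the ELBO would then use `prior.mu` and `prior.logvar` in place of the shared prior's scalar mean and variance, and after each retraining round you would rebuild the prior with `from_posterior` from the fitted variational parameters.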