Closed caxelrud closed 4 years ago

(New Feature) Do you think Bayesian online learning with BNNs will perform well for this code implementation? I mean, periodically retraining the BNN on small amounts of new data and using the posterior as the prior.
If so, do you have any recommendations or ideas on the best way to implement it (e.g., reusing the full network from the previous iteration)?
Hi,
The idea is sound. As you say, you can successfully update your posterior by using your previous posterior as a prior. I think the method you are referring to is the one described here: https://arxiv.org/abs/1710.10628. However, when doing MFVI with Gaussian approximate distributions, you lose a lot of information about the posterior in your approximations. A poor approximate posterior will likely not preserve much information when used as a prior, since there may be a strong mismatch between the likelihood function of the new data and the prior. As discussed here (https://arxiv.org/pdf/1902.06494.pdf), this can lead to poor results.
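The "sound in principle" part is easiest to see in a toy conjugate case, where updating batch by batch with the previous posterior as the prior is exactly equivalent to training on all the data at once; MFVI loses this exactness because the factorized Gaussian is only an approximation. A minimal, self-contained illustration (a toy example, not from the codebase):

```python
import torch

# Toy posterior-as-prior recursion for a single Gaussian "weight" (the mean
# of the data) with known observation noise. Because the model is conjugate,
# the batch-by-batch updates below give exactly the same posterior as a
# single update on all the data. With a BNN and MFVI, each round is only
# approximate, which is where the mismatch described above comes in.
mu, var = 0.0, 1.0          # prior N(mu, var)
noise_var = 0.25            # known observation noise variance

for batch in torch.randn(5, 20):   # five small batches of 20 points each
    n = batch.numel()
    # Conjugate Gaussian update: precisions add, means combine precision-weighted.
    post_var = 1.0 / (1.0 / var + n / noise_var)
    post_mu = post_var * (mu / var + batch.sum().item() / noise_var)
    mu, var = post_mu, post_var    # the posterior becomes the next prior

print(mu, var)
```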
With regard to implementation, you would have to define a per-weight prior, as right now I think we use a single prior for the whole network (see the sketch below). Other than that, I believe it is pretty straightforward given the existing codebase. Let me know if you run into any trouble when implementing it and I'll be happy to take a look!
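For reference, a rough sketch of what a per-weight Gaussian prior could look like, assuming the posterior is a factorized Gaussian parameterized by mean and log-variance tensors. All names here (`kl_diag_gaussians`, `PerWeightGaussianPrior`) are illustrative and not part of the existing codebase:

```python
import torch

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    # KL(q || p) between factorized Gaussians, summed over all weights.
    return 0.5 * torch.sum(
        logvar_p - logvar_q
        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
        - 1.0
    )

class PerWeightGaussianPrior:
    """One (mu, logvar) pair per network weight, instead of a single shared prior."""
    def __init__(self, mu, logvar):
        self.mu, self.logvar = mu, logvar

    @classmethod
    def from_posterior(cls, mu_q, logvar_q):
        # Freeze the fitted posterior so it can serve as the next round's prior.
        return cls(mu_q.detach().clone(), logvar_q.detach().clone())
```

The KL term in the ELBO would then use `prior.mu` and `prior.logvar` in place of the shared prior's scalar mean and variance, and after each retraining round you would rebuild the prior with `from_posterior` from the fitted variational parameters.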