IntelLabs / bayesian-torch

A library for Bayesian neural network layers and uncertainty estimation in Deep Learning extending the core of PyTorch
BSD 3-Clause "New" or "Revised" License

Calculation of ELBO in Bayesian layers seems to mismatch the equation in the original paper #26

Closed dzy666fly closed 9 months ago

dzy666fly commented 10 months ago

Hi! Thank you so much for sharing such an exciting library for Bayesian Deep Learning. When I try to use Conv2dReparameterization from the Bayesian layers, I find that the calculation of the ELBO lacks sampling, while in the original paper sampling is necessary. Moreover, the mixture-of-Gaussians prior also does not seem to be implemented. Why did you implement the Bayesian layer without sampling and the mixture Gaussian prior? Do I need to add them myself? Looking forward to your reply.

ranganathkrishnan commented 9 months ago

Hi @dzy666fly, Thank you for using the library and for your questions. Monte Carlo (MC) sampling is performed to marginalize and compute the ELBO; please refer to the example below, where you can choose the number of MC samples: https://github.com/IntelLabs/bayesian-torch/blob/93cf0d3d43d399da3b2d114e2e257389b3354d05/bayesian_torch/examples/main_bayesian_cifar_dnn2bnn.py#L401
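For context, here is a minimal sketch of that MC-averaging pattern, assuming the `dnn_to_bnn`/`get_kl_loss` workflow from the repository README; the toy model, hyperparameters, and dummy data are placeholders rather than the settings used in the linked example:

```python
import torch
import torch.nn as nn
from bayesian_torch.models.dnn_to_bnn import dnn_to_bnn, get_kl_loss

# Placeholder deterministic CNN, converted in place to a Bayesian NN.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
bnn_prior_parameters = {
    "prior_mu": 0.0,
    "prior_sigma": 1.0,
    "posterior_mu_init": 0.0,
    "posterior_rho_init": -3.0,
    "type": "Reparameterization",
    "moped_enable": False,
    "moped_delta": 0.5,
}
dnn_to_bnn(model, bnn_prior_parameters)

criterion = nn.CrossEntropyLoss()
num_mc = 10                                # number of MC samples per batch
batch_size = 32
x = torch.randn(batch_size, 3, 32, 32)     # dummy inputs
y = torch.randint(0, 10, (batch_size,))    # dummy labels

outputs, kls = [], []
for _ in range(num_mc):
    out = model(x)                 # each forward pass draws fresh weight samples
    outputs.append(out)
    kls.append(get_kl_loss(model))
output = torch.mean(torch.stack(outputs), dim=0)
kl = torch.mean(torch.stack(kls), dim=0)

# Negative ELBO (up to constants): expected NLL plus KL, scaled by batch size.
loss = criterion(output, y) + kl / batch_size
loss.backward()
```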

We have implemented the Gaussian prior (which is also discussed in the original paper) for scaling variational inference to larger Bayesian neural networks (https://ojs.aaai.org/index.php/AAAI/article/view/5875).
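For a single layer, the Gaussian prior can also be set directly when constructing the layer. A small sketch, assuming the `prior_mean`/`prior_variance` constructor arguments and the `kl_loss()` method of the reparameterization layers (verify the exact signatures against your installed version):

```python
import torch
from bayesian_torch.layers import Conv2dReparameterization

# Bayesian conv layer with a Gaussian prior N(0, 1) over the weights.
conv = Conv2dReparameterization(
    in_channels=3,
    out_channels=16,
    kernel_size=3,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

out = conv(torch.randn(1, 3, 32, 32))  # depending on version, may return (output, kl)
kl = conv.kl_loss()                    # closed-form KL against the Gaussian prior
```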

I hope this helps.

dzy666fly commented 9 months ago

Thank you so much for the timely reply. I misunderstood the ELBO in the training process before. Now perhaps my only remaining confusion is the scale mixture of two Gaussian distributions in the original paper (Weight Uncertainty in Neural Networks). I will try to implement it myself. By the way, if I use this library in the code for my paper, how should I cite it: in a footnote in the paper, or just in the README.md of the code release on GitHub?

ranganathkrishnan commented 9 months ago


@dzy666fly The Gaussian prior is commonly used in mean-field variational inference for Bayesian neural networks. Please feel free to send a PR if you implement support for the scale mixture prior.
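If anyone does implement the scale mixture prior from Blundell et al. (2015), note that the KL term against a mixture prior has no closed form, so it is typically estimated by Monte Carlo as log q(w) − log p(w) over sampled weights. A standalone sketch of that estimator (not part of bayesian-torch; the function names, softplus parameterization, and default mixture parameters are illustrative):

```python
import torch
import torch.distributions as dist

def scale_mixture_log_prob(w, pi=0.5, sigma1=1.0, sigma2=0.0025):
    """log p(w) under the scale mixture prior of Blundell et al. (2015):
    p(w) = pi * N(0, sigma1^2) + (1 - pi) * N(0, sigma2^2)."""
    log_p1 = dist.Normal(0.0, sigma1).log_prob(w) + torch.log(torch.tensor(pi))
    log_p2 = dist.Normal(0.0, sigma2).log_prob(w) + torch.log(torch.tensor(1.0 - pi))
    return torch.logsumexp(torch.stack([log_p1, log_p2]), dim=0)

def mc_kl_scale_mixture(mu, rho, num_samples=1):
    """Monte Carlo estimate of KL[q(w | mu, sigma) || p(w)] with the scale
    mixture prior, where sigma = softplus(rho) as in reparameterization layers."""
    sigma = torch.nn.functional.softplus(rho)
    q = dist.Normal(mu, sigma)
    kl = 0.0
    for _ in range(num_samples):
        w = q.rsample()  # reparameterized weight sample
        kl = kl + (q.log_prob(w) - scale_mixture_log_prob(w)).sum()
    return kl / num_samples

# Example usage with variational parameters of a small conv kernel.
mu = torch.zeros(16, 3, 3, 3, requires_grad=True)
rho = torch.full((16, 3, 3, 3), -3.0, requires_grad=True)
kl_estimate = mc_kl_scale_mixture(mu, rho, num_samples=5)
```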

Here is the BibTeX:

@software{krishnan2022bayesiantorch,
  author       = {Ranganath Krishnan and Pi Esposito and Mahesh Subedar},
  title        = {Bayesian-Torch: Bayesian neural network layers for uncertainty estimation},
  month        = jan,
  year         = 2022,
  doi          = {10.5281/zenodo.5908307},
  url          = {https://doi.org/10.5281/zenodo.5908307},
  howpublished = {\url{https://github.com/IntelLabs/bayesian-torch}}
}