kumar-shridhar / PyTorch-BayesianCNN

Bayesian Convolutional Neural Network with Variational Inference based on Bayes by Backprop in PyTorch.
MIT License

Old version of KL Divergence #57

Closed: maltetoelle closed this issue 3 years ago

maltetoelle commented 4 years ago

Hi @kumar-shridhar @Piyush-555 ,

I am currently working on a project that uses Bayes by Backprop for image reconstruction with autoencoders, and it works well. However, I have a question about the old version of the KL divergence calculation:

```python
def kl_loss(self):
    return self.weight.nelement() / self.log_alpha.nelement() * calculate_kl(self.log_alpha)


def calculate_kl(log_alpha):
    return 0.5 * torch.sum(torch.log1p(torch.exp(-log_alpha)))
```
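For context, since `alpha = exp(log_alpha)`, the term `log1p(exp(-log_alpha))` is just `softplus(-log_alpha)`, i.e. `log(1 + 1/alpha)`. A quick check of that identity (the variable names below are mine, not from the repo):

```python
import torch
import torch.nn.functional as F

log_alpha = torch.randn(5)                 # arbitrary log(alpha) values
alpha = torch.exp(log_alpha)

term = torch.log1p(torch.exp(-log_alpha))  # as in calculate_kl above
assert torch.allclose(term, torch.log(1 + 1 / alpha))  # log(1 + 1/alpha)
assert torch.allclose(term, F.softplus(-log_alpha))    # softplus form
```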

I do not understand how this is derived. Are the means of the prior and the posterior over the weights assumed to be zero? That is the only explanation I can come up with.

I hope you can explain that to me. Thank you very much in advance!

kumar-shridhar commented 4 years ago

Hi @maltetoelle,

The derivation of KL divergence is well explained here.
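In case that link ever breaks: the expression above matches the `0.5 * log(1 + 1/alpha)` term in the KL approximation derived by Molchanov et al. (2017), "Variational Dropout Sparsifies Deep Neural Networks", for a Gaussian posterior N(theta, alpha * theta^2) against a log-uniform prior. That KL depends only on alpha, not on the posterior mean, which would explain why no mean appears in the code. A sketch comparing the full approximation with the truncated term (the constants are from that paper; the function names and the mapping to this repo's old code are my own reading, not something the code documents):

```python
import torch

# Constants from Molchanov et al. (2017),
# "Variational Dropout Sparsifies Deep Neural Networks".
K1, K2, K3 = 0.63576, 1.87320, 1.48695

def kl_full_approx(log_alpha):
    # -KL ~= k1 * sigmoid(k2 + k3 * log_alpha)
    #        - 0.5 * log(1 + 1/alpha) - k1
    neg_kl = (K1 * torch.sigmoid(K2 + K3 * log_alpha)
              - 0.5 * torch.log1p(torch.exp(-log_alpha))
              - K1)
    return -neg_kl.sum()

def kl_truncated(log_alpha):
    # The old formula above: only the 0.5 * log(1 + 1/alpha) term is kept.
    return 0.5 * torch.log1p(torch.exp(-log_alpha)).sum()
```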