kumar-shridhar / PyTorch-BayesianCNN

Bayesian Convolutional Neural Network with Variational Inference based on Bayes by Backprop in PyTorch.
MIT License

Old version of KL Divergence #57

Closed: maltetoelle closed this issue 3 years ago

maltetoelle commented 4 years ago

Hi @kumar-shridhar @Piyush-555 ,

I am currently working on a project that uses Bayes by Backprop for image reconstruction with autoencoders, and it works well. However, I have a question about the old version of the KL divergence calculation:

```python
def kl_loss(self):
    return self.weight.nelement() / self.log_alpha.nelement() * calculate_kl(self.log_alpha)


def calculate_kl(log_alpha):
    return 0.5 * torch.sum(torch.log1p(torch.exp(-log_alpha)))
```
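For context, since `alpha = exp(log_alpha)`, the term `log1p(exp(-log_alpha))` is just `softplus(-log_alpha)`, i.e. `log(1 + 1/alpha)`. A quick check of that identity (the variable names below are mine, not from the repo):

```python
import torch
import torch.nn.functional as F

log_alpha = torch.randn(5)                 # arbitrary log(alpha) values
alpha = torch.exp(log_alpha)

term = torch.log1p(torch.exp(-log_alpha))  # as in calculate_kl above
assert torch.allclose(term, torch.log(1 + 1 / alpha))  # log(1 + 1/alpha)
assert torch.allclose(term, F.softplus(-log_alpha))    # softplus form
```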

I do not understand how this is derived. Are the means of the prior and the posterior over the weights assumed to be zero? That is the only explanation I can come up with.

I hope you can explain that to me. Thank you very much in advance!

kumar-shridhar commented 4 years ago

Hi @maltetoelle,

The derivation of KL divergence is well explained here.
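In case that link ever breaks: the expression above matches the `0.5 * log(1 + 1/alpha)` term in the KL approximation derived by Molchanov et al. (2017), "Variational Dropout Sparsifies Deep Neural Networks", for a Gaussian posterior N(theta, alpha * theta^2) against a log-uniform prior. That KL depends only on alpha, not on the posterior mean, which would explain why no mean appears in the code. A sketch comparing the full approximation with the truncated term (the constants are from that paper; the function names and the mapping to this repo's old code are my own reading, not something the code documents):

```python
import torch

# Constants from Molchanov et al. (2017),
# "Variational Dropout Sparsifies Deep Neural Networks".
K1, K2, K3 = 0.63576, 1.87320, 1.48695

def kl_full_approx(log_alpha):
    # -KL ~= k1 * sigmoid(k2 + k3 * log_alpha)
    #        - 0.5 * log(1 + 1/alpha) - k1
    neg_kl = (K1 * torch.sigmoid(K2 + K3 * log_alpha)
              - 0.5 * torch.log1p(torch.exp(-log_alpha))
              - K1)
    return -neg_kl.sum()

def kl_truncated(log_alpha):
    # The old formula above: only the 0.5 * log(1 + 1/alpha) term is kept.
    return 0.5 * torch.log1p(torch.exp(-log_alpha)).sum()
```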