kumar-shridhar / PyTorch-BayesianCNN

Bayesian Convolutional Neural Network with Variational Inference based on Bayes by Backprop in PyTorch.
MIT License

Loss function #50

Open quangnhien opened 4 years ago

quangnhien commented 4 years ago

[screenshot of the loss function code]

I see that you multiply nll_loss by train_size, which makes the value of the loss very large. I don't understand why you do that.

Could you explain it to me?
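For context, a minimal sketch of the kind of loss being discussed, in which a batch-averaged nll_loss is multiplied by train_size before the KL term is added; the class and argument names here follow the thread and may not match the repository exactly:

```python
import torch.nn as nn
import torch.nn.functional as F

class ELBO(nn.Module):
    """Minibatch loss: dataset-scaled negative log-likelihood plus a weighted KL term."""
    def __init__(self, train_size):
        super().__init__()
        self.train_size = train_size  # N, the number of training examples

    def forward(self, log_probs, target, kl, beta):
        # nll_loss with reduction='mean' averages over the minibatch;
        # multiplying by train_size rescales it to an estimate of the
        # negative log-likelihood over the whole dataset, which is why
        # the reported loss value looks very large.
        return self.train_size * F.nll_loss(log_probs, target, reduction='mean') + beta * kl
```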

yichong96 commented 3 years ago

I am also slightly confused by this implementation. In the Bayes by Backprop paper by Blundell et al., the NLL term is an expectation over weights drawn from the variational posterior. I am not sure why there is a multiplication by self.train_size.
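One way to see where the factor comes from: in Blundell et al. the complexity cost is counted once per epoch, so with M minibatches the KL term is weighted by 1/M (or some π_i summing to 1). Scaling the batch-mean NLL up by the train size N gives the same objective up to an overall factor of M. A small illustrative check (all numbers and names are made up):

```python
import torch
import torch.nn.functional as F

N, M, B = 60000, 600, 100                    # train size, number of batches, batch size
log_probs = torch.randn(B, 10).log_softmax(dim=1)
target = torch.randint(0, 10, (B,))
kl = torch.tensor(1234.5)                    # placeholder for the accumulated KL divergence

# Scaling discussed in this thread: batch-mean NLL rescaled to the full dataset.
loss_scaled = N * F.nll_loss(log_probs, target, reduction='mean') + kl
# Paper-style minibatch weighting: per-batch NLL sum plus KL / M.
loss_paper = F.nll_loss(log_probs, target, reduction='sum') + kl / M

print(torch.allclose(loss_scaled, M * loss_paper))  # True: identical up to the constant factor M
```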

jiaohuix commented 3 years ago

nll_loss here is used to compute the negative log-likelihood -log P(D|w_i) from Blundell's paper. In my opinion, the code should not apply log_softmax, average the probabilities across samples, and then use the mean probabilities and targets to get the cross entropy (softmax + log + nll_loss). Instead, for each set of logits produced by a sampled model, the cross entropy should be computed directly from the logits and targets, giving one negative log-likelihood per sample, and these losses should be summed over the number of samples. This is the third term of the objective F(D, θ) in the paper. In addition, the class should not be named ELBO: beta * kl here corresponds to the complexity-cost part of the -ELBO loss in the paper, but the kl computed here is the KL divergence between q and the prior, whereas the paper minimizes -ELBO by evaluating log q(w_i|θ) - log P(w_i) at the sampled weights.
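A hedged sketch of the alternative jiaohuix describes: draw several weight samples, compute the cross entropy directly from each set of logits (instead of averaging probabilities first), and combine the per-sample data terms before adding the KL penalty. `model`, its assumed `kl_loss()` helper, and the other names here are illustrative, not the repository's actual API:

```python
import torch.nn.functional as F

def sampled_negative_elbo(model, x, target, num_samples, train_size, beta):
    """Monte Carlo estimate of -ELBO with one data term per weight sample."""
    nll, kl = 0.0, 0.0
    for _ in range(num_samples):
        logits = model(x)                              # stochastic forward pass: fresh weights w_i
        nll = nll + F.cross_entropy(logits, target)    # -log P(D|w_i) for this sample (batch mean)
        kl = kl + model.kl_loss()                      # assumed helper returning the summed KL over layers
    nll = nll / num_samples
    kl = kl / num_samples
    # Rescale the batch-mean NLL to the full dataset so it balances the KL term.
    return train_size * nll + beta * kl
```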