piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

loss in regression task #83

Closed · flydephone closed this issue 3 years ago

flydephone commented 3 years ago

The sample_elbo function is described as follows:

> The ELBO Loss consists of the sum of the KL Divergence of the model (explained above, interpreted as a "complexity part" of the loss) with the actual criterion - (loss function) of optimization of our model (the performance part of the loss).

But most others calculate the ELBO loss as -log_likelihood + KL. I don't think "the actual criterion" and "-log_likelihood" are equivalent.
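For reference, the negative ELBO that is usually minimized in variational inference is

$$
-\mathrm{ELBO} \;=\; \mathrm{KL}\big(q_\theta(w)\,\|\,p(w)\big) \;-\; \mathbb{E}_{q_\theta(w)}\big[\log p(\mathcal{D}\mid w)\big],
$$

where the expectation over the weights is approximated by Monte Carlo sampling.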

Specifically, I want to know why the loss in the regression task includes the MSE, as follows:

```python
def sample_elbo(self, inputs, labels, criterion, sample_nbr, complexity_cost_weight=1):
    loss = 0
    for _ in range(sample_nbr):
        outputs = self(inputs)
        loss += criterion(outputs, labels)                          # fitting / "performance" part
        loss += self.nn_kl_divergence() * complexity_cost_weight    # KL "complexity" part
    return loss / sample_nbr
```

I think the log_gaussian should be used, since the loss = -log_likelihood + KL. When the task is classification, the log_likelihood can be computed with `torch.nn.CrossEntropyLoss()`. But `torch.nn.MSELoss()` can't give the log_likelihood.
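To make the point concrete, here is a minimal sketch (the `GaussianNLL` name and the fixed `sigma` are my own, not from BLiTZ) of a criterion that returns an actual Gaussian negative log-likelihood. With a fixed noise `sigma` it is just `0.5 * MSE / sigma^2` plus a constant:

```python
import math
import torch
import torch.nn as nn

class GaussianNLL(nn.Module):
    """Negative log-likelihood of labels under N(outputs, sigma^2) with a fixed sigma."""
    def __init__(self, sigma=1.0):
        super().__init__()
        self.sigma = sigma

    def forward(self, outputs, labels):
        var = self.sigma ** 2
        # -log N(labels | outputs, var) = 0.5 * (labels - outputs)^2 / var + 0.5 * log(2*pi*var)
        return (0.5 * (labels - outputs) ** 2 / var).mean() + 0.5 * math.log(2 * math.pi * var)

# With sigma = 1 the difference to 0.5 * MSELoss is the constant 0.5 * log(2*pi):
outputs, labels = torch.randn(8, 1), torch.randn(8, 1)
print(GaussianNLL()(outputs, labels) - 0.5 * nn.MSELoss()(outputs, labels))
```

So even in that case MSE only matches -log_likelihood up to a scale and an offset, and the scale matters relative to the KL term, which is why I'm asking whether a proper log-likelihood should be used instead.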

piEsposito commented 3 years ago

We use the MSELoss only for regression problems. We keep the KL divergence as the complexity cost to try to learn the probabilistic weights while keeping some uncertainty in our network. That's why we use those two different parts of the loss.

The criterion is the "fitting cost" and can be plugged and played with any torch loss function of your choice.
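For example, a rough sketch of a regression training step (the toy model, data, and optimizer below are just illustrative):

```python
import torch
import torch.nn as nn
from blitz.modules import BayesianLinear
from blitz.utils import variational_estimator

# Toy regressor; @variational_estimator adds sample_elbo / nn_kl_divergence to the module.
@variational_estimator
class TinyRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.blinear = BayesianLinear(1, 1)

    def forward(self, x):
        return self.blinear(x)

regressor = TinyRegressor()
optimizer = torch.optim.Adam(regressor.parameters(), lr=0.01)

# toy data: y = 2x + noise
x = torch.linspace(-1, 1, 64).unsqueeze(-1)
y = 2 * x + 0.1 * torch.randn_like(x)

criterion = nn.MSELoss()  # any two-argument torch loss can be plugged in here

for _ in range(100):
    optimizer.zero_grad()
    loss = regressor.sample_elbo(inputs=x,
                                 labels=y,
                                 criterion=criterion,
                                 sample_nbr=3,
                                 complexity_cost_weight=1 / x.shape[0])
    loss.backward()
    optimizer.step()
```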