piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

What does sample_nbr mean in the sample_elbo method ? #79

Closed CatNofishing closed 3 years ago

CatNofishing commented 3 years ago

[screenshot of the sample_elbo method]

Sorry, I could not find any documentation about this.

piEsposito commented 3 years ago

Hello @CatNofishing - and thank you so much for using BLiTZ.

The sample_elbo method samples the ELBO loss of our variational estimator: it performs sample_nbr stochastic forward passes and averages, over those passes, the fitting loss plus the weighted complexity (KL) loss of the model.
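In rough pseudocode, the idea looks something like this (a sketch, not the exact BLiTZ source; it assumes the nn_kl_divergence helper that @variational_estimator adds to the module):

def sample_elbo_sketch(model, inputs, labels, criterion, sample_nbr, complexity_cost_weight=1.0):
    # average the loss over sample_nbr stochastic forward passes;
    # each pass re-samples the weights of every Bayesian layer
    loss = 0.0
    for _ in range(sample_nbr):
        outputs = model(inputs)
        loss += criterion(outputs, labels)                          # data-fitting term
        loss += model.nn_kl_divergence() * complexity_cost_weight   # weighted complexity (KL) term
    return loss / sample_nbr

So sample_nbr is simply the number of Monte Carlo samples used to estimate the ELBO: more samples give a less noisy loss estimate at the cost of more forward passes.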

Does that answer your question?

best regards, -pi

CatNofishing commented 3 years ago

@piEsposito Thanks for your reply. I used BLiTZ to build a simple Bayesian network for prediction. I tried many times, but the prediction results were poor; in contrast, an ordinary neural network with the same structure predicted very well. Can you help me? I find it really hard to get good predictions; maybe there is something wrong with my parameter settings. The prediction result is shown in the figure below, together with my code.

code

from blitz.modules import BayesianLinear
from blitz.utils import variational_estimator
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
import os

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'

@variational_estimator
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Sequential(
            BayesianLinear(30, 10),
            nn.ReLU(),
            BayesianLinear(10, 5),
            nn.ReLU(),
            BayesianLinear(5, 1)
        )

    def forward(self, x_train):
        x = self.layer(x_train)
        return x

if __name__ == '__main__':

    train_x = torch.tensor(np.load('./train_x.npy'),
                           dtype=torch.float)  # size(746,30)
    train_y = torch.tensor(np.load('./train_y.npy'),
                           dtype=torch.float).reshape(-1, 1)  # size(746,1)
    test_x = torch.tensor(np.load('./test_x.npy'),
                          dtype=torch.float)  # size(320,30)
    test_y = torch.tensor(np.load('./test_y.npy'),
                          dtype=torch.float).reshape(-1, 1)  # size(320,1)

    model = Model()
    criterio = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=1e-2)

    for epoch in range(1000):
        # sample_elbo performs its own forward passes, so a separate model(train_x) call is not needed
        loss = model.sample_elbo(inputs=train_x,
                                 labels=train_y,
                                 criterion=criterio,
                                 sample_nbr=3)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if epoch % 100 == 0:  # print progress every 100 epochs ("% 1000" would only print once)
            print(loss.item())

    plt.plot(test_y.numpy().reshape(-1), label='actual')
    plt.plot(model(test_x).detach().numpy().reshape(-1), label='predict')
    plt.legend()
    plt.show()
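By the way, am I right that every forward pass through a Bayesian layer samples a new set of weights, so the plot above is a single stochastic draw of the network? I could average several draws instead, roughly like this (untested sketch, the 20 passes are arbitrary):

# average several stochastic forward passes instead of plotting a single draw
with torch.no_grad():
    preds = torch.stack([model(test_x) for _ in range(20)])  # shape (20, 320, 1)
    mean_pred = preds.mean(dim=0)                            # Monte Carlo mean prediction
plt.plot(test_y.numpy().reshape(-1), label='actual')
plt.plot(mean_pred.numpy().reshape(-1), label='mean predict')
plt.legend()
plt.show()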

data

https://github.com/CatNofishing/python-demo/tree/master/data

result

[figure: prediction result on the test set (actual vs. predicted)]

sansiro77 commented 3 years ago

In the original paper, the negative log-likelihood term of the total loss is a sum over the training samples. Since the default criterion averages over samples instead, the KL term should be scaled down to match (here by 1/N, the number of training samples). Also, if you use minibatches for training, complexity_cost_weight should be corrected accordingly; see the sketch after the snippet below.

loss = model.sample_elbo(inputs=train_x, 
                         labels=train_y, 
                         criterion=criterio, 
                         sample_nbr=10, 
                         complexity_cost_weight=1./train_x.shape[0])
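And with minibatches, the same correction would look roughly like this (train_loader, batch_x and batch_y are just placeholder names; the weight keeps the overall 1/N scaling by spreading it across the batches of an epoch):

# minibatch version: len(train_loader) is the number of batches,
# so len(train_loader) * batch_size is roughly the dataset size N
for batch_x, batch_y in train_loader:
    loss = model.sample_elbo(inputs=batch_x,
                             labels=batch_y,
                             criterion=criterio,
                             sample_nbr=3,
                             complexity_cost_weight=1. / (len(train_loader) * batch_x.shape[0]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()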

[Figure_1: prediction with the corrected complexity_cost_weight]

CatNofishing commented 3 years ago

@sansiro77 thanks for your reply😂

piEsposito commented 3 years ago

I think that's solved. Thank you @sansiro77 again. You are my hero.