piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

What does sample_nbr mean in the sample_elbo method ? #79

Closed CatNofishing closed 3 years ago

CatNofishing commented 3 years ago

[screenshot of the sample_elbo method]

Sorry, I could not find any documentation about this.

piEsposito commented 3 years ago

Hello @CatNofishing - and thank you so much for using BLiTZ.

The sample_elbo method samples the ELBO loss of our variational estimator: it performs sample_nbr stochastic forward passes and averages, over those passes, the fitting loss plus the weighted complexity (KL) loss of the model.
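In rough pseudocode, the idea looks something like this (a sketch, not the exact BLiTZ source; it assumes the nn_kl_divergence helper that @variational_estimator adds to the module):

def sample_elbo_sketch(model, inputs, labels, criterion, sample_nbr, complexity_cost_weight=1.0):
    # average the loss over sample_nbr stochastic forward passes;
    # each pass re-samples the weights of every Bayesian layer
    loss = 0.0
    for _ in range(sample_nbr):
        outputs = model(inputs)
        loss += criterion(outputs, labels)                          # data-fitting term
        loss += model.nn_kl_divergence() * complexity_cost_weight   # weighted complexity (KL) term
    return loss / sample_nbr

So sample_nbr is simply the number of Monte Carlo samples used to estimate the ELBO: more samples give a less noisy loss estimate at the cost of more forward passes.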

Does that answer your question?

best regards, -pi

CatNofishing commented 3 years ago

@piEsposito Thanks for your reply. I used BLiTZ to build a simple Bayesian network for prediction. I tried many times, but the prediction results were poor; in contrast, an ordinary neural network with the same structure predicted very well. Can you help me? I find it really hard to get good predictions; maybe there is something wrong with my parameter settings. The prediction result is shown in the figure below, together with my code.

code

from blitz.modules import BayesianLinear
from blitz.utils import variational_estimator
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
import os

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'

@variational_estimator
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Sequential(
            BayesianLinear(30, 10),
            nn.ReLU(),
            BayesianLinear(10, 5),
            nn.ReLU(),
            BayesianLinear(5, 1)
        )

    def forward(self, x_train):
        x = self.layer(x_train)
        return x

if __name__ == '__main__':

    train_x = torch.tensor(np.load('./train_x.npy'),
                           dtype=torch.float)  # size(746,30)
    train_y = torch.tensor(np.load('./train_y.npy'),
                           dtype=torch.float).reshape(-1, 1)  # size(746,1)
    test_x = torch.tensor(np.load('./test_x.npy'),
                          dtype=torch.float)  # size(320,30)
    test_y = torch.tensor(np.load('./test_y.npy'),
                          dtype=torch.float).reshape(-1, 1)  # size(320,1)

    model = Model()
    criterio = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=1e-2)

    for epoch in range(1000):
        # sample_elbo performs its own forward passes, so a separate model(train_x) call is not needed
        loss = model.sample_elbo(inputs=train_x,
                                 labels=train_y,
                                 criterion=criterio,
                                 sample_nbr=3)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if epoch % 100 == 0:  # print progress every 100 epochs ("% 1000" would only print once)
            print(loss.item())

    plt.plot(test_y.numpy().reshape(-1), label='actual')
    plt.plot(model(test_x).detach().numpy().reshape(-1), label='predict')
    plt.legend()
    plt.show()
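By the way, am I right that every forward pass through a Bayesian layer samples a new set of weights, so the plot above is a single stochastic draw of the network? I could average several draws instead, roughly like this (untested sketch, the 20 passes are arbitrary):

# average several stochastic forward passes instead of plotting a single draw
with torch.no_grad():
    preds = torch.stack([model(test_x) for _ in range(20)])  # shape (20, 320, 1)
    mean_pred = preds.mean(dim=0)                            # Monte Carlo mean prediction
plt.plot(test_y.numpy().reshape(-1), label='actual')
plt.plot(mean_pred.numpy().reshape(-1), label='mean predict')
plt.legend()
plt.show()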

data

https://github.com/CatNofishing/python-demo/tree/master/data

result

[figure: prediction result on the test set (actual vs. predicted)]

sansiro77 commented 3 years ago

In the original paper, the negative log-likelihood term of the total loss is a sum over the training samples. Since the default criterion averages over samples instead, the KL term should be scaled down to match (here by 1/N, the number of training samples). Also, if you use minibatches for training, complexity_cost_weight should be corrected accordingly; see the sketch after the snippet below.

loss = model.sample_elbo(inputs=train_x, 
                         labels=train_y, 
                         criterion=criterio, 
                         sample_nbr=10, 
                         complexity_cost_weight=1./train_x.shape[0])
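And with minibatches, the same correction would look roughly like this (train_loader, batch_x and batch_y are just placeholder names; the weight keeps the overall 1/N scaling by spreading it across the batches of an epoch):

# minibatch version: len(train_loader) is the number of batches,
# so len(train_loader) * batch_size is roughly the dataset size N
for batch_x, batch_y in train_loader:
    loss = model.sample_elbo(inputs=batch_x,
                             labels=batch_y,
                             criterion=criterio,
                             sample_nbr=3,
                             complexity_cost_weight=1. / (len(train_loader) * batch_x.shape[0]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()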

[Figure_1: prediction with the corrected complexity_cost_weight]

CatNofishing commented 3 years ago

@sansiro77 thanks for your reply😂

piEsposito commented 3 years ago

I think that's solved. Thank you @sansiro77 again. You are my hero.