piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

Do you have cuda support? #9

Closed: Archelunch closed this issue 3 years ago

Archelunch commented 4 years ago

Hi! If I run the model on a GPU, it causes RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_addmm

Do you have plans to add cuda support?

piEsposito commented 4 years ago

It should support cuda. Can you give me more info on your code, so I can fix it? Thank you so much for bringing up this issue.

-Pi

Archelunch commented 4 years ago

Model

import torch
import torch.nn as nn
from blitz.modules import BayesianLinear

class SudokuSolver(nn.Module):
    def __init__(self, constraint_mask, n=9, hidden1=128):
        super(SudokuSolver, self).__init__()
        # register as a buffer so .cuda()/.to(device) moves it with the model
        self.register_buffer(
            'constraint_mask', constraint_mask.unsqueeze(-1).unsqueeze(0))
        self.n = n
        self.hidden1 = hidden1

        # Feature vector is the 3 constraints
        self.input_size = 3 * n

        self.l1 = BayesianLinear(self.input_size, self.hidden1, bias=False)
        self.a1 = nn.ReLU()
        self.l2 = BayesianLinear(self.hidden1, n, bias=False)
        self.softmax = nn.Softmax(dim=1)

    # x is a (batch, n^2, n) tensor
    def forward(self, x, return_orig=False):
        n = self.n
        bts = x.shape[0]
        c = self.constraint_mask
        min_empty = (x.sum(dim=2) == 0).sum(dim=1).max()
        x_pred = x.clone()
        for a in range(min_empty):
            # score empty numbers
            constraints = (x.unsqueeze(1).unsqueeze(1) * c).sum(dim=3)
            # empty cells
            empty_mask = (x.sum(dim=2) == 0)

            f = constraints.reshape(bts, n * n, 3 * n)
            y_ = self.l1(f[empty_mask])
            y_ = self.l2(self.a1(y_))

            s_ = self.softmax(y_)

            # Score the rows
            x_pred[empty_mask] = s_

            s = torch.zeros_like(x_pred)
            s[empty_mask] = s_
            # find most probable guess
            score, score_pos = s.max(dim=2)
            mmax = score.max(dim=1)[1]
            # fill it in
            nz = empty_mask.sum(dim=1).nonzero().view(-1)
            mmax_ = mmax[nz]
            # create on the same device as x so index_put_ doesn't mix devices
            ones = torch.ones(nz.shape[0], device=x.device)
            x.index_put_((nz, mmax_, score_pos[nz, mmax_]), ones)
        if return_orig:
            return x
        else:
            return x_pred
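
For reference, a minimal sketch of how a constraint_mask of shape (3, n*n, n*n) could be built (this helper is an illustrative assumption, not part of the thread): for every cell it flags the cells that share its row, column, or box.

import torch

def create_constraint_mask(n=9):
    mask = torch.zeros(3, n * n, n * n)
    box = int(n ** 0.5)
    for a in range(n * n):
        r, c = divmod(a, n)
        mask[0, a, r * n:(r + 1) * n] = 1        # cells in the same row
        mask[1, a, c::n] = 1                     # cells in the same column
        br, bc = (r // box) * box, (c // box) * box
        for rr in range(br, br + box):
            mask[2, a, rr * n + bc:rr * n + bc + box] = 1  # same box
    return mask

Something like SudokuSolver(create_constraint_mask()).cuda() would then match the init below.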

Init

criterion = nn.MSELoss()
sudoku_solver = SudokuSolver(constraint_mask).cuda()

optimizer = optim.Adam(sudoku_solver.parameters(), lr=0.001)

Eval method (X and y on cuda)

def evaluate_solver(solver, X, y):
    preds = solver(X)
    errors = preds.max(dim=2)[1] != y.max(dim=2)[1]
    return errors

evaluate_solver(sudoku_solver, X, y).sum()

piEsposito commented 4 years ago

Thank you. Can you also give me the shapes of tensors X and y, so I can randomly generate them and test the evaluate_solver function? And the full error message? It would help me a lot in addressing this issue.
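
For what it's worth, here is a hedged sketch of generating such random test tensors, assuming the (batch, n^2, n) one-hot layout stated in the model's forward comment; the 50% blanking scheme is purely illustrative.

import torch
import torch.nn.functional as F

batch, n = 4, 9
# random "solutions": one digit per cell, one-hot encoded -> (batch, n*n, n)
digits = torch.randint(0, n, (batch, n * n))
y = F.one_hot(digits, num_classes=n).float().cuda()
# puzzles: blank out roughly half the cells (an all-zero row means "empty")
X = y * (torch.rand(batch, n * n, 1) > 0.5).float().cuda()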

Archelunch commented 4 years ago

You could check my notebook on Kaggle - https://www.kaggle.com/archelunch/sudoku-solver

piEsposito commented 4 years ago

Sorry for the late reply... It actually supports cuda, as you can see by running this:

import torch
import torch.nn as nn
from blitz.modules import BayesianLinear

class net(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = BayesianLinear(27, 7)
        self.l2 = BayesianLinear(7, 9)
    def forward(self, x):
        x_ = self.l1(x)
        x_ = self.l2(x_)
        return x_

yee = net().cuda()
t = torch.ones(10, 27).cuda()

yee(t)

gongguri commented 4 years ago

Hi, thanks for the great repo! I'm trying to test the LSTM example and got the same error when I tried to use cuda.

I tried the code in './blitz-bayesian-deep-learning/blitz/examples/stocks-blstm.ipynb', but when I changed net = NN() to net = NN().cuda(), I got an error in the training part as follows:

File "C:\Users\gongg\Anaconda3\envs\pytorch_1_4_py37\lib\site-packages\blitz\modules\lstm_bayesian_layer.py", line 131, in forward gates = x_t @ self.weight_ih + h_t @ self.weight_hh + self.bias RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_mm

Did I do something wrong? Or could you test your example code with cuda?
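
A quick way to localize this kind of device mismatch, assuming net is the Bayesian LSTM model from the notebook, is to print where each parameter and buffer actually lives:

# print the device of every parameter and buffer to spot tensors left on the CPU
for name, p in net.named_parameters():
    print("param:", name, p.device)
for name, b in net.named_buffers():
    print("buffer:", name, b.device)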

piEsposito commented 4 years ago

Hello, and thank you for using this repo.

Note that this operation was not originally intended to run on cuda (although it is possible): since it is very lightweight, I didn't bother making it cuda-ready. Anyway, here are instructions for adapting the training loop (similar changes will be needed throughout the example, and a contribution with these changes would be very appreciated).

When performing the NN operations on cuda, we don't just have to send the model to the GPU; we also have to send the data we are processing. So the necessary changes, just for the training operations, would be:

First, create the device and send the model to it:

Xs, ys = create_timestamps_ds(close_prices)
X_train, X_test, y_train, y_test = train_test_split(Xs,
                                                    ys,
                                                    test_size=.25,
                                                    random_state=42,
                                                    shuffle=False)

# pick the GPU when available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
ds = torch.utils.data.TensorDataset(X_train, y_train)
dataloader_train = torch.utils.data.DataLoader(ds, batch_size=8, shuffle=True)

# .to(device) moves every parameter of the model to the chosen device
net = NN().to(device)

criterion = nn.MSELoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

Then, in the training and evaluate-on-training operations, the data that will pass through the network should also be sent to our device:

iteration = 0
for epoch in range(10):
    for i, (datapoints, labels) in enumerate(dataloader_train):
        optimizer.zero_grad()

        # sample_elbo runs sample_nbr stochastic forward passes and averages
        # the loss; inputs and labels must live on the same device as the model
        loss = net.sample_elbo(inputs=datapoints.to(device),
                               labels=labels.to(device),
                               criterion=criterion,
                               sample_nbr=3)
        loss.backward()
        optimizer.step()

        iteration += 1
        if iteration % 250 == 0:
            preds_test = net(X_test.to(device))[:, 0].unsqueeze(1)
            loss_test = criterion(preds_test, y_test.to(device))
            print("Iteration: {} Val-loss: {:.4f}".format(str(iteration), loss_test))

Notice that similar changes will be needed in the other operations (such as the confidence-interval evaluation).
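
As an illustration, a hedged sketch of what that adaptation could look like for the confidence-interval part; the function name and the two-sigma bounds are illustrative, not taken from the notebook:

def pred_with_ci(net, X, samples=10):
    # each forward pass samples new weights, so repeating it yields a
    # distribution over predictions
    preds = torch.stack([net(X.to(device))[:, 0] for _ in range(samples)])
    mean, std = preds.mean(dim=0), preds.std(dim=0)
    # mean prediction with rough two-sigma lower and upper bounds
    return mean, mean - 2 * std, mean + 2 * std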

piEsposito commented 3 years ago

Closing due to inactivity.