piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

torch.nn.Sequential with BLiTZ #113

Closed jeiglsperger closed 1 year ago

jeiglsperger commented 1 year ago

I created an LSTM with BLiTZ and torch.nn.Sequential in the following way:

model = []
model.append(PrepareForlstm())
for layer in range(n_layers):
    if layer == 0:
        model.append(BayesianLSTM(in_features=n_feature, out_features=lstm_hidden_dim, bias=bias,
                                  prior_sigma_1=prior_sigma_1, prior_sigma_2=prior_sigma_2, prior_pi=prior_pi,
                                  posterior_mu_init=posterior_mu_init, posterior_rho_init=posterior_rho_init,
                                  freeze=freeze, peephole=peephole))
    else:
        model.append(BayesianLSTM(in_features=lstm_hidden_dim, out_features=lstm_hidden_dim, bias=bias,
                                  prior_sigma_1=prior_sigma_1, prior_sigma_2=prior_sigma_2, prior_pi=prior_pi,
                                  posterior_mu_init=posterior_mu_init, posterior_rho_init=posterior_rho_init,
                                  freeze=freeze, peephole=peephole))
model.append(GetOutputZero())
model.append(PrepareForDropout())
model.append(torch.nn.Dropout(p))
model.append(BayesianLinear(in_features=lstm_hidden_dim, out_features=self.n_outputs))
return torch.nn.Sequential(*model)

using the self-made classes PrepareForlstm(), GetOutputZero(), and PrepareForDropout():

# Class to keep only the first output of an LSTM layer, discarding the hidden state
class GetOutputZero(torch.nn.Module):
    def __init__(self):
        super(GetOutputZero, self).__init__()

    def forward(self, x):
        lstm_out, (hn, cn) = x
        return lstm_out

# Class to reshape the data into a shape suitable for an LSTM layer
class PrepareForlstm(torch.nn.Module):
    def __init__(self):
        super(PrepareForlstm, self).__init__()

    def forward(self, x):
        return x.view(x.shape[0], x.shape[1], -1)

# Class to reshape the data for a dropout layer by keeping only the last time step
class PrepareForDropout(torch.nn.Module):
    def __init__(self):
        super(PrepareForDropout, self).__init__()

    def forward(self, lstm_out):
        return lstm_out[:, -1, :]

But when I feed in a torch tensor, I get the error AttributeError: 'tuple' object has no attribute 'size'. So somewhere between the layers a tuple is passed instead of a tensor, but I can't figure out where, or what my mistake is. Does anyone else have experience with BLiTZ and torch.nn.Sequential?
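
One way to locate where the tuple appears is to iterate over the Sequential manually and print what flows into each layer; a minimal sketch, where seq_model stands for the Sequential returned above and the shapes are placeholders:

import torch

# seq_model is the torch.nn.Sequential built above; shapes are placeholders.
x = torch.randn(16, 10, 8)  # (batch, seq_len, n_feature)
for layer in seq_model:
    print(type(layer).__name__, "receives", type(x).__name__)
    x = layer(x)
# With two or more stacked BayesianLSTM layers, the second one receives
# a tuple instead of a Tensor, which raises the AttributeError above.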

jeiglsperger commented 1 year ago

I figured it out myself. I had to add a GetOutputZero() layer after every BayesianLSTM layer, so that only the output tensor is passed to the next layer, not the tuple of output and hidden state:

model = []
model.append(PrepareForlstm())
for layer in range(n_layers):
    if layer == 0:
        model.append(BayesianLSTM(in_features=n_feature, out_features=lstm_hidden_dim, bias=bias,
                                  prior_sigma_1=prior_sigma_1, prior_sigma_2=prior_sigma_2, prior_pi=prior_pi,
                                  posterior_mu_init=posterior_mu_init, posterior_rho_init=posterior_rho_init,
                                  freeze=freeze, peephole=peephole))
    else:
        model.append(BayesianLSTM(in_features=lstm_hidden_dim, out_features=lstm_hidden_dim, bias=bias,
                                  prior_sigma_1=prior_sigma_1, prior_sigma_2=prior_sigma_2, prior_pi=prior_pi,
                                  posterior_mu_init=posterior_mu_init, posterior_rho_init=posterior_rho_init,
                                  freeze=freeze, peephole=peephole))
    # Unwrap the LSTM's (output, (hn, cn)) tuple so the next layer receives a tensor
    model.append(GetOutputZero())
model.append(PrepareForDropout())
model.append(torch.nn.Dropout(p))
model.append(BayesianLinear(in_features=lstm_hidden_dim, out_features=self.n_outputs))

return torch.nn.Sequential(*model)
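
For completeness, a quick smoke test of the assembled model; the sizes below are arbitrary placeholders and BLiTZ's default priors are assumed:

import torch
from blitz.modules import BayesianLSTM, BayesianLinear

# Placeholder sizes for a smoke test; default priors are used.
n_feature, lstm_hidden_dim, n_outputs = 8, 16, 1
model = torch.nn.Sequential(
    PrepareForlstm(),
    BayesianLSTM(in_features=n_feature, out_features=lstm_hidden_dim),
    GetOutputZero(),  # unwrap (output, (hn, cn)) -> output
    PrepareForDropout(),
    torch.nn.Dropout(0.2),
    BayesianLinear(in_features=lstm_hidden_dim, out_features=n_outputs),
)
x = torch.randn(4, 10, n_feature)  # (batch, seq_len, features)
print(model(x).shape)  # expected: torch.Size([4, 1])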