aleximmer / Laplace

Laplace approximations for Deep Learning.
https://aleximmer.github.io/Laplace
MIT License
436 stars 63 forks

underconfident predictions #141

Closed Snowmanda closed 5 months ago

Snowmanda commented 5 months ago

I am trying to improve the calibration of an image classification model, but I am struggling with vastly underconfident predictions.

Laplace: All_Labels.pdf Resnet: All_Labels.pdf

Something else that strikes me as odd: the Laplace predictions cluster around a few specific values. For example, 0.07400000095367432 is assigned very often, with no granularity:

Label: 0 0.07400000095367432,0.07400000095367432,0.07400000095367432,0.07400000095367432,0.07900000363588333,0.07400000095367432,0.18199999630451202,0.07400000095367432,0.07400000095367432,0.07400000095367432,0.07400000095367432,0.07400000095367432

Label: 0 0.16599999368190765,0.08100000023841858,0.07400000095367432,0.07400000095367432,0.07599999755620956,0.07400000095367432,0.07999999821186066,0.07500000298023224,0.07400000095367432,0.07400000095367432,0.07599999755620956,0.07400000095367432

My model consists of a ResNet-16 with 3 fully connected layers on top of it.

        if activate_laplace:
            print("Laplace starts")
            la_trainloader, la_valloader, la_testloader, nr_channels, nr_labels = choose_dataset(dataset_name, 64)
            if laplace_method == "lastlayer":
                la = Laplace(model, 'classification',
                             subset_of_weights='last_layer',
                             hessian_structure='kron')
                la.fit(la_trainloader)
                la.optimize_prior_precision(method='marglik')

            if laplace_method == "influence":
                subnetwork_mask = LargestMagnitudeSubnetMask(model, n_params_subnet=256)
                subnetwork_indices = subnetwork_mask.select()
                la = Laplace(model, 'classification',
                             subset_of_weights='subnetwork',
                             hessian_structure='full',
                             subnetwork_indices=subnetwork_indices.type(torch.LongTensor),
                             backend=AsdlGGN)
                la.fit(la_trainloader)
                la.optimize_prior_precision(method='marglik')

Is this behaviour intended, or can you see where I made a mistake? Thanks for any help!

Snowmanda commented 5 months ago

I tried the "CV" method and got very similar results. Here is the output on the MNIST dataset (0.99 accuracy).

Pretrained model: combined ResNet-16 (last layer removed) with 3 fully connected layers

class FullyCon(nn.Module):

    def __init__(self, nr_hiddenlayer_width: int = 512) -> None:
        super().__init__()

        self.fc1 = nn.Linear(512, nr_hiddenlayer_width)
        self.fc2 = nn.Linear(nr_hiddenlayer_width, nr_hiddenlayer_width)
        self.fc3 = nn.Linear(nr_hiddenlayer_width, nr_labels)

        self.relu = nn.ReLU()
        # self.softmax = nn.Softmax(dim=1)

    def forward(self, x: Tensor) -> Tensor:
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)

        return x

class CombinedModel(nn.Module):

    def __init__(self, modelA, modelB):
        super().__init__()
        self.modelA = modelA
        self.modelB = modelB

    def forward(self, x):
        x1 = self.modelA(x)
        x2 = self.modelB(x1)
        return x2

if activate_laplace:
    print("Laplace starts")
    la_trainloader, la_valloader, la_testloader, nr_channels, nr_labels = choose_dataset(dataset_name, 64)
    if laplace_method == "lastlayer":
        la = Laplace(model, 'classification',
                     subset_of_weights='last_layer',
                     hessian_structure='kron')
        la.fit(la_trainloader)
        la.optimize_prior_precision(method='CV', val_loader=la_valloader)

Screenshot_20240119_153310

Snowmanda commented 5 months ago

Found the problem: softmax was applied twice, so the model handed probabilities (not logits) to Laplace, which then applied softmax again.
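For anyone hitting the same symptom: a double softmax mathematically caps the confidence, which also explains the repeated ~0.074 and ~0.182 values in the logs above. A minimal sketch in plain Python (12 classes as in the logs; the concrete logit values are made up for illustration):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

K = 12                              # number of classes, as in the logs above
logits = [10.0] + [0.0] * (K - 1)   # hypothetical very confident prediction

p_once = softmax(logits)            # correct: probabilities from raw logits
p_twice = softmax(p_once)           # bug: softmax applied to probabilities

# After one softmax the top class gets ~1.0.  After two, the inputs to the
# outer softmax all lie in [0, 1], so their exponentials differ by a factor
# of at most e ~ 2.718: no class can exceed e / (e + K - 1) ~ 0.198, and
# none can fall below ~1 / (K - 1 + e) ~ 0.073 -- the 0.182 / 0.074
# pattern seen above.
print(round(p_once[0], 3), round(p_twice[0], 3), round(p_twice[1], 3))
```

So however confident the network is, a double softmax with 12 classes can never produce a probability above ~0.198, and most classes pile up near ~0.073 with no granularity. The fix is to have the model return raw logits (keep `nn.Softmax` commented out, as in `FullyCon` above) and let the Laplace predictive produce the probabilities.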