aleximmer / Laplace

Laplace approximations for Deep Learning.
https://aleximmer.github.io/Laplace
MIT License

Clarify how the softmax is handled for classification #31

Open edaxberger opened 3 years ago

edaxberger commented 3 years ago

Currently it's not clear how the final softmax is handled in the classification case, which might lead to confusion or unintentional misuse of the library.

There are two things to clarify:

  1. The MAP model passed to Laplace should not apply a softmax (either via an `nn.Softmax()` layer in the model or an `F.softmax()` call in the overridden forward pass) but should return the logits instead. This could probably most easily be addressed by clarifying it in the documentation/readme and additionally raising a warning if the model's outputs on the training set during `fit()` lie in [0, 1] and sum to 1.
  2. The Laplace model applies the softmax internally when making predictions, so the user shouldn't apply another softmax on top. Here we can probably only improve the documentation.
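The warning heuristic proposed in item 1 could be sketched roughly as follows. This is an illustration only, not the library's API; the function name and tolerance are my own choices:

```python
import torch

def looks_like_probabilities(outputs: torch.Tensor, atol: float = 1e-4) -> bool:
    """Heuristic check: do the model's outputs look softmax-ed already?

    Returns True if every entry lies in [0, 1] and each row sums to 1,
    which suggests the model applies a softmax instead of returning logits.
    """
    in_unit_interval = bool(((outputs >= 0) & (outputs <= 1)).all())
    rows_sum_to_one = bool(
        torch.allclose(outputs.sum(dim=-1), torch.ones(outputs.shape[:-1]), atol=atol)
    )
    return in_unit_interval and rows_sum_to_one

# Logits are unbounded; softmax outputs trip the heuristic.
logits = torch.tensor([[2.0, -1.0, 0.5]])
probs = torch.softmax(logits, dim=-1)
```

A `fit()` implementation could call such a check on the first batch of model outputs and emit a `warnings.warn(...)` rather than an error, since low-dimensional regression outputs can also land in [0, 1] by chance.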
youkaichao commented 2 years ago

Currently it raises an exception like `Extension saving to kflr does not have an extension for Module <class 'torch.nn.modules.activation.Softmax'>` :(

edaxberger commented 2 years ago

@youkaichao Thanks for letting us know -- this exception comes from the BackPACK backend we're using to compute second-order information; not sure if the ASDL backend throws an exception when using a softmax model (I'd assume it does not).

For now, just make sure that the MAP model does not have a softmax layer at the end for the library to work properly.
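For models built with `nn.Sequential`, one way to apply this workaround is to drop a trailing softmax layer before fitting. A minimal sketch (the model architecture here is hypothetical, purely for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical MAP classifier; the trailing nn.Softmax is the problematic layer.
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 3),
    nn.Softmax(dim=-1),
)

# Strip the trailing softmax so the model returns logits, as Laplace expects.
if isinstance(model[-1], nn.Softmax):
    model = nn.Sequential(*list(model.children())[:-1])

x = torch.randn(4, 10)
logits = model(x)  # now unnormalized scores, not probabilities
```

If the softmax is applied inside a custom `forward()` via `F.softmax`, it has to be removed there instead; the predictive softmax is then applied by Laplace at prediction time.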