aleximmer / Laplace

Laplace approximations for Deep Learning.
https://aleximmer.github.io/Laplace
MIT License

Per-layer prior precision for linear last layer #116

Closed. jonasvj closed this issue 1 year ago.

jonasvj commented 1 year ago

Hi,

Thank you for this great library :-)

I was wondering if there is a reason you cannot specify per-layer prior precisions for the last-layer Laplace approximation? I know this sounds contradictory, but for e.g. a linear last layer the n_layers attribute is actually 2, since there is one parameter group for the weights and one for the biases.

In my case, I was trying to set different prior precisions for the weights and the biases, but got an error from prior_precision_diag in LLLaplace when calling fit_laplace. I could implement this with a full diagonal prior precision, but I guess I would then be optimizing over n_params parameters in optimize_prior_precision later instead of just 2.
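For reference, a rough sketch of what I was attempting (model and train_loader stand in for my own setup; the two-element prior precision is the part that fails):

```python
import torch
from laplace import Laplace

# Last-layer Laplace approximation (placeholders for my model and data loader).
la = Laplace(model, 'regression',
             subset_of_weights='last_layer',
             hessian_structure='full')
la.fit(train_loader)

# What I would like: one prior precision for the last-layer weights and one
# for the biases, i.e. a length-2 "per-layer" prior precision.
la.prior_precision = torch.tensor([1.0, 1.0])

# This is where I hit the size-mismatch error from prior_precision_diag in
# LLLaplace; only a scalar or a full length-n_params diagonal seems accepted.
la.log_marginal_likelihood()
```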

wiseodd commented 1 year ago

How about doing it "manually", similar to the example here: https://github.com/AlexImmer/Laplace/blob/8f24a72cfbcd0829c8cdce5452fda5c696693026/examples/regression_example.py#L38-L44 In your case, it would be something like this (see the sketch after the steps):

  1. Define your two scalar variables tau_w and tau_b to optimize.
  2. Define your optimizer opt.
  3. Repeat:
     1. Create a diagonal prior precision based on the two scalars, e.g. the first n_w prior precisions are all tau_w and the remaining are all tau_b. Make sure that this is differentiable, e.g. use torch.ones and indexing.
     2. Feed this diagonal prior precision into la.log_marginal_likelihood and call backward.
     3. Call opt.step() on tau_w and tau_b.
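A minimal sketch of those steps, assuming a model with a linear last layer (last_layer, model, and train_loader are placeholders for your own setup, and the optimizer settings are arbitrary):

```python
import torch
from laplace import Laplace

# Fit the last-layer Laplace approximation as usual.
la = Laplace(model, 'regression',
             subset_of_weights='last_layer',
             hessian_structure='full')
la.fit(train_loader)

# Number of weight and bias parameters in the last linear layer; replace
# `last_layer` with the actual module from your model.
n_w = last_layer.weight.numel()
n_b = last_layer.bias.numel()

# 1. Two scalar hyperparameters, kept in log space so the precisions stay positive.
log_tau_w = torch.zeros(1, requires_grad=True)
log_tau_b = torch.zeros(1, requires_grad=True)

# 2. Optimizer over the two scalars.
opt = torch.optim.Adam([log_tau_w, log_tau_b], lr=1e-1)

# 3. Optimize the negative log marginal likelihood.
for _ in range(100):
    opt.zero_grad()
    # 3.1. Differentiable diagonal prior precision: the first n_w entries are
    #      tau_w, the remaining n_b entries are tau_b.
    prior_prec = torch.cat([
        log_tau_w.exp() * torch.ones(n_w),
        log_tau_b.exp() * torch.ones(n_b),
    ])
    # 3.2. Marginal likelihood with this prior precision, then backprop.
    neg_marglik = -la.log_marginal_likelihood(prior_precision=prior_prec)
    neg_marglik.backward()
    # 3.3. Update tau_w and tau_b.
    opt.step()
```

This keeps the optimization down to just two scalars while still passing a full-length diagonal prior precision, which LLLaplace accepts.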
jonasvj commented 1 year ago

Thank you for the quick reply and suggestion. That seems like a good way to do it, thanks!