ENH: Adding INLA to PyMC step 1: get a Laplace Approximation

theorashid commented 1 year ago

Context for the issue:

cc: @bwengals @athowes @junpenglao

previous closed LA/INLA issues: https://github.com/pymc-devs/pymc/pull/4847 and https://github.com/pymc-devs/pymc/issues/3242

There are three steps to getting something R-INLA-ish in PyMC:

1. Laplace approximation.
2. Embedded (marginal) Laplace approximation, with another inference method on the hyperparameters (probably HMC, as the Stan team looked into, though R-INLA uses numerical integration and TMB uses empirical Bayes).
3. Sparse Cholesky in PyTensor to get R-INLA speed (see Dan Simpson's blog; maybe PyMC can interface with CHOLMOD).

The first step is getting a Laplace approximation. This works well for models such as certain GLMs, or models with splines, where many of the posteriors are approximately Gaussian. It could be bundled into the ADVI interface, as NumPyro does. It looks like this PR got fairly close in pymc3.
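
To make the idea concrete, here is a minimal sketch of a Laplace approximation in plain Python. It does not use PyMC or any existing PyMC API: the toy model (Poisson counts with a Normal prior on the log-rate), the data, and the function name `laplace_approx_1d` are all assumptions chosen for illustration. The approximation finds the posterior mode with Newton's method and uses the curvature there as the precision of a Gaussian.

```python
import math


def laplace_approx_1d(grad, hess, x0=0.0, tol=1e-10, max_iter=100):
    """Return (mode, sd) of a Gaussian approximation to a 1-D log density.

    grad and hess are the first and second derivatives of the
    (unnormalised) log posterior. Hypothetical helper, for illustration only.
    """
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)  # Newton step towards the mode
        x -= step
        if abs(step) < tol:
            break
    # The Gaussian's variance is the inverse of the negative Hessian at the mode.
    return x, math.sqrt(-1.0 / hess(x))


# Toy model (an assumption): y_i ~ Poisson(exp(x)), x ~ Normal(0, prior_sd).
y = [3, 5, 4, 6, 2]
n, total = len(y), sum(y)
prior_sd = 10.0


def grad(x):
    return total - n * math.exp(x) - x / prior_sd**2


def hess(x):
    return -n * math.exp(x) - 1.0 / prior_sd**2


mode, sd = laplace_approx_1d(grad, hess, x0=math.log(total / n))
print(f"posterior of x is approximately Normal({mode:.4f}, sd={sd:.4f})")
```

In a real implementation the gradient and Hessian would come from autodiff on the model's log probability rather than being written by hand.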

Hopefully it won't be too difficult for someone who knows the PyMC ADVI interface well. It's pretty fresh, though, so it should probably go in pymc-experimental first. If anyone wants to attempt parts 2 or 3, those should definitely be in pymc-experimental.

Some resources:

My colleague Adam Howes's WIP thesis chapter on (R-)INLA: https://athowes.github.io/thesis/naomi-aghq.html
Dan Simpson's blog: https://dansblog.netlify.app/posts/2022-03-22-a-linear-mixed-effects-model/a-linear-mixed-effects-model
Junpeng's attempt in pymc3: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/main/Ports/Laplace%20approximation%20in%20pymc3.ipynb
That Stan team paper: https://arxiv.org/abs/2004.12550

bwengals commented 1 year ago

I'm particularly interested in the LA + HMC option, which seems like the best “fit” within the rest of PyMC. It would be amazing to be able to specify which parts of the model you want to approximate with LA. A full INLA-clone type of project would be super cool too, though. Would INLA benefit from autodiff from PyTensor? (I don't know if it currently relies on AD.)

There are plenty of nuances with the HMC approach.

Re the sparse stuff: this project adds a banded-matrix Cholesky and other related ops, with gradients, to TensorFlow. Might be another place to start?
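
As a rough illustration of why banded structure matters, here is a minimal sketch of a Cholesky factorisation for the simplest banded case, a symmetric positive-definite tridiagonal matrix. It is plain Python, forward pass only (no gradients), and is not taken from the TensorFlow project mentioned above or from PyTensor; the function name `cholesky_tridiagonal` and the example matrix are assumptions. The factor keeps the same band, so the cost is O(n) rather than the O(n^3) of a dense factorisation.

```python
import math


def cholesky_tridiagonal(diag, off):
    """Cholesky factor of an SPD tridiagonal matrix A = L L^T in O(n) time.

    diag: main diagonal of A (length n)
    off:  sub-diagonal of A (length n - 1)
    Returns (l_diag, l_off): the diagonal and sub-diagonal of L.
    Hypothetical helper, for illustration only.
    """
    n = len(diag)
    l_diag = [0.0] * n
    l_off = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(1, n):
        # A[i][i-1] = L[i][i-1] * L[i-1][i-1]
        l_off[i - 1] = off[i - 1] / l_diag[i - 1]
        # A[i][i] = L[i][i-1]^2 + L[i][i]^2
        l_diag[i] = math.sqrt(diag[i] - l_off[i - 1] ** 2)
    return l_diag, l_off


# Example: a second-difference matrix, similar in shape to a random-walk precision.
diag = [2.0, 2.0, 2.0, 2.0]
off = [-1.0, -1.0, -1.0]
l_diag, l_off = cholesky_tridiagonal(diag, off)

# log-determinant of A from the factor: 2 * sum(log(L_ii))
log_det = 2.0 * sum(math.log(v) for v in l_diag)
print(l_diag, l_off, log_det)
```

The log-determinant computed at the end is the kind of quantity a Laplace approximation over a Gaussian latent field would need repeatedly, which is where the sparse factorisation would pay off.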

athowes commented 1 year ago

in particular to the LA + HMC option

If useful to see, the tmbstan R package (described in this paper) implements the HMC with LA option in the function tmbstan::tmbstan(..., laplace = TRUE).
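
For readers who want the gist of that option without reading the R code, here is a minimal sketch in plain Python of the quantity involved. It is not the tmbstan or TMB implementation: the toy model (Poisson counts, one latent log-rate `x` with prior standard deviation `sigma` as the hyperparameter), the data, and the function names are all assumptions. The latent variable is integrated out with a Laplace approximation, which leaves an approximate log marginal likelihood in `sigma` alone. A sampler such as HMC would then target that function plus a hyperprior; here a small grid stands in for the sampler.

```python
import math

# Toy model (an assumption): y_i ~ Poisson(exp(x)), x ~ Normal(0, sigma).
y = [3, 5, 4, 6, 2]
n, total = len(y), sum(y)


def log_joint(x, sigma):
    """log p(y, x | sigma), including all normalising constants."""
    log_lik = total * x - n * math.exp(x) - sum(math.lgamma(v + 1) for v in y)
    log_prior = -0.5 * math.log(2 * math.pi * sigma**2) - x**2 / (2 * sigma**2)
    return log_lik + log_prior


def log_marginal_laplace(sigma):
    """Laplace approximation to log p(y | sigma), with x integrated out."""
    x = math.log(total / n)
    for _ in range(100):  # Newton iterations for the conditional mode of x
        g = total - n * math.exp(x) - x / sigma**2
        h = -n * math.exp(x) - 1.0 / sigma**2
        step = g / h
        x -= step
        if abs(step) < 1e-10:
            break
    h = -n * math.exp(x) - 1.0 / sigma**2
    # log p(y | sigma) ~= log p(y, x* | sigma) + 0.5*log(2*pi) - 0.5*log(-H)
    return log_joint(x, sigma) + 0.5 * math.log(2 * math.pi) - 0.5 * math.log(-h)


# Stand-in for the outer inference: evaluate on a grid with a flat hyperprior.
sigmas = [0.1, 0.5, 1.0, 2.0, 5.0]
log_ms = [log_marginal_laplace(s) for s in sigmas]
m = max(log_ms)
weights = [math.exp(v - m) for v in log_ms]
norm = sum(weights)
weights = [w / norm for w in weights]
for s, w in zip(sigmas, weights):
    print(f"sigma={s}: weight={w:.3f}")
```

With many latent variables, the scalar Hessian becomes a matrix and its log-determinant is what makes a fast (sparse) Cholesky attractive.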

dont know if it currently relies on AD

R-INLA currently doesn't use AD. See this discussion thread on the R-INLA user Google group.
