Rethink our `Prior` class and how to work with priors in general

tomicapretto commented 3 years ago

At some point, we will have to update how we ask users to specify prior distributions. I think our approach is quite good for the current state of Bambi (basically, for GLMMs), but it won't be satisfactory in the nearby future if we add planned features such as splines.

Priors and current features

I think our system should be better in the two following scenarios (there may be more, these are the ones that come to my mind right now)

Specify the hyperprior for the standard deviation of a group-specific effect.

Right now, you can't modify just the hyperprior, you have to specify each term prior as a whole. For example, if we want to have a HalfStudentT hyperprior with certain parameter values for the sigma parameter we have to rewrite the whole prior:

priors = {
  "x|group": Prior("Normal", mu=0, sigma=Prior("HalfStudentT", nu=4, sigma=2))
}
Model("y ~ x + (x|group)", data, priors=priors)

For this case, brms has set_prior("<prior>", class = "sd", group = "<group>").

Priors for correlation matrices of group-specific effects.

Since #360 one can have correlated priors for group-specific terms. These effects have a multivariate normal prior with the correlation matrix having an LKJ prior. The eta parameter of this LKJ prior can be passed with the priors_cor argument in Model(). However, I think this shouldn't be the way to go in the future. Instead of adding more and more arguments, we should come up with a way to pass all the prior specifications to the priors argument.

Here brms has set_prior("lkj(2)", class = "cor").

Priors and planned features

If we manage to implement things like

Splines
Non-linear effects
Gaussian processes
More?

we will need to have something prepared to make things simpler when it comes to specifying priors specific to these models.

This section of the brms documentation is a good reference I think because it makes you realize how many things need to be taken into account when deciding how we handle priors. I'm not saying we should follow the same approach, but we should take it as a reference.

aloctavodia commented 3 years ago

Following brms API, has at least two advantages. For the users, similar API between packages. For us, we do not have to think a lot about a good API for all those cases. I still would like to better learn about brms API.

An alternative, off the top of my head, is to offer something closer to PyMC3 syntax. So it would be possible to start from a bambi model and progressively turn it into a PyMC3 model, or the other way around.

tomicapretto commented 2 years ago

One alternative could use a target argument somewhere

For example:

bmb.Prior("Normal", mu=0, sigma=1, target="mu")

but that mixes the arguments of the prior distribution with the target argument. I'm not sure if that's intuitive and/or safe. Maybe

priors = {
    "predictor": [
        {
            "prior": bmb.Prior("Normal", mu=0, sigma=1), "target": "mu"
        }, 
        {
            "prior": bmb.Prior("HalfNormal", sigma=1), "target": "sigma"
        }
    ]
}

but it can be too verbose?

Another idea is to match arguments positionally in a list/tuple

priors = {
    "predictor": [[bmb.Prior("Normal", mu=0, sigma=1), "mu"],[bmb.Prior("HalfNormal", sigma=1), "sigma"]]
}

but I still feel it may be confusing.

Either consciously or unconsciously I'm comparing with set_prior from brms in my mind. But our current API always asks users to pass the term (or classes of terms) together with the prior i.e. {"common": bmb.Prior(...)} while set_prior() has a default class value and you optionally pass the name of the terms..

Let's keep thinking about better approaches...

tomicapretto commented 2 years ago

We also need to consider multivariate terms, like the ones used in the new categorical family or the ones that will be used in ordinal models.

So far, I think the following would work to specify different means and sds for a multivariate parameter

prior = bmb.Prior("Normal", mu=[0, 0.5, 1], sigma=[0.5, 0.5, 0.5])

But it's not possible to different families as priors. This is because we assign the priors for all the dimensions in the multivariate term as a single PyMC distribution. I don't know if it's going to be possible/straightforward to allow different families for different dimensions.

For an example of what I'm saying see the first model in this article https://solomonkurz.netlify.app/post/2021-12-29-notes-on-the-bayesian-cumulative-probit/

tomicapretto commented 2 years ago

I've found brmp (like brms but in Python). It's interesting to review this documentation about Priors https://brmp.readthedocs.io/en/latest/priors.html. It seems to follow the brms way of doing it (as expected)

bambinos / bambi

Rethink our `Prior` class and how to work with priors in general #365

Priors and current features

Priors and planned features