Use `_glm` functions when applicable.

WardBrian / scikit-stan

A sklearn style interface to Stan regression models

https://scikit-stan.readthedocs.io/

BSD 3-Clause "New" or "Revised" License


Use `_glm` functions when applicable. #22

Open WardBrian opened 1 year ago

WardBrian commented 1 year ago

For example, `normal_id_glm` or `bernoulli_logit_glm`. These are much faster because they reduce the amount of autodiff needed.

We may need to refactor how our models are structured to make the best use of these. Also, these assume a separate intercept, so we will need to either avoid using `X` with a column of all ones or disable them when `fit_intercept=False`.
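As a rough illustration of the change being proposed, the sketch below holds two Stan programs as Python strings: a hand-written normal regression and the same model using `normal_id_glm`. The variable names (`MANUAL_NORMAL`, `GLM_NORMAL`) and the data layout are assumptions made for this example, not scikit-stan's actual model files.

```python
# Hypothetical Stan program for a linear regression with a separate intercept.
# The block layout and names (N, K, X, y) are assumptions for illustration.
MANUAL_NORMAL = """
data {
  int<lower=0> N;
  int<lower=0> K;
  matrix[N, K] X;
  vector[N] y;
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(alpha + X * beta, sigma);
}
"""

# Same model, with the likelihood swapped for the GLM-specific function.
# Note that alpha is passed separately, so X must not carry a column of ones.
GLM_NORMAL = MANUAL_NORMAL.replace(
    "y ~ normal(alpha + X * beta, sigma);",
    "y ~ normal_id_glm(X, alpha, beta, sigma);",
)
```

Only the sampling statement changes; the data and parameter blocks stay the same, which is why the intercept handling matters more than the likelihood swap itself.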

bob-carpenter commented 1 year ago

The column-of-1s trick shortens the notation, but it's more efficient to not multiply by 1 and just use `alpha + x * beta`. Also, it allows us to put a broader prior on `alpha` than on the `beta[k]`, which is important if the data is offset from 0 (literally 0 in linear regression, which becomes 0.5 in a logistic regression, etc.)
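To make the equivalence concrete, here is a minimal sketch in plain Python (lists rather than an array library, to stay dependency-free). It computes the linear predictor both ways: with the intercept folded into the design matrix as a column of 1s, and with a separate `alpha`. The function names and the toy numbers are made up for this example.

```python
def eta_with_ones_column(X_ones, coefs):
    """Linear predictor when the intercept is folded into X as a column of 1s."""
    return [sum(x * c for x, c in zip(row, coefs)) for row in X_ones]


def eta_with_intercept(X, alpha, beta):
    """Linear predictor with a separate intercept: alpha + x * beta."""
    return [alpha + sum(x * b for x, b in zip(row, beta)) for row in X]


# Toy data, chosen only for illustration.
X = [[1.0, 2.0], [3.0, 4.0]]
alpha, beta = 0.5, [2.0, -1.0]

# Prepend the all-1s column and put alpha first in the coefficient vector.
X_ones = [[1.0] + row for row in X]

same = eta_with_ones_column(X_ones, [alpha] + beta) == eta_with_intercept(X, alpha, beta)
```

Both forms give the same predictor, but the second skips the multiplications by 1 and keeps `alpha` as its own parameter, so it can receive its own prior.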

You usually want a broader prior on the intercept than on coefficients, too, because it soaks up all the excess from all the other effects.

Is the `fit_intercept=False` option from scikit-learn? Unless your y variables are standardized, you usually want an intercept, so I would strongly discourage people from using this option in the docs.

WardBrian commented 1 year ago

`fit_intercept=False` is a feature in some of the other scikit-learn estimators. The recommended usage is not to actually not fit an intercept, but rather for instances where your design matrix includes the all-1s column. In particular, most of the Python libraries that support the Wilkinson formula syntax produce `X` with a column of 1s by default.

bob-carpenter commented 1 year ago

> recommended usage is not to actually not fit an intercept, but rather for instances where your design matrix includes the all-1s column.

That makes more sense. I had to look up that Wilkinson formula syntax is what lme4 and brms use. Wilkinson was also behind the grammar of graphics which is the basis for ggplot2.

If the column of 1s is going to be forced on you, do you at least know which column it is so that you can define separate priors for intercepts?

WardBrian commented 1 year ago

I think the convention is that the column of 1s comes first. When I talked to @jgabry, he mentioned that rstanarm checks for this and removes that first column, preferring to use the true intercept parameter.

In the long term, I'd like to make a package that uses a formula implementation like formulae, with scikit-stan as a "backend", to make something that looks very much like rstanarm. In that package, I am planning on doing that same chop, but for the lower-level interface we wanted to allow the customization if necessary.
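The "chop" could look roughly like the sketch below: check whether the first column of the design matrix is all 1s and, if so, drop it so that a true intercept parameter can be used instead. This is an assumption about how such a check might be written, not rstanarm's or scikit-stan's actual code; `split_intercept_column` is a hypothetical helper name.

```python
def split_intercept_column(X):
    """Return (X_without_ones, had_ones).

    Column 0 is dropped only if every row starts with exactly 1, following
    the convention that formula libraries put the intercept column first.
    """
    has_ones = bool(X) and all(len(row) > 0 and row[0] == 1 for row in X)
    if not has_ones:
        # Leave the matrix unchanged (copied) when there is no leading 1s column.
        return [list(row) for row in X], False
    return [list(row[1:]) for row in X], True


# Design matrix as a formula library might produce it: intercept column first.
X_formula = [[1, 2.5], [1, -0.3], [1, 0.8]]
X_clean, had_ones = split_intercept_column(X_formula)

# A matrix without a leading 1s column is returned as-is.
X_plain, plain_had_ones = split_intercept_column([[0.5, 1.0], [1.0, 1.0]])
```

A higher-level package could call this before fitting and always fit a true intercept, while the lower-level interface keeps `fit_intercept` available for users who need to control it.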