Closed: jbuxbaum closed this issue 3 months ago
Hi Jason,
There isn't a way to do this, and I would dissuade you from trying.
For optimization-based methods, there isn't an intercept anyway, and it's important to retain all levels of a categorical variable to ensure all of them are treated equally in the optimization. The same is true for machine learning methods. These methods usually don't require full-rank covariate matrices.
For parametric modeling (GLM), it is always better to include the intercept in the model than to omit it. Even if you know the true intercept in the data-generating model is 0 (and why would that ever be the case?), you will get better-performing weights by allowing the intercept to be estimated.

If you are not concerned with changing the model but just want to omit a reference category for a categorical predictor, that suggests you are interested in viewing and interpreting the propensity score model. I recommend against this; it does not provide useful substantive information and can be misleading. The size and significance of coefficients should not determine whether they are included in the model; rather, the balance achieved by the weights and their variability should be used to assess whether a model specification is adequate. If you really want to see and interpret the propensity score model, you will have to fit it on your own outside of weightit().
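To make that last point concrete, here is a minimal sketch (not from the thread) of estimating the weights with weightit(), judging the specification by balance and weight variability, and then fitting the propensity score model separately with glm() if you still want to inspect it. The lalonde data from cobalt and the particular covariate set are assumptions chosen for illustration.

```r
# Sketch only: the lalonde data and this covariate set are illustrative.
library("WeightIt")
library("cobalt")

data("lalonde", package = "cobalt")

# Estimate ATT weights with a logistic-regression propensity score.
# weightit() retains the intercept and all levels of `race` internally.
w <- weightit(treat ~ age + educ + race + married + re74 + re75,
              data = lalonde, method = "glm", estimand = "ATT")

# Judge the specification by weight variability and covariate balance,
# not by the coefficients of the propensity score model.
summary(w)            # distribution and variability of the weights
bal.tab(w, un = TRUE) # balance before and after weighting

# If you really want to inspect the propensity score model itself,
# fit the same logistic regression yourself outside weightit().
ps_fit <- glm(treat ~ age + educ + race + married + re74 + re75,
              data = lalonde, family = binomial("logit"))
summary(ps_fit)
```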
These are my best reasons for not allowing removal of the intercept, but if you have a good reason to, let me know and I'll try to help or explain why I disagree.
This is an incredibly helpful explanation. Thanks so much, Noah.
Hi Noah,
Thank you so much for creating/maintaining/enhancing this package.
Quick question -- is it possible to remove the intercept? Or otherwise specify which variable serves as the omitted reference category (without doing the omitting manually)?
Thanks a lot, Jason