ngreifer / WeightIt

WeightIt: an R package for propensity score weighting
https://ngreifer.github.io/WeightIt/

by = for groups with covariates that have no variance #58

Open mkrasmus opened 3 months ago

mkrasmus commented 3 months ago

Hi @ngreifer, do you have any advice for using WeightIt when some groups (levels of the variable passed to the 'by' argument) have predictors in the PS model with no variation (i.e., redundant covariates)?

For example, suppose we use 'by = race' with the predictors

predvars <- c('age', 'educ', 'nodegree', 'married', 're74', 're75')

but 'married' has no variation within one level of race. In a loop that fits a separate PS model for each level of race, we would do something like the following to drop 'married' when its values have no variance:

cdpred <- predvars[sapply(predvars, function(v) length(unique(lalonde[[v]])) > 1)]
form <- as.formula(paste0("group ~ ", paste0(c(cdpred), collapse = " + ")))
mod <- glm(form, data = lalonde, family = "binomial")

I'm wondering whether WeightIt has something that handles predictors lacking variation when this crops up at one level, or whether it's a problem for 'glm' but perhaps not for the other methods. Any thoughts greatly appreciated, thanks.
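For readers following along, the per-level trimming described above could be sketched roughly as follows. This is only an illustration in base R: the data frame `dat` is simulated stand-in data (not the real lalonde data), and the reduced predictor list and the names `fits` and `kept_by_level` are made up for the example.

```r
# Simulated stand-in data; column names echo the post, values are made up.
set.seed(1)
n <- 300
dat <- data.frame(
  group   = rbinom(n, 1, 0.5),
  race    = sample(c("black", "hispan", "white"), n, replace = TRUE),
  age     = round(runif(n, 18, 55)),
  educ    = sample(6:16, n, replace = TRUE),
  married = rbinom(n, 1, 0.4)
)
# Force 'married' to be constant within one level of race.
dat$married[dat$race == "hispan"] <- 0

predvars <- c("age", "educ", "married")

# Fit one propensity score model per level of race, keeping only the
# predictors that actually vary within that level.
fits <- lapply(split(dat, dat$race), function(d) {
  varies <- vapply(predvars, function(v) length(unique(d[[v]])) > 1, logical(1))
  keep <- predvars[varies]
  form <- reformulate(keep, response = "group")
  list(kept = keep, model = glm(form, data = d, family = binomial))
})

# Predictors retained for each level of race.
kept_by_level <- lapply(fits, `[[`, "kept")
```

The key difference from the one-off snippet is that the variance check runs on each subset `d` and not on the full data, so a predictor is dropped only in the levels where it is constant.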

ngreifer commented 3 months ago

weightit() automatically removes redundant predictors (i.e., those that have no variation or that covary exactly with others) before estimating weights, so there should be no problem with any method. For example, if you include the by variable in the model formula, that induces a redundancy within each group, and it is taken care of automatically. Different methods do this in different ways, but I have a custom function (which is user-facing, should you want to use it) that removes linearly dependent columns prior to estimating the weights.
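To make "removing linearly dependent columns" concrete, here is a minimal base R sketch of one way to do it with a QR decomposition. It is not WeightIt's own code: `drop_dependent_cols()` is a hypothetical helper written for this illustration, and the toy matrix is made up.

```r
# Hypothetical helper (not WeightIt's code): keep a maximal set of linearly
# independent columns, relying on the pivoting done by base R's qr().
drop_dependent_cols <- function(X) {
  # Prepend an intercept so constant columns count as redundant too.
  qrX <- qr(cbind(`(Intercept)` = 1, X))
  keep <- qrX$pivot[seq_len(qrX$rank)]
  # Drop the intercept's index and shift back to X's column positions.
  keep <- sort(keep[keep != 1]) - 1
  X[, keep, drop = FALSE]
}

educ <- c(12, 10, 16, 12, 9, 14)
X <- cbind(
  age     = c(23, 35, 41, 29, 52, 38),
  married = c(1, 1, 1, 1, 1, 1),  # constant: collinear with the intercept
  educ    = educ,
  educ_x2 = 2 * educ              # exact multiple of educ
)

X_full_rank <- drop_dependent_cols(X)
colnames(X_full_rank)
```

In this toy matrix the constant column and the exact multiple are the ones that get dropped. A per-group version of the same idea would apply the helper to each subset of the data separately, which is what makes a covariate that is constant in only one group a non-issue.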