ngreifer / WeightIt

WeightIt: an R package for propensity score weighting
https://ngreifer.github.io/WeightIt/

Understanding weights #35

Closed JoshSchramm94 closed 1 year ago

JoshSchramm94 commented 2 years ago

I was reading through the vignette, and I still have not figured out what these weights actually mean. I ran the code below on the lalonde dataset, and the sum of the weights equals 371.

W.out <- weightit(treat ~ age + educ + race + married + nodegree + re74 + re75,
                  data = lalonde, estimand = "ATT", method = "ps")
summary(W.out)

sum(W.out$weights)

I am a little bit surprised that these do not equal the sample size or something similar, but maybe this is my fault because I associate them with survey weights.

I can also understand why you used "ebal" in the vignette, given the high variability of the weights. However, those weights sum to 614.

When I run WeightIt for my study (N = 117), the sum of my weights is around 230, which I think is quite high; however, I have trouble interpreting the weights. It would be great if someone could help me out.

Thanks a lot in advance

ngreifer commented 2 years ago

The sum of the weights is immaterial to what they do or mean. The functionality of the weights is unaffected by any multiplicative transformation of them. Try multiplying the weights in one of the treatment groups by 100; you will find that the balance statistics and difference in outcome means are the same. (The scaling of the weights does affect treatment effect estimation when including covariates but no treatment-covariate interactions in the outcome model; more on this later.)
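As a minimal sketch of that check (using bal.tab() from cobalt; w2 is just an illustrative name):

library(cobalt)

w2 <- W.out$weights
w2[lalonde$treat == 1] <- w2[lalonde$treat == 1] * 100

# Balance statistics are identical for the original and rescaled weights
bal.tab(treat ~ age + educ + race + married + nodegree + re74 + re75,
        data = lalonde, weights = W.out$weights, estimand = "ATT")
bal.tab(treat ~ age + educ + race + married + nodegree + re74 + re75,
        data = lalonde, weights = w2, estimand = "ATT")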

You should not think of these weights as having the same purpose as survey weights; they are balancing weights in the sense that they yield covariate balance between the treatment groups and between each treatment group and a target population. The weights themselves do not carry information about the size of the population the effect is meant to generalize to. There is a measure that approximates the size of an unweighted sample that would carry the same precision as the weighted sample in question; this is known as the effective sample size (ESS) and is displayed when running summary(). The ESS is also invariant to multiplicative transformations of the weights. When using a weighted outcome regression to estimate the treatment effect, it is critical that robust standard errors are used; this also removes the dependence of the standard error on the scale of the weights. The usual weighted least squares standard errors are affected by the scale of the weights, which is one reason they should not be used.
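For reference, the ESS follows Kish's formula, (sum of w)^2 / (sum of w^2), computed within each treatment group; a quick sketch of its invariance to rescaling:

ess <- function(w) sum(w)^2 / sum(w^2)  # Kish's effective sample size

w0 <- W.out$weights[lalonde$treat == 0]
ess(w0)        # ESS for the control group
ess(100 * w0)  # identical after rescaling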

With entropy balancing and other optimization-based methods, the sum of the weights needs to be constrained in order to identify them. However, what that constraint is set to is arbitrary and, in fact, can differ across R packages. The way I implemented entropy balancing requires that the weights in each treatment group sum to the size of the treatment group. This, again, has nothing to do with the properties of the weights, but is an arbitrary factor. It has some intuitive value in that the weights have an average of 1, so units with weights close to 1 are largely unchanged, whereas units with weights far from 1 are highly influential, and this is true regardless of the size of the original sample.
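A quick way to see this constraint (a sketch, using the same covariates as above):

W.eb <- weightit(treat ~ age + educ + race + married + nodegree + re74 + re75,
                 data = lalonde, estimand = "ATT", method = "ebal")
tapply(W.eb$weights, lalonde$treat, sum)  # weights sum to each group's size
table(lalonde$treat)                      # group sizes for comparison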

For weights estimated using propensity scores, there are formulas that relate the propensity score to the weight, so there is no need to scale the weights to have a specific sum, and I don't in WeightIt. If you did scale the weights by multiplying all the weights by a single constant, this would not change any of the properties of the weights, including balance, ESS, or treatment effect estimate.
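For example, for the ATT the formulas are a weight of 1 for treated units and p / (1 - p) for control units, where p is the propensity score; a sketch against the W.out object from above:

ps <- W.out$ps
w_att <- ifelse(lalonde$treat == 1, 1, ps / (1 - ps))
all.equal(w_att, W.out$weights)  # should be TRUE: same weights WeightIt returns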

I mentioned earlier that the scaling of the weights can affect treatment effect estimates when the outcome model includes covariates. To prevent the dependence on this arbitrary quality of the weights, one should always include all treatment-covariate interactions in the outcome model (after centering the covariates) if one is to include covariates in the outcome model.
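A sketch of that advice with a single covariate (assuming re78 as the outcome and using the survey package for the robust standard errors; for the ATT, centering at the treated-group means makes the coefficient on treat the effect estimate):

library(survey)

d <- lalonde
d$age_c <- d$age - mean(d$age[d$treat == 1])  # center at treated-group mean

des <- svydesign(ids = ~1, weights = W.out$weights, data = d)
fit <- svyglm(re78 ~ treat * age_c, design = des)
summary(fit)  # the coefficient on treat estimates the ATT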

JoshSchramm94 commented 2 years ago

Thank you very much for your detailed response. It helped a lot in understanding the weights better. I assume the robust standard errors are achieved by using the survey package, as mentioned in your vignette.

As not all covariates are balanced and the variability of the weights is quite high with propensity score weighting, I planned to use entropy balancing. However, I was a little surprised that the effect got even stronger when using entropy balancing or propensity score weighting.

Once again, thank you very much and I appreciate your help.