ngreifer / WeightIt

WeightIt: an R package for propensity score weighting
https://ngreifer.github.io/WeightIt/

Back to those multinomial propensity scores... (checking overlap/positivity assumption?) #13

Closed · hgerlovin closed this issue 3 years ago

hgerlovin commented 4 years ago

Hi @ngreifer - I understand your hesitation to add this (the ps object) to the output. I'm just wondering whether you might have some suggestions about best practices for checking "overlap" with your package? The twang approach is to use the ps object, despite these individual scores not summing to 1.

Also, I'm sure you've already looked at the GBM objects, but I haven't been able to find a reference for all the elements I see in the output "obj". The gbm documentation only describes some of the values; "estimator" is not described, but it looks promising. I haven't had much time to dig down that rabbit hole.

Thanks so much! Hanna

Hanna Gerlovin, PhD Biostatistician Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC) Veterans Affairs, Boston hanna.gerlovin@va.gov

ngreifer commented 4 years ago

Hi Hanna,

I don't think you need to check overlap after you've estimated weights. As I understand it, the point of checking overlap is to make sure that it's possible to achieve balance on the covariates. If there is a lack of overlap, then balance may not be achievable, or if it is, it comes with a low-precision effect estimate. If you have achieved balance, you are not extrapolating. Good weighting methods will downweight units outside the range of common support unless those units are required to achieve balance, in which case they should not be discarded. Balance doesn't just mean balance on the means, though, and distributions lacking overlap will display high imbalance on some measure.

Because of this balance-based focus, I don't think it's at all necessary to check for overlap on the propensity score. This is especially true given that the propensity score is an arbitrary combination of the covariates designed to yield balance, not to explain selection into treatment or represent the true selection probabilities. I disagree with authors who take the estimated propensity scores too seriously and use them for diagnostics, when the only relevant diagnostics are covariate balance (measured broadly) and the effective sample size. Balance or overlap on the propensity score is incidental and doesn't tell you much about the quality of your effect estimate.

Basically, I think the best practice is not to check overlap on the propensity score, so I'm not inclined to add additional tools to facilitate it.
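If it helps, the diagnostics I do think matter are easy to get with cobalt. Here is a minimal sketch using hypothetical names (d, A, X1, and X2 are placeholders, not from your analysis):

library("WeightIt")
library("cobalt")

# Hypothetical multi-category example: factor treatment A, covariates X1 and X2
W <- weightit(A ~ X1 + X2, data = d, method = "gbm", estimand = "ATE")

# Covariate balance "measured broadly": mean differences and KS statistics,
# before (un = TRUE) and after weighting
bal.tab(W, stats = c("mean.diffs", "ks.statistics"), un = TRUE)

# Effective sample sizes and a summary of the weights
summary(W)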

One thing you could do is get the multinomial propensity scores from the "obj" component containing the GBM fit by using gbm::predict.gbm() on it. You would supply the number of trees corresponding to the best tree, which is stored in the "info" component of the weightit output object. You could then do what you want with those, though it's not clear to me what that would be. For each category, you could see whether the propensity for that category is evenly distributed across all treatment groups. You could use bal.plot() in cobalt to do that, supplying the predicted probability of that category as the variable on which to display balance.

To get the predicted probabilities, you need to call weightit() with keep.data = TRUE as an argument (this is passed through to gbm.fit(), so it's not documented in weightit()). From there, you would compute the propensity scores as:

ps <- drop(predict(W$obj, n.trees = W$info$best.tree, type = "response"))

This would be a matrix with one row per unit and one column per treatment level, with each entry giving the predicted probability that the unit receives that treatment level. You could then supply that to bal.plot():

bal.plot(treat ~ ps, data = data, var.name = "A")

where "A" would be replaced by each level of the treatment.

About the gbm documentation, it is indeed lacking. The project was abandoned, so it no longer gets updates and a lot was left incomplete. In fact, the authors actually recommend against using it for multiclass prediction (i.e., to estimate multinomial propensity scores). I'm not sure why, but it seems the algorithm may be unstable; that is a bit beyond me, though. Until I run into problems with it in WeightIt I'll probably keep it, though I have been looking for substitutes. Because the multinomial capability in gbm is underdeveloped, they probably didn't bother to write thorough documentation for it.

I took a look at the code inside gbm.fit() that generates the "estimator" output. Each value is equal to

\frac{e^{\hat p(A_i=g)}}{\sum_{j=1}^N e^{\hat p(A_j = g)}}

I have no idea what this means though, and I would ignore it. It has nothing to do with estimating treatment effects. It's just a transformation of the predicted probabilities for the final tree.
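For what it's worth, that expression is easy to reproduce from the predicted probabilities if you're curious (a sketch, assuming ps is the unit-by-level probability matrix from above):

# For each treatment level g (a column of ps), exponentiate each unit's
# predicted probability and divide by the sum of those exponentiated values
# over all units; each column of the result sums to 1
estimator_like <- apply(ps, 2, function(p) exp(p) / sum(exp(p)))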

Noah