NSAPH-Software / CausalGPS

Matching on generalized propensity scores with continuous exposures
https://NSAPH-Software.github.io/CausalGPS/

enable trimming by GPS in addition to or instead of trimming by exposure #202

Closed m-qin closed 1 year ago

m-qin commented 1 year ago

This is something I'm trying out in my ADRD project with Dan and Naeem. Trimming by GPS is better justified in the literature than trimming by exposure (the two may sometimes be correlated, but they are not the same). I've written my own code to do this, but it would be nice to have it as an option in CausalGPS in the future.

Edit: Liu et al. (2019), in their meta-analysis of the association of PM2.5 and PM10 with mortality, trimmed exposure at the 5th and 95th percentiles (though that is a very different kind of study from one using causal inference with a continuous treatment, which is what we're discussing here). So I guess trimming by exposure can be justified as limiting the inferential population (causal or associational), as Dan said in the BaseCamp thread on the CausalGPS package. I suppose trimming by GPS directly targets the positivity assumption in causal inference, though limiting the inferential population by trimming exposure is likely also motivated by positivity concerns (i.e., concern over whether observations with extreme exposure or GPS values are comparable to the rest).
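To make the distinction concrete, here is a minimal Python sketch (not CausalGPS code, and not the code mentioned above). It simulates one confounder and a continuous exposure, estimates a GPS under an assumed normal linear model, and compares trimming by exposure quantiles with trimming by GPS quantiles. The helper `quantile_mask`, the simulated data, and the 5th/95th percentile cutoffs are all assumptions chosen for illustration.

```python
import numpy as np

def quantile_mask(values, q_low, q_high):
    """Boolean mask keeping values inside the [q_low, q_high] quantile range."""
    lo, hi = np.quantile(values, [q_low, q_high])
    return (values >= lo) & (values <= hi)

# Simulated data (illustrative only): one confounder x, continuous exposure w.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
w = 0.8 * x + rng.normal(size=n)

# GPS under an ASSUMED normal linear model w | x ~ N(a + b*x, sigma^2),
# fitted by least squares. Real analyses may use a different GPS model.
slope, intercept = np.polyfit(x, w, 1)
resid = w - (slope * x + intercept)
sigma = resid.std(ddof=2)
gps = np.exp(-0.5 * (resid / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Trim by exposure, by GPS, or by both.
keep_exposure = quantile_mask(w, 0.05, 0.95)
keep_gps = quantile_mask(gps, 0.05, 0.95)
keep_both = keep_exposure & keep_gps

print("kept by exposure:", keep_exposure.sum())
print("kept by GPS:     ", keep_gps.sum())
print("kept by both:    ", keep_both.sum())
```

The two masks generally drop different observations: exposure trimming removes units with extreme `w`, while GPS trimming removes units whose exposure is unusual (or unusually typical) given their covariates.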

Naeemkh commented 1 year ago

@wxwx1993, do you have any comments to add to this issue?

daniellebraun commented 1 year ago

I agree with @m-qin.

wxwx1993 commented 1 year ago

I agree.

Naeemkh commented 1 year ago

Should we keep both of them? Trimming by exposure level can also be motivated by the positivity assumption. But, as @m-qin mentioned, trimming by GPS is better justified. Another option is to leave trimming of exposure levels to the user and have the package trim data based on GPS values only.

wxwx1993 commented 1 year ago

Keep both. For now, our algorithm already allows leaving the exposure untrimmed, right?

Naeemkh commented 1 year ago

Yes. If we pass (0, 1) as the trimming quantiles, it uses the entire dataset.
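As a quick illustration of why (0, 1) amounts to no trimming, here is a small Python sketch. The function `trim_by_quantiles` and its `trim_quantiles` argument are hypothetical and are not the CausalGPS interface. The 0th and 100th percentiles are the minimum and maximum, so every observation is kept.

```python
import numpy as np

def trim_by_quantiles(values, trim_quantiles=(0.0, 1.0)):
    """Keep values between the given lower and upper quantiles (inclusive).

    Hypothetical helper for illustration; not part of CausalGPS.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.quantile(values, trim_quantiles)
    return values[(values >= lo) & (values <= hi)]

# Made-up exposure values with one extreme observation.
exposure = np.array([2.0, 3.5, 5.0, 7.5, 9.0, 12.0, 40.0])

untrimmed = trim_by_quantiles(exposure, (0.0, 1.0))   # bounds are min and max
trimmed = trim_by_quantiles(exposure, (0.05, 0.95))   # drops both extremes

print(len(untrimmed), len(trimmed))
```

The same helper could be applied to a vector of GPS values instead of exposure values, which is the option requested in this issue.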