forestry-labs / causalToolbox

GNU General Public License v3.0

How to examine propensity scores from X-Learner #6

Open carlyls opened 2 years ago

carlyls commented 2 years ago

I really appreciate this package and your work on these meta-learners for treatment effect heterogeneity. I wanted to look more closely at the propensity scores estimated by the X-Learner when it uses a random forest. I know the scores are used as weights in the final CATE combination, but I would also like to see their distribution. I was also wondering whether I can pre-specify propensity scores through the X_RF function, or whether the estimation step is always part of the method. Thank you!
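For readers following along, the combination step mentioned above can be written out as below. This is a sketch in the notation of the X-learner paper, not something taken from this package's source: $\hat{\tau}_0$ and $\hat{\tau}_1$ stand for the CATE estimates fitted on the control and treated groups, $W$ is the treatment indicator, and $g$ is a weight function for which an estimated propensity score $\hat{e}(x)$ is one common choice. Whether X_RF allows $g$ to be supplied directly is the open question in this issue.

```latex
\hat{\tau}(x) = g(x)\,\hat{\tau}_0(x) + \bigl(1 - g(x)\bigr)\,\hat{\tau}_1(x),
\qquad g(x) = \hat{e}(x) \approx \Pr(W = 1 \mid X = x)
```

Under this form, propensity estimates near 0 or 1 put almost all of the weight on one of the two estimators, which is one reason to inspect their distribution.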

ee-jackson commented 6 months ago

@theo-s @soerenkuenzel

My colleague and I have a similar question, so I am adding it to this issue.

As we understand it, the X-Learner can use one set of covariates to predict the propensity score (with a random forest) and a different set to predict the outcome variable, and thus the CATE. Within the X_RF() function, we can choose the covariates for the propensity score model by setting relevant.Variables in the e.forestry argument.

But when we then call the EstimateCate() function, we get the following error:

Error in testing_data_checker(object, newdata, object@hasNas) :
  newdata has 15 but the forest was trained with 4 columns.

How would you suggest we work around this? Thank you for all your work on this package; we would be very grateful for a response.