grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

multi-arm causal forest: different X variables for different treatments (feature selection) #1296

Open robertmilo737 opened 1 year ago

robertmilo737 commented 1 year ago

Hi everyone, I am trying to run a causal forest on an experiment with 10 different treatments. If I were running 9 separate causal forests, I would implement feature selection on every forest by running each forest twice: first with all available features, and then only with those that have high variable importance. In general, I would not expect the important features that predict treatment effects to be the same for each treatment.

However, when it comes to the multi-arm causal forest, I don't see how I could feed different features for different treatments. Is it possible to do so?

If not, would it be better to have 9 separate causal forests? What would be the disadvantage of doing so?

Thanks.

Robert.

erikcs commented 1 year ago

Hi Robert, the idea behind the multi-arm causal forest (MCF) is to deliver an efficiency gain in settings where it’s reasonable to believe that the treatment effects across arms are correlated (or, potentially, across outcomes, in which case you could use it with a matrix Y). That is why the same X’s are used for every arm. If this is not the case (which is up to the analyst’s judgment), then it’s perfectly reasonable to fit multiple causal forests. (There could also be a purely computational gain from fitting a single MCF vs. fitting several CFs.)
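
For the separate-forests route, the per-arm selection step described in the question can be sketched in a few lines. This is only an illustration in plain Python, not grf code: the feature names, the importance scores, and the "above the mean" cutoff are all made-up assumptions. In grf the scores would come from calling `variable_importance` on each first-pass forest, and the second pass would refit each forest on its own selected columns.

```python
def select_features(importance, rel_threshold=1.0):
    """Keep features whose score exceeds rel_threshold times the mean score.

    `importance` maps feature name -> importance score for one arm.
    The cutoff rule is an illustrative assumption, not a grf default.
    """
    mean_score = sum(importance.values()) / len(importance)
    cutoff = rel_threshold * mean_score
    return sorted(name for name, score in importance.items() if score > cutoff)


# Hypothetical first-pass importance scores, one dict per treatment arm.
first_pass = {
    "arm_1": {"age": 0.50, "income": 0.40, "region": 0.05, "tenure": 0.05},
    "arm_2": {"age": 0.02, "income": 0.08, "region": 0.60, "tenure": 0.30},
}

# Each arm gets its own feature set for the second-pass forest.
selected = {arm: select_features(scores) for arm, scores in first_pass.items()}

for arm, features in selected.items():
    print(arm, features)
```

With these made-up scores the two arms end up with disjoint feature sets, which is the situation where separate forests allow something a single MCF does not.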