uber / causalml

Uplift modeling and causal inference with machine learning algorithms

Adding Interaction Tree and Conditional Interaction Tree Criterion #530

Open jroessler opened 2 years ago

jroessler commented 2 years ago

Is your feature request related to a problem? Please describe.

The Interaction Tree (IT) (Su et al. 2009) and the Conditional Interaction Tree (CIT) (Su et al. 2012) were proposed in the heterogeneous treatment effect literature (in 2009 and 2012, respectively), yet I'm not aware of any Python implementation. Both methods follow the CART convention (Breiman et al. 1984) for constructing a tree. CIT has also been proposed in combination with random forests.

Describe the solution you'd like

Implement both algorithms. Both follow the same three-step procedure: (1) grow a large initial tree; (2) prune it; and (3) use a validation method to determine the best tree size. Note that while your framework makes (1) easy to implement, (2) and (3) will be hard, because the pruning techniques for IT and CIT differ substantially from those provided by causalml.

Describe alternatives you've considered

I'm wondering whether we really need (2) and (3), since causalml already provides other ways to control tree size (i.e., hyperparameters such as max_depth and n_estimators, plus causalml's own pruning method). I can easily implement two new evaluation functions, one for IT and one for CIT. While causalml couldn't claim to implement IT and CIT exactly as published, it could at least offer two more splitting criteria based on their ideas. Let me know what you think.
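To make the idea of an IT-style evaluation function concrete, here is a minimal sketch of the splitting statistic as commonly described for Interaction Trees: the squared t-statistic for the interaction between treatment and a candidate split, using a variance pooled over the four (child node, treatment arm) cells. The function name `it_split_statistic` and its arguments are hypothetical and are not part of causalml's API. How such a function would plug into causalml's tree builder is not shown. The zero return values for degenerate splits are an assumption of this sketch, not something taken from the paper.

```python
import numpy as np


def it_split_statistic(y, treatment, go_left):
    """Squared t-statistic for the treatment-by-split interaction.

    Illustrative only: `y` is a continuous outcome, `treatment` a 0/1
    indicator, and `go_left` a 0/1 indicator of the candidate split.
    """
    y = np.asarray(y, dtype=float)
    treatment = np.asarray(treatment, dtype=bool)
    go_left = np.asarray(go_left, dtype=bool)

    # Four cells: (left, treated), (left, control),
    #             (right, treated), (right, control).
    cells = [
        y[go_left & treatment],
        y[go_left & ~treatment],
        y[~go_left & treatment],
        y[~go_left & ~treatment],
    ]
    sizes = np.array([len(c) for c in cells])

    # Assumption of this sketch: a split is scored 0 (i.e. rejected) if any
    # cell has fewer than two observations, since no variance can be estimated.
    if np.any(sizes < 2):
        return 0.0

    means = np.array([c.mean() for c in cells])

    # Pooled within-cell variance with n - 4 degrees of freedom.
    sse = sum(((c - m) ** 2).sum() for c, m in zip(cells, means))
    sigma2 = sse / (sizes.sum() - 4)
    if sigma2 <= 0:
        # Degenerate case (no within-cell variation); also scored 0 here.
        return 0.0

    # Difference between the treatment effects in the two child nodes.
    effect_diff = (means[0] - means[1]) - (means[2] - means[3])
    return float(effect_diff ** 2 / (sigma2 * (1.0 / sizes).sum()))
```

A tree builder would evaluate this statistic for every candidate split at a node and keep the split with the largest value. Pruning and tree-size selection, steps (2) and (3) above, are deliberately left out.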

Additional context

Su, X., Tsai, C. L., Wang, H., Nickerson, D. M., & Li, B. (2009). Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10(2).

Su, X., Kang, J., Fan, J., Levine, R. A., & Yan, X. (2012). Facilitating score and causal inference trees for large observational studies. Journal of Machine Learning Research, 13, 2955.

jeongyoonlee commented 1 year ago

I'm not personally familiar with IT or CIT, but in principle I'd support adding them to causalml as evaluation functions. We can note in the documentation that the implementation does not follow the original papers exactly. It'd be great if you could contribute.