py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

Propensity score using generalized random forests #463

Open klintmane1 opened 3 years ago

klintmane1 commented 3 years ago

Hi!

This is more of a question than an issue. How would I use the generalized random forests to obtain propensity scores for my sample? Are there any examples or references?

vsyrgkanis commented 3 years ago

Unfortunately, we've only implemented a regression forest in our grf module and not a classification forest. So currently there is no way to use a grf classifier for the propensity score.

However, many other forest classifiers would most probably perform as well as a grf classifier in practice, if not better. For instance, xgboost, lightgbm, or an sklearn RandomForestClassifier or GradientBoostingClassifier, with hyperparameters tuned by cross-validation, would be equally good and potentially better.

See for instance our notebook here on how to do this: https://github.com/microsoft/EconML/blob/master/notebooks/ForestLearners%20Basic%20Example.ipynb
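
As a rough illustration of the suggestion above (not taken from the notebook), here is a minimal sketch using only scikit-learn. It assumes a binary treatment; the synthetic data, the hyperparameter grid, and the clipping bounds are placeholders chosen for the example, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_predict

# Synthetic stand-in data (assumption): X holds covariates, T is a binary treatment.
X, T = make_classification(
    n_samples=500, n_features=5, n_informative=3, random_state=0
)

# Tune the forest classifier with cross-validation, scoring by log loss
# so that the predicted probabilities (not just the labels) are evaluated.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    param_grid={"max_depth": [3, 5], "min_samples_leaf": [10, 50]},
    scoring="neg_log_loss",
    cv=3,
)
search.fit(X, T)

# Out-of-fold predicted probabilities of treatment: each sample's score
# comes from a model that was not fitted on that sample.
propensity = cross_val_predict(
    search.best_estimator_, X, T, cv=5, method="predict_proba"
)[:, 1]

# Clip away from 0 and 1 (illustrative bounds) to keep inverse weights finite.
propensity = np.clip(propensity, 0.01, 0.99)
```

The out-of-fold step is a design choice in this sketch: in-sample forest probabilities tend to be overconfident, which matters if the scores are later used for weighting or matching. Swapping in GradientBoostingClassifier, xgboost, or lightgbm only changes the estimator and the grid.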