py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

Binary treatment and Continuous outcome #780

Open dori21 opened 1 year ago

dori21 commented 1 year ago

In the case of a binary treatment (1 for the treatment group, 0 for the control group) and a continuous outcome:

CASE1 : discrete_treatment=True

from econml.dml import CausalForestDML
from sklearn.linear_model import Lasso, LogisticRegressionCV

est = CausalForestDML(criterion='het')

# set parameters for causal forest

est = CausalForestDML(criterion='het', random_state=1, discrete_treatment=True, honest=True, inference=True, cv=2, model_t=LogisticRegressionCV(), model_y=Lasso())

CASE2 : discrete_treatment=False

est = CausalForestDML(criterion='het', random_state=1, discrete_treatment=False, honest=True, inference=True, cv=2, model_t=Lasso(), model_y=Lasso())

Do CASE1 and CASE2 basically do the same thing? If not, which one is the better fit for this setting, CASE1 or CASE2?

I wonder whether discrete_treatment=True applies only to multiple treatments and not to a binary treatment.

kbattocchi commented 1 year ago

You should set discrete_treatment=True in this case, though it may not matter much in practice. When a discrete treatment is specified, we one-hot-encode the treatment and then drop the first column, which makes no difference here because your treatment values are already 0 and 1 (but would matter if they were 'a' and 'b' or something). We also call the predict_proba method of the T model that you specify, rather than predict, when computing our first-stage residuals. This should generally result in slightly better final-stage models, because otherwise the treatment residuals T - T_pred would be limited to the discrete set {-1, 0, 1} (from 0-1; 0-0 or 1-1; and 1-0, respectively), rather than using the finer-grained probabilities the classifier learned (which can result in residuals anywhere in the range [-1, 1]).
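
To make the two points above concrete, here is a small pure-Python sketch. It is not EconML's actual code: the helper one_hot_drop_first, the sorted category ordering, the 0.5 threshold, and the propensity values in p_hat are all assumptions made up for illustration.

```python
def one_hot_drop_first(values):
    """One-hot-encode `values`, then drop the column of the first category.

    Hypothetical helper; assumes categories are ordered by sorting the labels.
    """
    categories = sorted(set(values))
    kept = categories[1:]  # the first category acts as the baseline
    return [[1 if v == c else 0 for c in kept] for v in values]


# With labels 0/1 the single remaining column equals the original treatment,
# and string labels 'a'/'b' end up as the same 0/1 indicator.
encoded_numeric = one_hot_drop_first([0, 1, 1, 0])
encoded_strings = one_hot_drop_first(['a', 'b', 'b', 'a'])

# Made-up observed treatments and made-up propensity estimates P(T=1 | X),
# standing in for what a classifier's predict_proba might return.
t = [0, 1, 1, 0, 1, 0]
p_hat = [0.2, 0.7, 0.4, 0.6, 0.9, 0.5]

# Hard labels, standing in for a classifier's predict (0.5 threshold assumed).
t_pred = [1 if p > 0.5 else 0 for p in p_hat]

# Residuals from hard labels can only take the values -1, 0, or 1.
hard_residuals = [ti - pi for ti, pi in zip(t, t_pred)]

# Residuals from probabilities can fall anywhere in [-1, 1].
soft_residuals = [round(ti - pi, 2) for ti, pi in zip(t, p_hat)]

print(encoded_numeric, encoded_strings)
print(hard_residuals)
print(soft_residuals)
```

In this toy data, the third and fourth units are misclassified by the hard labels and get residuals of 1 and -1, while every correctly classified unit gets exactly 0. The probability-based residuals keep a graded signal for all six units.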