ForestDRLearner: binary outcome and discrete treatment (3 values)

py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.


I'm building a model with ForestDRLearner. For each client, I would like to find the treatment that minimizes the outcome, and end up with a table like this:

client, best_treatment
1, 0
2, 1
3, 2
4, 0
etc.

How can I build this final dataset from the code below? What is the best approach? This code is not quite what I need:

import numpy as np
import pandas as pd
from econml.dr import ForestDRLearner
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Features, binary outcome Y, and three-valued treatment T from the `sampling` DataFrame
X = sampling.drop(columns=['T', 'Y'])
Y = sampling['Y']
T = sampling['T']

X_train, X_test, T_train, T_test, Y_train, Y_test = train_test_split(X, T, Y, test_size=0.2, random_state=123)

model = ForestDRLearner(
    model_propensity=XGBClassifier(learning_rate=0.1, max_depth=3, objective="multi:softprob"),
    model_regression=XGBClassifier(learning_rate=0.1, max_depth=3, objective="binary:logistic"),
    discrete_outcome=True,
    random_state=1,
)

model.fit(Y=Y_train, T=T_train, X=X_train, inference="auto")

cate_estimates = model.effect(X_test)
cate_estimates

best_treatment = np.argmin(cate_estimates, axis=1)

results = pd.DataFrame({
    'best_treatment': best_treatment
})
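One likely reason the snippet above does not produce the desired table: assuming `effect(X_test)` is called with its defaults (`T0=0`, `T1=1`), it returns a single number per row (treatment 1 versus treatment 0), so there is no second axis for `np.argmin(..., axis=1)` to reduce over. The following is a NumPy-only sketch of that shape issue. The arrays are made-up stand-ins, not real model output.

```python
import numpy as np

# Hypothetical stand-in for model.effect(X_test): one effect per row (T=1 vs T=0).
effect_1_vs_0 = np.array([-0.10, 0.05, 0.02, -0.03])

try:
    np.argmin(effect_1_vs_0, axis=1)  # a 1-D array has no axis 1
    axis_error_raised = False
except Exception as exc:  # NumPy raises an axis-out-of-bounds error here
    axis_error_raised = True
    print(type(exc).__name__, exc)

# Hypothetical effect of T=2 vs T=0, placed next to T=1 vs T=0.
effect_2_vs_0 = np.array([-0.04, -0.08, 0.06, 0.01])
per_treatment = np.column_stack([effect_1_vs_0, effect_2_vs_0])
print(per_treatment.shape)  # one column per non-baseline treatment
```

Comparing treatments row by row needs a 2-D array like `per_treatment`, with one column for each treatment other than the baseline.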

Thank you for your answer. I discovered econml only recently and am not yet an expert. I am not sure what code I need to write for my problem, and I am open to other proposals. For now, ForestDRLearner seemed like a good fit: my outcome is binary and my treatment is discrete (it takes 3 values). Using the CATE, I would like to find the best treatment for each client. I started with the code above. Maybe this is not the right way to do it? My question is: how do I know which treatment is best?

client | best_treatment
client1 | 2
client2 | 1
client3 | 0
client4 | 2
etc.

I added this code:

cate_estimates_with_control = np.hstack([np.zeros((cate_estimates.shape[0], 1)), cate_estimates])

I don't know if this matches your suggestion.
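The step from per-treatment effects to the client table can be sketched as below. This is a minimal sketch, not a confirmed recipe from the maintainers. It assumes the effects arrive as a 2-D array with one column per non-baseline treatment, which is the shape `model.const_marginal_effect(X_test)` is expected to have for a three-valued discrete treatment (worth verifying with `.shape`). The helper name `best_treatment_table` and the numbers are hypothetical.

```python
import numpy as np
import pandas as pd

def best_treatment_table(effects_vs_control, client_ids):
    """Return one row per client with the treatment that minimizes the outcome.

    effects_vs_control: array of shape (n_clients, n_treatments - 1); column j
    holds the estimated effect of treatment j + 1 relative to treatment 0.
    """
    effects = np.asarray(effects_vs_control, dtype=float)
    if effects.ndim != 2:
        raise ValueError("expected a 2-D array of per-treatment effects")
    # Treatment 0 is the baseline, so its effect relative to itself is 0.
    with_control = np.hstack([np.zeros((effects.shape[0], 1)), effects])
    return pd.DataFrame({
        "client": list(client_ids),
        "best_treatment": np.argmin(with_control, axis=1),
    })

# Made-up effects for four clients (not real model output).
demo_effects = np.array([
    [0.02, -0.05],
    [-0.03, 0.01],
    [0.04, 0.02],
    [-0.01, -0.06],
])
results = best_treatment_table(demo_effects, ["client1", "client2", "client3", "client4"])
print(results)
```

Because the zero column stands for the baseline, treatment 0 wins whenever every other treatment has a positive estimated effect. The sketch uses point estimates only and ignores their uncertainty.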

ForestDRLearner: binary outcome and discrete treatment (3 values) #908