DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
I am using DROrthoForest to analyze CATE for different populations.
Since DROrthoForest does not support string categorical variables, I am converting them to integers to use as categorical variables.
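For reference, the encoding step looks roughly like the sketch below. It is plain Python so it runs without my data frame; the names `SEX_CODES`, `AGE_CODES` and `encode` are hypothetical, and only the labels and codes mentioned in this post are included.

```python
# Hypothetical label-to-integer maps (names and contents are assumptions).
SEX_CODES = {"male": 0, "female": 1}
AGE_CODES = {"10s": 1, "20s": 2}


def encode(rows):
    """Map (sex, age_group) string pairs to integer pairs.

    An unknown label raises KeyError instead of being silently
    assigned a new code.
    """
    return [(SEX_CODES[sex], AGE_CODES[age]) for sex, age in rows]


encoded = encode([("male", "10s"), ("female", "20s")])
print(encoded)
```

Keeping the maps explicit makes it easy to translate a row of integers back into the population it stands for.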
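# DR OrthoForest
import itertools
import numpy as np
from econml.orf import DROrthoForest
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

Y = np.ravel(df[["target_y"]])   # outcome
T = np.ravel(df[["treatment"]])  # treatment
W = df[["income", "month"]]      # controls
X = df[["sex", "age_group"]]     # integer-encoded features for heterogeneity
est = DROrthoForest(n_trees=100, max_depth=5, subsample_ratio=1,
                    propensity_model=GradientBoostingClassifier(),
                    model_Y=GradientBoostingRegressor())
est.fit(Y, T, X=X, W=W)
# Every (sex, age_group) pair: 2 sexes x 10 age-group codes
X_test = np.array(list(itertools.product([0, 1], range(10))))
X_test.shape  # (20, 2)
infer = est.effect_inference(X=X_test)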
I want to find the CATE for each sex/age_group combination; say there are 10 age groups. So I am testing with [male(0), 10s(1)], [male(0), 20s(2)] ... [female(1), 50s(4)]. However, I noticed that inference on combinations outside the ones in my data (e.g. [0, 6], [1, 10]) also worked, although the results were not statistically significant. If X was set when fitting, shouldn't inference only be available for the combinations that appeared in the input? Or am I doing something wrong?
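To make the question concrete, here is a minimal sketch of how the queried combinations could be split into those that appear in the training X and those that do not. The `observed_rows` list is made-up data standing in for the rows of X, and the variable names are hypothetical. It says nothing about what the estimator does internally.

```python
import itertools

# Made-up (sex, age_group) pairs standing in for the rows of X.
observed_rows = [(0, 1), (0, 2), (1, 1), (1, 4), (0, 2)]
observed = set(observed_rows)

# Same grid as X_test above: 2 sexes x 10 age-group codes.
candidates = list(itertools.product([0, 1], range(10)))

# Split the grid by whether the pair was seen in the training rows.
in_support = [c for c in candidates if c in observed]
out_of_support = [c for c in candidates if c not in observed]

print(len(in_support), len(out_of_support))
```

With a split like this I could at least restrict the rows passed to `effect_inference` to `in_support`, but I would still like to understand why the other rows return an estimate at all.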