DoWhy Logistic Regression with Stats Api

py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

MIT License

Dear authors,

I am using dowhy for a project, and it is a GREAT tool!

Basically, I was comparing the results obtained with the backdoor method using logistic regression through the statsmodels API, as you suggested, against a method I wrote from scratch with scikit-learn. The results were very different, and mine seemed more plausible. Moreover, if I am not mistaken, the result should match the S-Learner with logistic regression. Mine did, while the statsmodels-based estimate was very different.

I think there could be an issue with the GLM method: when you call .predict on a statsmodels GLM, you do not obtain the class prediction (i.e., 0 or 1) but the probability, whereas in scikit-learn .predict returns the class prediction directly:

model_sklearn.predict_proba(X)[:,1] == model_statsmodel.predict(X)
model_sklearn.predict(X) == (model_statsmodel.predict(X)>0.5).astype(int)

So, is it true that you are actually using .predict, which returns the probabilities? If so, why are you using the probabilities rather than the class predictions to compute the ATE?

Thank you very much in advance!

For most cases, probabilities are the correct output to use when computing the causal effect on a binary outcome. The expression is E[Y|do(T=1)] - E[Y|do(T=0)] = P(Y=1|do(T=1)) - P(Y=1|do(T=0)), so it makes sense to use the probabilities.
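
The step from expectations to probabilities follows from Y taking only the values 0 and 1. Written out as a short derivation (the `align*` environment assumes the amsmath package):

```latex
\begin{align*}
E[Y \mid do(T=t)]
  &= \sum_{y \in \{0,1\}} y \, P(Y=y \mid do(T=t)) \\
  &= 0 \cdot P(Y=0 \mid do(T=t)) + 1 \cdot P(Y=1 \mid do(T=t)) \\
  &= P(Y=1 \mid do(T=t)),
\end{align*}
and therefore
\[
E[Y \mid do(T=1)] - E[Y \mid do(T=0)]
  = P(Y=1 \mid do(T=1)) - P(Y=1 \mid do(T=0)).
\]
```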

To see an extreme example, consider the case where T and Y are both binary and there are no confounders. The true generating equation for Y is y = Bernoulli(sigmoid(t*beta + N(0, 0.01))), and beta is 0, so the causal effect of T on Y is zero.

Using logistic regression with the score/probability as the output, the estimated P(Y=1|T=1) and P(Y=1|T=0) will be nearly the same and the causal estimate will be close to zero.
Using the 0/1 class as the output, the causal estimate can be 1, which is incorrect. This happens whenever one of the estimated P(Y=1|T=1) and P(Y=1|T=0) is below 0.5 and the other is above 0.5. For example, all inputs with T=1 would be predicted as 1 and all inputs with T=0 would be predicted as 0.
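
The example above can be simulated with a short sketch using only the standard library. It makes two assumptions that are not stated in the thread: 0.01 is read as the variance of the noise (standard deviation 0.1), and the fitted probabilities are taken to be the observed outcome rate in each treatment group, which is what a logistic regression with an intercept and a single binary regressor reproduces at its maximum-likelihood fit. The function name `simulate_once` is hypothetical.

```python
import math
import random


def simulate_once(rng, n=2000, beta=0.0):
    """Return (score_based_ate, class_based_ate) for one simulated dataset."""
    sums = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for _ in range(n):
        t = rng.randint(0, 1)
        # y ~ Bernoulli(sigmoid(t*beta + noise)), noise sd 0.1 (assumed)
        p = 1.0 / (1.0 + math.exp(-(beta * t + rng.gauss(0.0, 0.1))))
        y = 1 if rng.random() < p else 0
        sums[t] += y
        counts[t] += 1
    # Estimated P(Y=1|T=1) and P(Y=1|T=0): the per-group outcome rates.
    p1 = sums[1] / counts[1]
    p0 = sums[0] / counts[0]
    score_ate = p1 - p0                        # difference of probabilities
    class_ate = int(p1 > 0.5) - int(p0 > 0.5)  # difference of 0/1 predictions
    return score_ate, class_ate


rng = random.Random(0)
results = [simulate_once(rng) for _ in range(200)]
mean_abs_score_ate = sum(abs(s) for s, _ in results) / len(results)
frac_class_nonzero = sum(c != 0 for _, c in results) / len(results)

print("mean |score-based ATE|:", mean_abs_score_ate)
print("fraction of runs with class-based ATE of +1 or -1:", frac_class_nonzero)
```

Because both estimated probabilities hover around 0.5, the score-based estimate stays close to zero, while the class-based estimate jumps to +1 or -1 whenever the two rates happen to fall on opposite sides of 0.5.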

Still, it can be useful to have the flexibility to output the class prediction directly, e.g., for comparison with a default logistic metalearner. I've opened PR #386, which adds an argument predict_score to the GLM estimator. It can be specified in method_params of estimate_effect and is True by default.

DoWhy Logistic Regression with Stats Api #296