matheusfacure / python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
https://matheusfacure.github.io/python-causality-handbook/landing-page.html
MIT License

Chapter 21 - T-learner propensity score estimation #218

Closed erichan90 closed 2 years ago

erichan90 commented 2 years ago

There is an issue in Chapter 21, in the following code snippet:

# fit the propensity score model
ps_m = LogisticRegression(solver="lbfgs", penalty='none')
ps_m.fit(train[X], train[y])  # This fits the logistic model for the outcome
ps_score = ps_m.predict_proba(train[X])

Should it be:

# fit the propensity score model
ps_m = LogisticRegression(solver="lbfgs", penalty='none')
ps_m.fit(train[X], train[T])  # This fits the logistic model for the treatment
ps_score = ps_m.predict_proba(train[X])
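For reference, a minimal self-contained sketch of the corrected step is below. The column names and the toy data are illustrative only (not the book's dataset); the point is that the propensity score model is fit on the treatment indicator, and that `predict_proba` returns P(T=1 | X) in its second column.

```python
# Minimal sketch of the corrected propensity score step.
# Columns and toy data here are hypothetical, not from the book.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
train = pd.DataFrame({
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
})
# Treatment assignment depends on the covariates, so the PS model has something to learn.
train["T"] = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * train["x1"] - 0.3 * train["x2"]))))

X = ["x1", "x2"]

# Fit the propensity score model on the *treatment*, not the outcome.
# (Older scikit-learn versions use penalty='none', as in the book's snippet;
# newer versions take penalty=None.)
ps_m = LogisticRegression(solver="lbfgs", penalty=None)
ps_m.fit(train[X], train["T"])

# P(T=1 | X): keep only the second column of predict_proba.
ps_score = ps_m.predict_proba(train[X])[:, 1]
```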

matheusfacure commented 2 years ago

Nice catch! I fixed it, but the result was terrible. It turns out there is a huge positivity issue in the dataset, which yields massive sample weights. I tried trimming at 0.05, but with no success. In the end, I just removed the PS entirely.
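As an illustration of what trimming at 0.05 typically means here, a sketch is shown below. It assumes the `train` DataFrame and `ps_score` from the earlier snippet, and it shows the general technique (not the author's exact code): extreme propensity scores blow up the inverse-propensity weights, so units with scores outside [0.05, 0.95] are either dropped or the scores are clipped to that range.

```python
# Sketch of propensity score trimming / clipping under positivity problems.
# Assumes `train` (with treatment column "T") and `ps_score` from the previous sketch.
import numpy as np

# Inverse-propensity weights: treated get 1/e(x), controls get 1/(1 - e(x)).
# With poor overlap, e(x) near 0 or 1 makes these weights explode.
weights = np.where(train["T"] == 1, 1 / ps_score, 1 / (1 - ps_score))

# Option 1: drop units with extreme propensity scores.
keep = (ps_score > 0.05) & (ps_score < 0.95)
trimmed = train[keep]

# Option 2: clip the scores before computing the weights.
ps_clipped = np.clip(ps_score, 0.05, 0.95)
weights_clipped = np.where(train["T"] == 1, 1 / ps_clipped, 1 / (1 - ps_clipped))
```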