matheusfacure / python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
https://matheusfacure.github.io/python-causality-handbook/landing-page.html
MIT License

24 - The Difference-in-Differences Saga - Confidence intervals and p-values #222

Open dbalabka opened 2 years ago

dbalabka commented 2 years ago

Firstly, thanks for a great book! It offers a very intuitive way to understand causal inference.

It would be very useful to explain how to calculate statistical significance (p-values and confidence intervals) for multiple treatment coefficients from the OLS model.

dbalabka commented 2 years ago

@matheusfacure I would like to clarify the problem. Consider the following OLS model:

formula = f"""installs ~ treat:C(cohort):C(date) + C(unit) + C(date)"""
...
twfe_model = smf.ols(formula, data=df_heter_str).fit()

This gives multiple treatment coefficients, one for each cohort on each date (treat:C(cohort):C(date)). So the question is whether there is a way to calculate a single aggregated treatment coefficient with a p-value and a confidence interval. Unfortunately, this is not as straightforward as what is described in https://matheusfacure.github.io/python-causality-handbook/05-The-Unreasonable-Effectiveness-of-Linear-Regression.html
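For illustration, here is a minimal sketch of one possible way to aggregate the cell-level coefficients analytically. It is not taken from the book. It assumes a synthetic panel (the columns `y`, `unit`, `date`, `cohort`, `treat` and all effect sizes are made up), weights each (cohort, date) coefficient by its number of treated observations, and tests the resulting linear combination with the `t_test` method of the statsmodels results object. The weights are treated as fixed and the standard errors are the default non-robust ones, so treat the output as a rough guide only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical synthetic panel: 30 units, 10 dates, staggered adoption.
rng = np.random.default_rng(0)
rows = []
for unit in range(30):
    # Units 0-9 start treatment at date 4, units 10-19 at date 7,
    # the remaining units are never treated (cohort coded as 99).
    cohort = 4 if unit < 10 else 7 if unit < 20 else 99
    effect = {4: 2.0, 7: 1.0, 99: 0.0}[cohort]  # assumed true effects
    unit_fe = rng.normal()
    for date in range(10):
        treat = int(date >= cohort)
        y = unit_fe + 0.5 * date + effect * treat + rng.normal(scale=0.5)
        rows.append(dict(unit=unit, date=date, cohort=cohort, treat=treat, y=y))
df = pd.DataFrame(rows)

# Same structure as the formula in this thread, with `y` as the outcome.
model = smf.ols("y ~ treat:C(cohort):C(date) + C(unit) + C(date)", data=df).fit()

names = model.model.exog_names
X = model.model.exog
is_treat = np.array(["treat" in name for name in names])

# Each treat column equals 1 on the treated rows of one (cohort, date) cell,
# so its column sum is the number of treated observations in that cell.
# Cells that are never treated have an all-zero column and get zero weight.
cell_sizes = X[:, is_treat].sum(axis=0)
weights = np.zeros(len(names))
weights[is_treat] = cell_sizes / cell_sizes.sum()

# t-test of the weighted average of the treatment coefficients against zero.
agg = model.t_test(weights)
att = float(np.squeeze(agg.effect))
se = float(np.squeeze(agg.sd))
p_value = float(np.squeeze(agg.pvalue))
ci_low, ci_high = np.squeeze(agg.conf_int(alpha=0.05))

print(f"aggregated effect = {att:.3f} (se = {se:.3f}, p = {p_value:.3g})")
print(f"95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With panel data the errors are usually correlated within a unit, so in practice it may be safer to fit with `cov_type="cluster"` and `cov_kwds={"groups": df["unit"]}` before calling `t_test`. Other weighting schemes (for example, equal weight per cohort) only require changing the `weights` vector.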

The initial idea was to run predictions on multiple bootstrapped datasets and build distributions of the true and predicted effects.
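A rough sketch of what such a bootstrap could look like, under assumptions that are not in the thread: the panel is synthetic (hypothetical columns `y`, `unit`, `date`, `cohort`, `treat`), whole units are resampled with replacement so that each unit's time series stays intact, and the aggregated effect is the average gap between the observed outcome and the model's prediction with `treat` set to 0 over the treated rows. The helper names `make_panel`, `estimate_att`, and `bootstrap_att` are made up for this example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

FORMULA = "y ~ treat:C(cohort):C(date) + C(unit) + C(date)"

def make_panel(seed=0):
    """Hypothetical staggered-adoption panel (30 units, 10 dates)."""
    rng = np.random.default_rng(seed)
    rows = []
    for unit in range(30):
        cohort = 4 if unit < 10 else 7 if unit < 20 else 99  # 99 = never treated
        effect = {4: 2.0, 7: 1.0, 99: 0.0}[cohort]  # assumed true effects
        unit_fe = rng.normal()
        for date in range(10):
            treat = int(date >= cohort)
            y = unit_fe + 0.5 * date + effect * treat + rng.normal(scale=0.5)
            rows.append(dict(unit=unit, date=date, cohort=cohort, treat=treat, y=y))
    return pd.DataFrame(rows)

def estimate_att(df):
    """Mean of (observed outcome - predicted untreated outcome) on treated rows."""
    model = smf.ols(FORMULA, data=df).fit()
    treated = df[df["treat"] == 1]
    y0_hat = model.predict(treated.assign(treat=0))
    return float((treated["y"] - y0_hat).mean())

def bootstrap_att(df, n_boot=100, seed=1):
    """Resample whole units with replacement and re-estimate the effect."""
    rng = np.random.default_rng(seed)
    by_unit = {u: g for u, g in df.groupby("unit")}
    units = np.array(list(by_unit))
    draws = []
    for _ in range(n_boot):
        sampled = rng.choice(units, size=len(units), replace=True)
        # Relabel the sampled units so that duplicates get their own fixed effect.
        boot = pd.concat(
            [by_unit[u].assign(unit=i) for i, u in enumerate(sampled)],
            ignore_index=True,
        )
        draws.append(estimate_att(boot))
    return np.array(draws)

df = make_panel()
att = estimate_att(df)
draws = bootstrap_att(df)
ci_low, ci_high = np.percentile(draws, [2.5, 97.5])  # percentile interval
print(f"ATT = {att:.3f}, 95% bootstrap CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling units rather than individual rows is a design choice meant to preserve within-unit correlation. The number of replications is kept small here for speed, and a real analysis would likely use many more.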

timothymeehan commented 2 years ago

@dbalabka @matheusfacure I came to suggest/ask about the exact same issue, so I'd like to second this one!

dbalabka commented 2 years ago

This might be useful: I've found another mention of the TWFE coefficient interpretation issue.

The coefficient that comes from the two-way fixed effects (TWFE) estimator when there are more than two units and periods is not an easily interpretable parameter in the same manner. Numerous papers have now documented that this coefficient is in fact a weighted average of many different treatment effects, and that these weights are often negative and non-intuitive.

https://andrewcbaker.netlify.app/2019/09/25/difference-in-differences-methodology/#:~:text=The%20coefficient%20that,and%20non-intuitive.

dbalabka commented 2 years ago

Here is a paper that tries to address the TWFE coefficients interpretation issue: https://www.nber.org/system/files/working_papers/w29976/w29976.pdf

I might be wrong, but it seems that the interpretation of TWFE coefficients is still being actively studied and there is no established solution yet.

matheusfacure commented 2 years ago

Very interesting discussion. I don't know if I can commit to a fix, as the issue is still being actively discussed. I just uploaded an appendix that leverages conformal inference for time series models, following what is done in this paper: https://arxiv.org/abs/1712.09089. The appendix only discusses it for the synthetic control (SC) case, but the authors also apply it to DiD.