bcallaway11 / did

Difference in Differences with Multiple Periods, website: https://bcallaway11.github.io/did

Extract confidence intervals (or table) from att_gt or aggte #170

Open rcragun opened 1 year ago

rcragun commented 1 year ago

Problem

I do not see any way to extract the confidence intervals/bands from objects created by att_gt or aggte. More generally, it would help to be able to extract the entire table of ATTs, SEs, and confidence interval limits.

For example, I might get the following as part of the output of aggte():

Overall summary of ATT's based on event-study/dynamic aggregation:  
       ATT    Std. Error     [ 95%  Conf. Int.]
 -4116.697      518.2209  -5132.392   -3101.003 *

Dynamic Effects:
 Event time  Estimate Std. Error [95% Pointwise  Conf. Band] 
          1 -4050.087  2172.0486       -8307.224    207.0497  
          2 -6640.275  2571.5133      -11680.348  -1600.2014 *
          3 -3806.172   964.4840       -5696.526  -1915.8178 *
          4 -2775.943   851.4626       -4444.779  -1107.1069 *
          5 -1471.990   585.7501       -2620.039   -323.9410 *

It would be helpful to be able to extract these two matrices from the AGGTEobj object:

Matrix 1

       ATT    Std. Error     CI95LB      CI95UB
 -4116.697      518.2209  -5132.392   -3101.003

Matrix 2

 Event time  Estimate Std. Error          CI95LB      CI95UB
          1 -4050.087  2172.0486       -8307.224    207.0497  
          2 -6640.275  2571.5133      -11680.348  -1600.2014
          3 -3806.172   964.4840       -5696.526  -1915.8178
          4 -2775.943   851.4626       -4444.779  -1107.1069
          5 -1471.990   585.7501       -2620.039   -323.9410

I made up the names for the confidence interval columns.
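Until such a component exists, the two matrices can be assembled by hand. The following is a minimal sketch, not part of did: `att_tables()` is a hypothetical helper, and the component names it reads (`overall.att`, `overall.se`, `egt`, `att.egt`, `se.egt`) are assumptions about the AGGTEobj structure that should be confirmed with `str()` on a real object. It uses the pointwise normal critical value, which reproduces the intervals printed above; the mock list only stands in for a fitted object.

```r
# Hypothetical helper (not part of did): build the two tables from an
# aggte()-style object. Component names are assumed; check str(obj).
att_tables <- function(obj, alp = 0.05) {
  # Pointwise normal critical value (about 1.96 when alp = 0.05)
  z <- stats::qnorm(1 - alp / 2)

  overall <- cbind(
    ATT          = obj$overall.att,
    `Std. Error` = obj$overall.se,
    CILB         = obj$overall.att - z * obj$overall.se,
    CIUB         = obj$overall.att + z * obj$overall.se
  )

  dynamic <- cbind(
    `Event time` = obj$egt,
    Estimate     = obj$att.egt,
    `Std. Error` = obj$se.egt,
    CILB         = obj$att.egt - z * obj$se.egt,
    CIUB         = obj$att.egt + z * obj$se.egt
  )

  list(overall = overall, dynamic = dynamic)
}

# Mock object holding the estimates and standard errors printed above
mock <- list(
  overall.att = -4116.697,
  overall.se  = 518.2209,
  egt         = 1:5,
  att.egt     = c(-4050.087, -6640.275, -3806.172, -2775.943, -1471.990),
  se.egt      = c(2172.0486, 2571.5133, 964.4840, 851.4626, 585.7501)
)

tabs <- att_tables(mock)
tabs$overall
tabs$dynamic
```

If the object was estimated with a uniform confidence band, the printed band would be based on a different critical value, so the pointwise intervals from this sketch would be narrower than the printed ones.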

Naming

I think the behavior should be similar to lm objects. coef(summary(lmobject)) (or summary(lmobject)$coefficients) returns a matrix like the larger table above.

The matrices above should not be called “coefficients”, though. Perhaps att_table and att_table_overall would be good names.
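For reference, the lm behavior mentioned above looks like this in base R. The example uses the built-in `mtcars` data purely for illustration. Note that the lm coefficient table itself carries no confidence limits; those come from a separate `confint()` call, so a single table with both would go one step further than lm does.

```r
# Base R only: how lm exposes its coefficient table
fit <- lm(mpg ~ wt, data = mtcars)

# Matrix with one row per coefficient
coef_table <- coef(summary(fit))
colnames(coef_table)

# Confidence limits are obtained separately
ci <- confint(fit, level = 0.95)

# Binding the two gives the kind of table requested in this issue
full_table <- cbind(coef_table[, c("Estimate", "Std. Error")], ci)
full_table
```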

Use case

I have a list of AGGTEobj objects and want to show confidence intervals for all of the ATTs in the same figure.
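A rough sketch of that use case, under the same caveats: `event_study_df()` and `plot_cis()` are hypothetical helpers, the component names `egt`, `att.egt`, and `se.egt` are assumed, and the intervals are pointwise normal intervals. The mock list reuses the numbers printed above and is entered twice only as a stand-in for a list of several fitted objects.

```r
# Hypothetical helper: one long data frame row per event time,
# tagged with a model label. Component names are assumed.
event_study_df <- function(obj, label, alp = 0.05) {
  z <- stats::qnorm(1 - alp / 2)
  data.frame(
    model      = label,
    event.time = obj$egt,
    estimate   = obj$att.egt,
    std.error  = obj$se.egt,
    conf.low   = obj$att.egt - z * obj$se.egt,
    conf.high  = obj$att.egt + z * obj$se.egt
  )
}

# Mock object with the estimates printed in this issue
mock <- list(
  egt     = 1:5,
  att.egt = c(-4050.087, -6640.275, -3806.172, -2775.943, -1471.990),
  se.egt  = c(2172.0486, 2571.5133, 964.4840, 851.4626, 585.7501)
)

# Stand-in for a list of AGGTEobj objects (same mock used twice)
models <- list(model_a = mock, model_b = mock)

# Stack the per-model tables into one data frame
all_cis <- do.call(rbind, Map(event_study_df, models, names(models)))

# Hypothetical helper: draw every interval in a single base-graphics figure
plot_cis <- function(df) {
  groups <- unique(df$model)
  idx <- match(df$model, groups)
  # Small horizontal offset so intervals from different models do not overlap
  x <- df$event.time + (idx - (length(groups) + 1) / 2) * 0.15
  plot(x, df$estimate, pch = 19, col = idx,
       ylim = range(df$conf.low, df$conf.high),
       xlab = "Event time", ylab = "ATT")
  segments(x, df$conf.low, x, df$conf.high, col = idx)
  abline(h = 0, lty = 2)
  legend("bottomright", legend = groups, col = seq_along(groups), pch = 19)
}

# Draw only when a graphics device is expected to be available
if (interactive()) plot_cis(all_cis)
```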

rcragun commented 1 year ago

It seems that MP and AGGTEobj objects have a tidy() method. This was not easy to find, so it might still help to have a component that includes the matrix (as in MPobject$att_table). This would also help for people who do not use tidyverse.

bcallaway11 commented 1 year ago

Yes, I agree with you. I'm going to mark this as an enhancement that we should work on.

Thank you!

rcragun commented 1 year ago

Note that even with the tidy method, it still does not seem possible to easily extract the overall confidence intervals from AGGTEobj objects, so that might be the highest priority.

pedrohcgs commented 1 year ago

Hi, Thanks for the flag. It would be great if you could propose the improvements and make a pull request.

--

Pedro H. C. Sant'Anna, Department of Economics, Vanderbilt University, https://pedrohcgs.github.io

valentinyverse commented 1 year ago

Hi,

I stumbled over the same issue and saw that the tidy() function lets you pull out estimates, standard errors, and confidence intervals. The problem, though, is that tidy() reports different conf.high and conf.low values for the same object than the object itself shows when printed in the console. I believe this is due to a difference in how the confidence interval is calculated in each case within the did package.

See my Stack Overflow question on the same issue and the response I got: https://stackoverflow.com/questions/76056729/why-is-the-confidence-interval-different-when-using-tidy-on-output-of-aggte/76064957#76064957
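One way to check where two sets of limits diverge is to back out the critical value implied by each reported interval. The sketch below is only a diagnostic, not an explanation taken from the package source: `implied_crit()` is a hypothetical helper, and it assumes symmetric intervals of the form estimate plus or minus critical value times standard error. Applied to the dynamic effects printed at the top of this issue, it recovers roughly 1.96, the pointwise normal value; running it on the conf.low and conf.high columns from tidy() would show whether a different critical value is in play there.

```r
# Hypothetical diagnostic: critical value implied by a symmetric interval
implied_crit <- function(conf.low, conf.high, std.error) {
  (conf.high - conf.low) / (2 * std.error)
}

# Limits and standard errors as printed in the console output above
printed <- data.frame(
  std.error = c(2172.0486, 2571.5133, 964.4840, 851.4626, 585.7501),
  conf.low  = c(-8307.224, -11680.348, -5696.526, -4444.779, -2620.039),
  conf.high = c(207.0497, -1600.2014, -1915.8178, -1107.1069, -323.9410)
)

crit_printed <- with(printed, implied_crit(conf.low, conf.high, std.error))
round(crit_printed, 3)

# Reference value for a pointwise 95% normal interval
qnorm(0.975)
```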