kylebutts / did2s

Two-stage Difference-in-Differences package following Gardner (2021)
http://kylebutts.github.io/did2s

Gardner (2021) Fails (GitHub version) #31

eohne closed this issue 7 months ago

eohne commented 7 months ago

Hi Kyle

Thanks a lot for this package! Nice work.

I previously used the CRAN version, and everything worked fine there. I then installed the GitHub version, and now both the static model and the event study fail for the Gardner (2021) estimation:

library(did2s)
x <- event_study(
    data = df_het, yname = "dep_var", idname = "unit",
    tname = "year", gname = "g", estimator = "all"
)

Output:

Note these estimators rely on different underlying assumptions. See Table 2 of `https://arxiv.org/abs/2109.05913` for an overview.
Estimating TWFE Model
Estimating using Gardner (2021)
Error : j (the 2nd argument inside [...]) is a single symbol but column name 'all_vars' is not found. If you intended to select columns using a variable in calling scope, please try DT[, ..all_vars]. The .. prefix conveys one-level-up similar to a file system path.
Estimating using Callaway and Sant'Anna (2020)
Estimating using Sun and Abraham (2020)
Estimating using Borusyak, Jaravel, Spiess (2021)
Estimating using Roth and Sant'Anna (2021)
Warning message:
In event_study(data = df_het, yname = "dep_var", idname = "unit",  :
  Gardner (2021) Failed

Similarly, if I run the example code for the static did2s regression from the manual:

static <- did2s(df_hom,
    yname = "dep_var", treatment = "treat", cluster_var = "state",
    first_stage = ~ 0 | unit + year,
    second_stage = ~ i(treat, ref = FALSE)
)

Output:

Running Two-stage Difference-in-Differences
 - first stage formula `~ 0 | unit + year`
 - second stage formula `~ i(treat, ref = FALSE)`
 - The indicator variable that denotes when treatment is on is `treat`
 - Standard errors will be clustered by `state`

Error: j (the 2nd argument inside [...]) is a single symbol but column name 'all_vars' is not found. If you intended to select columns using a variable in calling scope, please try DT[, ..all_vars]. The .. prefix conveys one-level-up similar to a file system path.

Session Info:

R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] did2s_1.1.0   fixest_0.11.2

loaded via a namespace (and not attached):
 [1] sandwich_3.1-0      utf8_1.2.4          generics_0.1.3     
 [4] tidyr_1.3.1         rstatix_0.7.2       stringi_1.8.3      
 [7] dreamerr_1.4.0      lattice_0.21-9      magrittr_2.0.3     
[10] grid_4.3.2          Matrix_1.6-1.1      backports_1.4.1    
[13] Formula_1.2-5       purrr_1.0.2         fansi_1.0.6        
[16] scales_1.3.0        didimputation_0.3.0 numDeriv_2016.8-1.1
[19] abind_1.4-5         cli_3.6.2           rlang_1.1.3        
[22] BMisc_1.4.5         munsell_0.5.0       withr_3.0.0        
[25] DRDID_1.0.6         tools_4.3.2         ggsignif_0.6.4     
[28] dplyr_1.1.4         coop_0.6-3          colorspace_2.1-0   
[31] ggplot2_3.4.4       did_2.1.2           ggpubr_0.6.0       
[34] broom_1.0.5         vctrs_0.6.5         R6_2.5.1           
[37] zoo_1.8-12          lifecycle_1.0.4     stringr_1.5.1      
[40] car_3.1-2           MASS_7.3-60         trust_0.1-8        
[43] pkgconfig_2.0.3     pillar_1.9.0        gtable_0.3.4       
[46] data.table_1.14.99  glue_1.7.0          Rcpp_1.0.12        
[49] tibble_3.2.1        tidyselect_1.2.0    staggered_1.1      
[52] nlme_3.1-163        carData_3.0-5       compiler_4.3.2     

I installed the dev version of data.table just before running:

install.packages("data.table", repos="https://Rdatatable.gitlab.io/data.table")

I will use the CRAN version for now. In the meantime, could you give me a quick summary of the differences between the two versions, beyond the GitHub version being meant to run faster or to handle larger datasets?

kylebutts commented 7 months ago

Hi @eohne, the reason this fails is a bit complicated. A data.table object causes problems in fixest:::prepare_df. A fix is proposed in https://github.com/lrberge/fixest/pull/435, but it has not been merged yet.
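For anyone curious about the error message itself, here is a minimal sketch of the behaviour it describes. It is not the actual fixest code: the toy data and the use of `all_vars` as a character vector of column names are assumptions made only for illustration. A data.frame looks up a bare symbol in the calling scope, while a data.table treats it as a column name.

```r
library(data.table)

# Toy panel; column names mirror the issue, the values are made up
df <- data.frame(
    unit = c(1, 1, 2, 2),
    year = c(2000, 2001, 2000, 2001),
    dep_var = c(0.5, 1.2, 0.3, 0.9)
)
dt <- as.data.table(df)

# Hypothetical vector of column names, named after the symbol in the error
all_vars <- c("unit", "dep_var")

# data.frame: `all_vars` is evaluated in the calling scope, so this
# selects the two named columns
sub_df <- df[, all_vars]

# data.table: a bare symbol in `j` is treated as a column name, so this
# raises the "single symbol but column name 'all_vars' is not found" error
err <- tryCatch(dt[, all_vars], error = function(e) conditionMessage(e))
print(err)

# The `..` prefix suggested by the message looks one level up instead
sub_dt <- dt[, ..all_vars]
```

Code written with data.frame semantics in mind can therefore break when it is handed a data.table, which matches the symptom reported above.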

So, a short-term fix is converting to a regular data.frame: as.data.frame(df_het). I also changed the structure of df_het/df_hom to be data.frame objects to avoid this issue with my own example code 🙃
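A rough sketch of that workaround as a reusable step. `to_plain_df` is a hypothetical helper, not part of did2s or fixest, and the toy data is made up; the only real call involved is base R's `as.data.frame()`.

```r
# Hypothetical helper: strip the data.table class before estimation,
# and leave anything else untouched
to_plain_df <- function(data) {
    if (inherits(data, "data.table")) {
        return(as.data.frame(data))
    }
    data
}

# Toy data; column names mirror the issue, the values are made up
toy <- data.frame(unit = c(1, 2), year = c(2000, 2000), dep_var = c(0.5, 0.3))

# Mimic a data.table by class only, so the sketch runs even without
# data.table installed (an assumption made purely for the demo)
toy_dt <- structure(toy, class = c("data.table", "data.frame"))

plain <- to_plain_df(toy_dt)
print(class(plain))
```

With such a helper, the calls from the original report would pass `to_plain_df(df_het)` or `to_plain_df(df_hom)` as the data argument and keep every other argument unchanged.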

eohne commented 7 months ago

Amazing! Thanks a lot also for the super fast reply!

lrberge commented 7 months ago

@kylebutts: sorry for the delay, the fix should be there soon now!!!