bcallaway11 / did

Difference in Differences with Multiple Periods, website: https://bcallaway11.github.io/did

Cannot simulate data as per intro vignette #120

Closed grantmcdermott closed 1 year ago

grantmcdermott commented 2 years ago

Hi guys.

Hopefully I'm not doing something dumb here. But I'm unable to follow along with the example from the intro vignette.

library(did)

# set seed so everything is reproducible
set.seed(1814)

# generate dataset with 4 time periods
time.periods <- 4

# add dynamic effects
te.e <- 1:time.periods

# generate data set with these parameters
# here, we dropped all units who are treated in time period 1 as they do not help us recover ATT(g,t)'s.
dta <- build_sim_dataset()
#> Error in build_sim_dataset(): argument "sp_list" is missing, with no default

# How many observations remained after dropping the ``always-treated'' units
nrow(dta)
#> Error in nrow(dta): object 'dta' not found
#This is what the data looks like
head(dta)
#> Error in head(dta): object 'dta' not found

Created on 2022-03-24 by the reprex package (v2.0.1)

The build_sim_dataset documentation is a little unclear to me. But I think the sp_list argument is supposed to be defined before calling the function (through the secondary reset.sim() function?). It might just be that you're missing the environment scope for user-defined values, or that things aren't getting passed through ... correctly.
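For reference, the error above is what R reports whenever a function uses an argument that has no default and was not supplied. The sketch below reproduces that mechanism in base R only; build_demo is a hypothetical stand-in written for illustration and is not part of did.

```r
# Hypothetical stand-in (not a did function): it needs a parameter list
# and, like the call above, has no default for that argument.
build_demo <- function(sp_list) {
  data.frame(period = seq_len(sp_list$time.periods))
}

# Calling it with no argument fails as soon as sp_list is used.
msg <- tryCatch(build_demo(), error = function(e) conditionMessage(e))
print(msg)

# Building the parameter list first lets the call succeed.
demo <- build_demo(list(time.periods = 4))
print(nrow(demo))
```

To see which arguments the real function expects, names(formals(did::build_sim_dataset)) should list them.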

Thanks.

PS: It kind of works if I define an intermediate object using reset.sim. But I don't get the same values as the intro vignette (a different number of rows, for starters).

library(did)
set.seed(1814)
time.periods <- 4
te.e <- 1:time.periods
sp = reset.sim(time.periods) ## GM: Added (not sure what to use te.e for)
dta <- build_sim_dataset(sp)

nrow(dta) ## Different to vignette (which has 27,952 rows)
#> [1] 15916

Created on 2022-03-24 by the reprex package (v2.0.1)

Session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblas_haswellp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] did_2.1.1

loaded via a namespace (and not attached):
 [1] styler_1.7.0      tidyselect_1.1.2  xfun_0.30         purrr_0.3.4       carData_3.0-5     colorspace_2.0-3 
 [7] vctrs_0.3.8       generics_0.1.2    htmltools_0.5.2   yaml_2.3.5        utf8_1.2.2        rlang_1.0.2      
[13] R.oo_1.24.0       ggpubr_0.4.0      pillar_1.7.0      glue_1.6.2        withr_2.5.0       DBI_1.1.2        
[19] R.utils_2.11.0    lifecycle_1.0.1   R.cache_0.15.0    munsell_0.5.0     ggsignif_0.6.3    gtable_0.3.0     
[25] R.methodsS3_1.8.1 evaluate_0.15     knitr_1.37        callr_3.7.0       fastmap_1.1.0     ps_1.6.0         
[31] parallel_4.1.2    fansi_1.0.2       highr_0.9         broom_0.7.12      Rcpp_1.0.8.3      clipr_0.8.0      
[37] backports_1.4.1   scales_1.1.1      BMisc_1.4.4       abind_1.4-5       fs_1.5.2          ggplot2_3.3.5    
[43] digest_0.6.29     processx_3.5.2    rstatix_0.7.0     dplyr_1.0.8       grid_4.1.2        cli_3.2.0        
[49] tools_4.1.2       magrittr_2.0.2    tibble_3.1.6      crayon_1.5.0      car_3.0-12        tidyr_1.2.0      
[55] pkgconfig_2.0.3   ellipsis_0.3.2    data.table_1.14.3 reprex_2.0.1      assertthat_0.2.1  rmarkdown_2.13   
[61] rstudioapi_0.13   R6_2.5.1          compiler_4.1.2 

bcallaway11 commented 2 years ago

Yeah, I updated those functions but forgot to update the corresponding documentation.

Those functions are mostly internal and used for testing, but I'll get this fixed as soon as possible.

grantmcdermott commented 2 years ago

Thanks @bcallaway11.

bcallaway11 commented 2 years ago

https://bcallaway11.github.io/did/articles/did-basics.html is updated now and everything should work.

tra6sdc commented 1 year ago

I tried the updated code and got an error message:

> # set seed so everything is reproducible
> set.seed(1814)
> 
> # generate dataset with 4 time periods
> time.periods <- 4
> 
> # add dynamic effects
> sp$te.e <- 1:time.periods
Error in sp$te.e <- 1:time.periods : object 'sp' not found
> 
> # generate data set with these parameters
> # here, we dropped all units who are treated in time period 1 as they do not help us recover ATT(g,t)'s.
> dta <- build_sim_dataset(sp)
Error in build_sim_dataset(sp) : object 'sp' not found

bcallaway11 commented 1 year ago

Ah yes, there is still an issue (not sure how I was able to get this to run). I think that you need to add a line like:

sp <- reset.sim(time.periods=time.periods)

immediately after the fourth line.

I’m just on my iPad at the moment, but I’ll confirm that works as soon as possible.
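Put together, the suggested fix might look like the sketch below. It only combines calls that already appear in this thread. Placing the reset.sim() line right after time.periods is defined is an assumption about what "the fourth line" refers to, and the resulting row count may not match the vignette.

```r
library(did)

# set seed so everything is reproducible
set.seed(1814)

# generate dataset with 4 time periods
time.periods <- 4

# create the simulation parameters (the line that was missing);
# placement here is an assumption, see the note above
sp <- reset.sim(time.periods = time.periods)

# add dynamic effects
sp$te.e <- 1:time.periods

# generate data set with these parameters
dta <- build_sim_dataset(sp)

# inspect the result
nrow(dta)
head(dta)
```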

tra6sdc commented 1 year ago

It works fine for me now.