ebenmichael / augsynth

Augmented Synthetic Control Method
MIT License

Confirm Jackknife+ is working as intended? #59

Closed davidnathanlang closed 2 years ago

davidnathanlang commented 3 years ago

I wanted to check whether jackknife+ is working as intended. I have an example where the effect estimates fall outside the associated 95% confidence interval for a given week, specifically weeks 7 and 8 below. I understand that jackknife+ can construct asymmetric confidence intervals, but it seems bizarre for an estimate to fall outside its own confidence interval. Is this the intended behavior? @williamlief

library(tidyverse)
library(augsynth)
library(patchwork)
dat <- read_rds(url("https://github.com/davidnathanlang/augsynth_issue/raw/main/simple_dat.rds"))
asynth <- augsynth::augsynth(
  outcome ~ treat,
  unit = state,
  time = centered_week,
  data = dat,
  progfunc = "Ridge", fixedeff = TRUE)
#> One outcome and one treatment time found. Running single_augsynth.

conformal <- plot(asynth, inf_type = "conformal") + labs(title = "Conformal Inference")
jackknife_plus <- plot(asynth, inf_type = "jackknife+") + labs(title = "Jackknife+ Inference")

conformal + jackknife_plus # Huge difference in Standard Errors


(sum_model<-summary(asynth,inf_type = "jackknife+"))
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "Ridge", fixedeff = TRUE)
#> 
#> Average ATT Estimate:  7.527 
#> L2 Imbalance: 0.000
#> Percent improvement from uniform weights: 100%
#> 
#> Avg Estimated Bias: -10.045
#> 
#> Inference type: Jackknife+ over time periods
#> 
#>  Time Estimate 95% CI Lower Bound 95% CI Upper Bound
#>     0    2.003             -2.947              2.284
#>     1    3.776             -4.061              3.927
#>     2    5.736             -4.147              5.897
#>     3    6.424             -4.500              6.491
#>     4    7.851             -4.502              7.932
#>     5    9.160             -4.505              9.188
#>     6    8.977             -4.769              9.069
#>     7   10.772             -4.825             10.594
#>     8    9.316             -5.306              9.153
#>     9   10.291             -5.067             10.387
#>    10    8.962             -5.535              9.165
#>    11    8.838             -5.691              9.072
#>    12    8.566             -5.828              8.792
#>    13    8.106             -5.959              8.352
#>    14    7.245             -6.327              7.612
#>    15    6.946             -6.513              7.398
#>    16    6.478             -6.890              6.982
#>    17    6.040             -7.167              6.614
sum_model$att %>% as_tibble() %>% filter(Estimate > upper_bound) # Weeks 7 and 8 have point estimates that are outside of their confidence intervals
#> # A tibble: 2 x 4
#>    Time Estimate lower_bound upper_bound
#>   <dbl>    <dbl>       <dbl>       <dbl>
#> 1     7    10.8        -4.82       10.6 
#> 2     8     9.32       -5.31        9.15
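
The filter above only catches estimates above the upper bound. A minimal sketch of a two-sided check follows, using base R and a small data frame typed in by hand from rows 6 to 9 of the summary table above. The object names `att` and `outside` are illustrative only and are not part of augsynth.

```r
# Rows 6-9 copied by hand from the printed summary table above;
# column names mirror those shown in the tibble output.
att <- data.frame(
  Time        = c(6, 7, 8, 9),
  Estimate    = c(8.977, 10.772, 9.316, 10.291),
  lower_bound = c(-4.769, -4.825, -5.306, -5.067),
  upper_bound = c(9.069, 10.594, 9.153, 10.387)
)

# Flag rows where the point estimate lies outside the interval on either side.
is_outside <- att$Estimate < att$lower_bound | att$Estimate > att$upper_bound
outside <- att[is_outside, ]

print(outside)
```

The same condition could be passed to `filter()` on `sum_model$att` in place of `Estimate > upper_bound`.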

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19043)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] patchwork_1.1.1 augsynth_0.2.0  forcats_0.5.1   stringr_1.4.0  
#>  [5] dplyr_1.0.7     purrr_0.3.4     readr_1.4.0     tidyr_1.1.3    
#>  [9] tibble_3.1.2    ggplot2_3.3.5   tidyverse_1.3.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.7        lubridate_1.7.10  lattice_0.20-44   assertthat_0.2.1 
#>  [5] digest_0.6.27     utf8_1.2.1        R6_2.5.0          cellranger_1.1.0 
#>  [9] backports_1.2.1   reprex_2.0.0      evaluate_0.14     httr_1.4.2       
#> [13] highr_0.9         pillar_1.6.2      rlang_0.4.11      readxl_1.3.1     
#> [17] rstudioapi_0.13   Matrix_1.3-3      rmarkdown_2.9     styler_1.5.1     
#> [21] labeling_0.4.2    osqp_0.6.0.3      munsell_0.5.0     broom_0.7.9      
#> [25] compiler_4.1.0    modelr_0.1.8      xfun_0.24         pkgconfig_2.0.3  
#> [29] htmltools_0.5.1.1 tidyselect_1.1.1  fansi_0.5.0       crayon_1.4.1     
#> [33] dbplyr_2.1.1      withr_2.4.2       grid_4.1.0        jsonlite_1.7.2   
#> [37] gtable_0.3.0      lifecycle_1.0.0   DBI_1.1.1         magrittr_2.0.1   
#> [41] scales_1.1.1      cli_3.0.1         stringi_1.6.2     farver_2.1.0     
#> [45] fs_1.5.0          xml2_1.3.2        ellipsis_0.3.2    generics_0.1.0   
#> [49] vctrs_0.3.8       Formula_1.2-4     tools_4.1.0       glue_1.4.2       
#> [53] hms_1.1.0         yaml_2.2.1        colorspace_2.0-2  rvest_1.0.0      
#> [57] LowRankQP_1.0.4   knitr_1.33        haven_2.4.1

Created on 2021-09-17 by the reprex package (v2.0.0)

ebenmichael commented 2 years ago

This certainly is pretty strange behavior, and I can't tell if it's a bug or just an odd edge case. I'll dig into this, but I think the conformal inference procedure works better and is more stable, and this implementation of the jackknife+ might end up being dropped.
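
For context on why an estimate can land outside its own interval: a jackknife+ interval is built from leave-one-out fits and leave-one-out residuals, not centered on the full-data estimate, so nothing in the construction forces the full-data estimate to lie inside it. Below is a minimal sketch of the generic jackknife+ for a one-predictor linear model in base R. It is not the augsynth implementation, which jackknifes over time periods, and the function name `jackknife_plus_lm` and the simulated data are assumptions made purely for illustration.

```r
# Generic jackknife+ prediction interval for a simple linear model.
# NOT augsynth's implementation; names and data here are hypothetical.
jackknife_plus_lm <- function(x, y, x_new, alpha = 0.05) {
  n <- length(y)
  lo <- numeric(n)
  hi <- numeric(n)
  for (i in seq_len(n)) {
    # Refit without observation i
    fit_i <- lm(y ~ x, data = data.frame(x = x[-i], y = y[-i]))
    pred_i <- unname(predict(fit_i, newdata = data.frame(x = x[i])))
    pred_new <- unname(predict(fit_i, newdata = data.frame(x = x_new)))
    r_i <- abs(y[i] - pred_i)   # leave-one-out residual
    lo[i] <- pred_new - r_i     # endpoints use the leave-one-out prediction,
    hi[i] <- pred_new + r_i     # not the full-data prediction
  }
  # Order statistics used for the lower and upper endpoints
  k_lo <- floor(alpha * (n + 1))
  k_hi <- ceiling((1 - alpha) * (n + 1))
  lower <- if (k_lo >= 1) sort(lo)[k_lo] else -Inf
  upper <- if (k_hi <= n) sort(hi)[k_hi] else Inf
  # Full-data point estimate, computed separately from the interval
  full_fit <- lm(y ~ x, data = data.frame(x = x, y = y))
  estimate <- unname(predict(full_fit, newdata = data.frame(x = x_new)))
  c(estimate = estimate, lower = lower, upper = upper)
}

set.seed(1)
x <- rnorm(30)
y <- 2 * x + rnorm(30)
res <- jackknife_plus_lm(x, y, x_new = 0.5)
print(res)
```

In a stable fit like this simulated one, the leave-one-out predictions sit close to the full-data prediction, so the estimate would normally fall inside the interval. When dropping a single observation shifts the fit substantially, the two can separate, which may be what is happening in weeks 7 and 8.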