ebenmichael / augsynth

Augmented Synthetic Control Method
MIT License

Confidence Intervals are asymmetric and capped to zero. #60

Closed: davidnathanlang closed this issue 2 years ago

davidnathanlang commented 3 years ago
library(tidyverse)
library(augsynth)

dat <- read_rds(url("https://github.com/davidnathanlang/augsynth_issue/blob/main/asymetric_confidence_interval_reprex.rds?raw=true")) 
asynth <- augsynth::augsynth(
  outcome ~ treat,
  unit = state,
  time = centered_week,
  data = dat ,
  progfunc = "None", fixedeff = FALSE)
#> One outcome and one treatment time found. Running single_augsynth.

(sum<-summary(asynth)) 
#> 
#> Call:
#> single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
#>     t_int = t_int, data = data, progfunc = "None", fixedeff = FALSE)
#> 
#> Average ATT Estimate (p Value for Joint Null):  0.526   ( 0.954 )
#> L2 Imbalance: 0.424
#> Percent improvement from uniform weights: 93.9%
#> 
#> Avg Estimated Bias: NA
#> 
#> Inference type: Conformal inference
#> 
#>  Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
#>     0   -0.139             -2.743              2.464   0.500
#>     1   -0.427             -3.031              2.177   0.278
#>     2   -0.674             -3.278              1.930   0.389
#>     3   -0.686             -3.290              1.918   0.444
#>     4   -0.585             -3.189              2.019   0.611
#>     5   -0.573             -3.177              2.031   0.778
#>     6   -0.434             -3.038              2.170   0.833
#>     7    0.100             -2.504              2.704   0.944
#>     8    0.324             -2.280              2.927   0.500
#>     9    0.505             -2.099              3.109   0.333
#>    10    1.394             -1.210              3.998   0.111
#>    11    1.803             -0.801              4.407   0.167
#>    12    2.124             -0.480              4.728   0.167
#>    13    2.402             -0.202              5.006   0.167
#>    14    2.756              0.000              5.360   0.222  # Asymmetric confidence interval
sum$att %>%
  filter(Time >= 0) %>%
  mutate(left_interval = Estimate - lower_bound, right_interval = upper_bound-Estimate) %>%
  filter(left_interval-right_interval>0.00001)
#>    Time Estimate lower_bound upper_bound     p_val left_interval right_interval
#> 14   14 2.756339           0    5.360227 0.2222222      2.756339       2.603888

Created on 2021-09-21 by the reprex package (v2.0.0)

ebenmichael commented 3 years ago

Hi, this is a consequence of the optional grid_size parameter of the conformal inference routine. The confidence interval consists of all effect sizes that we can't reject. grid_size sets the size of the grid to search over when computing the intervals, and so determines how fine-grained the bounds are.

No matter the grid size, though, it's hard-coded to check whether an effect of zero can be rejected. So in this case, I think what's happening is that 0 can't be rejected but the next lowest effect in the grid can be, and so the lower bound is exactly zero. Try increasing grid_size and see what happens.
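
To make the mechanism concrete, here is a minimal sketch of grid-based test inversion. It is a toy and not augsynth's actual code: toy_p_value() is a made-up normal-approximation test standing in for the conformal test, and grid_ci(), est, se and half_range are hypothetical names and values chosen for illustration only.

```r
# Toy p-value for the null "effect == tau" (a normal approximation standing in
# for the conformal test; an assumption, NOT augsynth's actual procedure).
toy_p_value <- function(tau, est = 2.5, se = 1.5) {
  2 * pnorm(-abs(est - tau) / se)
}

# Invert the test over a grid: the interval is the range of grid points that
# cannot be rejected at level alpha. Zero is always tested in addition to the
# regular grid, mirroring the hard-coded check described above.
grid_ci <- function(grid_size, est = 2.5, half_range = 6, alpha = 0.05) {
  grid <- seq(est - half_range, est + half_range, length.out = grid_size)
  grid <- sort(unique(c(grid, 0)))  # force zero into the set of candidates
  accepted <- grid[toy_p_value(grid, est = est) > alpha]
  c(lower = min(accepted), upper = max(accepted))
}

coarse <- grid_ci(grid_size = 7)     # few candidate effects
fine   <- grid_ci(grid_size = 1201)  # many candidate effects

print(coarse)
print(fine)
```

In this toy setup the coarse grid has no point between 0 and the next lower candidate that survives the test, so the lower bound lands exactly on 0 and the interval is lopsided around the estimate. With the fine grid, the bounds settle close to the symmetric interval the toy test implies.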

davidnathanlang commented 3 years ago
(sum<-summary(asynth,grid_size=50000)) 

Call:
single_augsynth(form = form, unit = !!enquo(unit), time = !!enquo(time), 
    t_int = t_int, data = data, progfunc = "None", fixedeff = FALSE)

Average ATT Estimate (p Value for Joint Null):  0.526   ( 0.951 )
L2 Imbalance: 0.424
Percent improvement from uniform weights: 93.9%

Avg Estimated Bias: NA

Inference type: Conformal inference

 Time Estimate 95% CI Lower Bound 95% CI Upper Bound p Value
    0   -0.139             -2.743              2.464   0.500
    1   -0.427             -3.031              2.177   0.278
    2   -0.674             -3.278              1.930   0.389
    3   -0.686             -3.290              1.918   0.444
    4   -0.585             -3.189              2.019   0.611
    5   -0.573             -3.177              2.031   0.778
    6   -0.434             -3.038              2.170   0.833
    7    0.100             -2.504              2.704   0.944
    8    0.324             -2.280              2.927   0.500
    9    0.505             -2.099              3.109   0.333
   10    1.394             -1.210              3.998   0.111
   11    1.803             -0.801              4.407   0.167
   12    2.124             -0.480              4.728   0.167
   13    2.402             -0.202              5.006   0.167
   14    2.756              0.000              5.360   0.222

Did not appear to make a difference, even when I updated the grid size to 50,000.
Thanks for the suggestion though.

ebenmichael commented 3 years ago

Hmm, all of the confidence interval bounds are the same, so it seems that the optional parameter isn't actually exposed. I'll look into this. Regardless, the p-value for the last time period is 0.222, so 0 can't be rejected at the 5% level. If it could be, it wouldn't be part of the confidence interval.
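
For readers wondering how an argument can be accepted without error and still have no effect: one common way this happens in R is a wrapper that takes `...` but never forwards it. The sketch below only illustrates that general pattern. Every function name in it is hypothetical, and it is an assumption, not a diagnosis of what augsynth does internally.

```r
# Hypothetical inner routine that actually uses grid_size.
inner_ci <- function(est, grid_size = 50) {
  grid <- seq(est - 1, est + 1, length.out = grid_size)
  range(grid[abs(grid - est) <= 0.5])
}

# Hypothetical wrapper that accepts `...` but never forwards it,
# so grid_size is silently ignored and no error is raised.
summary_dropping_dots <- function(est, ...) {
  inner_ci(est)
}

# Hypothetical wrapper that forwards `...` to the inner routine.
summary_forwarding_dots <- function(est, ...) {
  inner_ci(est, ...)
}

default_ci   <- summary_dropping_dots(2)
ignored_ci   <- summary_dropping_dots(2, grid_size = 50000)
forwarded_ci <- summary_forwarding_dots(2, grid_size = 50000)

# Compare the results: identical bounds mean the argument was dropped.
print(identical(default_ci, ignored_ci))
print(identical(default_ci, forwarded_ci))
```

Comparing the outputs of two calls with identical(), as done here, is also a quick way to check whether a setting such as grid_size changed anything in a real summary.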