ebenmichael / augsynth

Augmented Synthetic Control Method
MIT License

Covariates and Sort Order produce distinct treatment estimates #55

Closed: davidnathanlang closed this issue 3 years ago

davidnathanlang commented 3 years ago

I am getting different ATT estimates from synthetic control depending on the sort order of the unit names. I wanted to check whether this is expected behavior. I initially thought it might be an issue with cross-validation in ridge regression, but I still get different results with vanilla synthetic control. See the reprex below (thanks @williamlief for the example).

library(augsynth)
library(rlang)
library(dplyr)  # provides %>%, select(), and distinct() used below

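# Fit vanilla synthetic control (no outcome model, no fixed effects),
# identifying units by the column passed as `unit`
syn_wrap <- function(unit) {
  augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
             log(revlocalcapita) + log(avgwklywagecapita) +
             estabscapita + emplvlcapita,
           {{ unit }}, year_qtr, kansas,
           progfunc = "none", scm = TRUE, fixedeff = FALSE)
}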

set.seed(123)

syn_fips <- syn_wrap(fips)
syn_abb <- syn_wrap(abb) 
syn_state <- syn_wrap(state)

# syn_fips and syn_state are equal, syn_abb is different
syn_fips # -0.032
syn_abb # -0.005 (SIC)
syn_state # -0.032

# Visualize estimates
library(patchwork)
library(ggplot2)

p1 <- plot(syn_fips) + labs(title = "Unit = fips")
p2 <- plot(syn_abb) + labs(title = "Unit = abb")
p3 <- plot(syn_state) + labs(title = "Unit = state")

# Checking source data 
distinct_kansas <- kansas %>% select(fips, abb, state) %>% 
  distinct()

# It's possible the unit sort order matters: fips and state sort the same way,
# but abb sorts differently.
wrap_plots(p1, p2, p3, nrow = 3) 
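
The sort-order hunch in the comments above can be checked without fitting anything. The following is a minimal sketch that uses base R's built-in `state.name` and `state.abb` vectors as a stand-in for the `state` and `abb` columns of `kansas` (an assumption: the real data may also include units such as DC that these vectors lack). It only shows that ordering units by abbreviation gives a different permutation than ordering by full name.

```r
# Stand-in for the unit columns of `kansas` (assumption: base R's 50-state vectors)
units <- data.frame(state = state.name, abb = state.abb)

# Unit names ordered by full state name vs. by two-letter abbreviation.
# method = "radix" sorts in the C locale, so the result is locale-independent.
by_state <- units$state[order(units$state, method = "radix")]
by_abb   <- units$state[order(units$abb, method = "radix")]

# FALSE means the two keys put the units in a different order
same_order <- identical(by_state, by_abb)
print(same_order)

# The first few units already disagree (e.g. "AK" sorts before "AL")
print(head(by_state, 4))
print(head(by_abb, 4))
```

If any step of the fit orders units by the key while another step keeps a different order, a difference like this one would be enough to change the result.
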
davidnathanlang commented 3 years ago

For visual reference: [image]

williamlief commented 3 years ago

When we do not include covariates in the specification, we get identical results across the unit variables.

library(augsynth)
library(rlang)

# Same fit as syn_wrap(), but without auxiliary covariates
syn_wrap_nocov <- function(unit) {
  augsynth(lngdpcapita ~ treated,
           {{ unit }}, year_qtr, kansas,
           progfunc = "none", scm = TRUE, fixedeff = FALSE)
}

syn_wrap_nocov(fips) # -0.029
syn_wrap_nocov(abb)  # -0.029
syn_wrap_nocov(state) # -0.029
ebenmichael commented 3 years ago

Thanks for catching this! It was indeed a sorting issue with covariates. It should be fixed now.
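
For readers wondering how a sorting issue with covariates could produce the pattern reported in this thread, here is a toy sketch of the general failure mode. It is not the augsynth code or the actual fix; the data frame `toy` and the helpers `pair_buggy`, `pair_aligned`, and `same_pairs` are hypothetical names made up for illustration. The idea is that if outcomes are ordered by the unit key while covariates keep some other order, each unit gets paired with another unit's covariates, and which unit depends on how the key sorts. With no covariates there is only one ordering, so nothing can be misaligned, which would be consistent with the no-covariate results above.

```r
# Hypothetical toy data: 4 units with two alternative unit keys,
# one outcome (y) and one covariate (x) per unit
toy <- data.frame(
  id  = c(1, 2, 3, 4),
  abb = c("ZZ", "CC", "AA", "BB"),  # sorts differently from `id`
  y   = c(10, 20, 30, 40),
  x   = c(0.1, 0.2, 0.3, 0.4)
)

# Buggy pairing: outcomes are sorted by the unit key,
# but covariates are left in input order
pair_buggy <- function(data, key) {
  ord <- order(data[[key]])
  data.frame(y = data$y[ord], x = data$x)
}

# Aligned pairing: outcomes and covariates share the same ordering
pair_aligned <- function(data, key) {
  ord <- order(data[[key]])
  data.frame(y = data$y[ord], x = data$x[ord])
}

# TRUE when two results attach the same covariate to each outcome,
# regardless of row order (y values are unique in the toy data)
same_pairs <- function(a, b) {
  identical(a$x[order(a$y)], b$x[order(b$y)])
}

# Buggy version: the unit-covariate pairing depends on the key
print(same_pairs(pair_buggy(toy, "id"), pair_buggy(toy, "abb")))

# Aligned version: the pairing is the same for either key
print(same_pairs(pair_aligned(toy, "id"), pair_aligned(toy, "abb")))
```

In this sketch the buggy version happens to be correct for `id` only because the input rows are already in `id` order, which mirrors how two unit keys that sort the same way can agree with each other while a third differs.
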