kylebutts / did2s

Two-stage Difference-in-Differences package following Gardner (2021)
http://kylebutts.github.io/did2s
Other
96 stars 22 forks source link

connecting did2s to honest did #22

Closed Randol340 closed 1 year ago

Randol340 commented 1 year ago

Dear authors, Thank you very much for writing the code and detailed documentation for the two-stage diff-in-diff! I have a question about connecting did2s to honest did. On this documentation of honest did https://github.com/asheshrambachan/HonestDiD, there is an example of transforming the Callaway and Santanna result and then apply to the honest did function. I wonder if (and how) we can use to results from diid2s in honest did? Thank you very much for your suggestion!

kylebutts commented 1 year ago

I'm not sure I understand what you want on did2s's end. A did2s call returns a fixest object, you can follow the README and get betahat and sigma and use any function from HonestDiD as needed.

library(fixest)
library(data.table)
library(did2s)
library(HonestDiD)

data(df_hom)
twfe_results <- did2s(df_hom,
  yname = "dep_var", treatment = "treat", cluster_var = "state",
  first_stage = ~ 0 | unit + year,
  second_stage = ~ i(rel_year, ref = c(-1, Inf))
)

betahat <- summary(twfe_results)$coefficients # save the coefficients
sigma <- summary(twfe_results)$cov.scaled # save the covariance matrix

HonestDiD::createSensitivityResults_relativeMagnitudes(
  betahat = betahat, # coefficients
  sigma = sigma, # covariance matrix
  numPrePeriods = 20, # num. of pre-treatment coefs
  numPostPeriods = 20, # num. of post-treatment coefs
  Mbarvec = seq(0.5, 2, by = 0.5) # values of Mbar
)
Randol340 commented 1 year ago

Thank you very much for your reply! If I understand the honest did function correctly, the code you provide above is feasible only for DiD when all treated units are treated at the same time? If you go down to the section about "Staggered timing" (also here https://github.com/pedrohcgs/CS_RR), I wonder how we can make what did2s returns feasible to put in the honest_did.AGGTEobj function? Or do you think that's not possible as the honest_did.AGGTEobj function is very specific to CSDID? Thank you again!!

kylebutts commented 1 year ago

If I understand correctly, CS_RR aggregates $ATT(g,t)$ to event-time estimates $ATT^\ell$ (hence the honest_did.AGGTEobj and

if (es$type != "dynamic") {
  stop("need to pass in an event study")
}

and then does as I suggested. I'd be open to including something similar in this package, but since HonestDiD is not on CRAN, I can not add it and still have did2s be easily downloadable. Roughly, this is what you need:

library(fixest)
library(data.table)
library(did2s)
library(HonestDiD)

data(df_hom)
twfe_results <- did2s(df_hom,
  yname = "dep_var", treatment = "treat", cluster_var = "state",
  first_stage = ~ 0 | unit + year,
  second_stage = ~ i(rel_year, ref = c(-1, Inf))
)

# Define parameters
e <- 0
type <- c("smoothness", "relative_magnitude")
method <- NULL
bound <- "deviation from parallel trends"
Mvec <- NULL
Mbarvec <- NULL
monotonicityDirection <- NULL
biasDirection <- NULL
alpha <- 0.05
parallel <- FALSE
gridPoints <- 10^3
grid.ub <- NA
grid.lb <- NA

# From https://github.com/pedrohcgs/CS_RR
# honest_did
event_time <- gsub("rel_year::", "", names(coef(twfe_results))) |>
  as.numeric() # get event-time
beta <- summary(twfe_results)$coefficients # save the coefficients
V <- summary(twfe_results)$cov.scaled # save the covariance matrix

# Remove the coefficient normalized to zero
referencePeriodIndex <- which(event_time == -1)
V <- V[-referencePeriodIndex, -referencePeriodIndex]
beta <- es$att.egt[-referencePeriodIndex]

nperiods <- nrow(V)
npre <- sum(1 * (event_time < 0))
npost <- nperiods - npre

baseVec1 <- basisVector(index = (e + 1), size = npost)

if (type == "relative_magnitude") {
  if (is.null(method)) method <- "C-LF"
  robust_ci <- createSensitivityResults_relativeMagnitudes(
    betahat = es$att.egt, sigma = V,
    numPrePeriods = npre,
    numPostPeriods = npost,
    bound = bound,
    method = method,
    l_vec = baseVec1,
    Mbarvec = Mbarvec,
    monotonicityDirection = monotonicityDirection,
    biasDirection = biasDirection,
    alpha = alpha,
    gridPoints = 100,
    grid.lb = -1,
    grid.ub = 1,
    parallel = parallel
  )
} else if (type == "smoothness") {
  robust_ci <- createSensitivityResults(
    betahat = es$att.egt,
    sigma = V,
    numPrePeriods = npre,
    numPostPeriods = npost,
    method = method,
    l_vec = baseVec1,
    monotonicityDirection = monotonicityDirection,
    biasDirection = biasDirection,
    alpha = alpha,
    parallel = parallel
  )
}
Randol340 commented 1 year ago

Hi Kyle, thank you so much for your help!!

I encountered an error in running this line of code: I wonder if you have any ideas about how I can fix it? > event_time <- gsub("relquarter::", "", names(coef(twfe_results1))) |> as.numeric() # get event-time Warning: NAs introduced by coercion And event_time is NA_real_. The panel data is at quarter level and relquarter is the relative quarter to treatment with never treated coded as 999.

Thank you again!

kylebutts commented 1 year ago

I would need more information. Could you show me your did2s call?

Randol340 commented 1 year ago

Thank you so much for your willingness to help!!

Here is the code I run for the did2s. dat is an unbalanced panel at locid quarter level.

dat = dat[!is.na(dat$treat_q),]
setDT(dat)
dat$relquarter[is.na(dat$relquarter)] = 999 
dat$quarter_added[is.na(dat$quarter_added)] = 0

# Gardner estimation 
twfe_results1 <- did2s(dat, 
               yname = "lntot_rec", first_stage = ~ 0 | locid + quarter, 
               second_stage = ~i(treat_q, ref=0), treatment = "treat_q", 
               cluster_var = "locid")

### Event Study w/ Gardner
es_sample1 <- did2s(dat, 
                yname = "lntot_rec", first_stage = ~ 0 | locid + quarter, 
                second_stage = ~i(relquarter, ref=c(-1, 999)), treatment = "treat_q", 
               cluster_var = "locid")

pdf(paste(sample,"_gardner.pdf", sep=""))

fixest::iplot(es_sample1, main = "", ylab="", xlab = "Quarters from add", col = "steelblue", ref.line = -1, xlim=c(-8,8),ylim=c(-0.1,0.15))
axis(side=1,at=(-8:8))

dev.off()
kylebutts commented 1 year ago

Hmmm, I'm not sure why that's erroring. What's the output of names(coef(twfe_results1))? All that gsub is doing is removing relquarter:: from the coefficient names and then it converts the strings to numbers.

Randol340 commented 1 year ago

Sorry my bad, twfe_results1 is the overall effect and I should use es_sample1 which is the event study. When I run it, I find the event_time variable don't have -1, which makes the next command referencePeriodIndex <- which(event_time == -1) return empty. relquarter = -1 seems to be omitted, I wonder if you have any suggestions on how to deal with this? Thank you very much!!

> event_time
 [1] -43 -42 -41 -40 -39 -38 -37 -36 -35 -34 -33 -32 -31 -30 -29 -28 -27 -26 -25 -24 -23 -22 -21
[24] -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2   0   1   2   3
[47]   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26
[70]  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41
> names(coef(es_sample1))
 [1] "relquarter::-43" "relquarter::-42" "relquarter::-41" "relquarter::-40" "relquarter::-39"
 [6] "relquarter::-38" "relquarter::-37" "relquarter::-36" "relquarter::-35" "relquarter::-34"
[11] "relquarter::-33" "relquarter::-32" "relquarter::-31" "relquarter::-30" "relquarter::-29"
[16] "relquarter::-28" "relquarter::-27" "relquarter::-26" "relquarter::-25" "relquarter::-24"
[21] "relquarter::-23" "relquarter::-22" "relquarter::-21" "relquarter::-20" "relquarter::-19"
[26] "relquarter::-18" "relquarter::-17" "relquarter::-16" "relquarter::-15" "relquarter::-14"
[31] "relquarter::-13" "relquarter::-12" "relquarter::-11" "relquarter::-10" "relquarter::-9" 
[36] "relquarter::-8"  "relquarter::-7"  "relquarter::-6"  "relquarter::-5"  "relquarter::-4" 
[41] "relquarter::-3"  "relquarter::-2"  "relquarter::0"   "relquarter::1"   "relquarter::2"  
[46] "relquarter::3"   "relquarter::4"   "relquarter::5"   "relquarter::6"   "relquarter::7"  
[51] "relquarter::8"   "relquarter::9"   "relquarter::10"  "relquarter::11"  "relquarter::12" 
[56] "relquarter::13"  "relquarter::14"  "relquarter::15"  "relquarter::16"  "relquarter::17" 
[61] "relquarter::18"  "relquarter::19"  "relquarter::20"  "relquarter::21"  "relquarter::22" 
[66] "relquarter::23"  "relquarter::24"  "relquarter::25"  "relquarter::26"  "relquarter::27" 
[71] "relquarter::28"  "relquarter::29"  "relquarter::30"  "relquarter::31"  "relquarter::32" 
[76] "relquarter::33"  "relquarter::34"  "relquarter::35"  "relquarter::36"  "relquarter::37" 
[81] "relquarter::38"  "relquarter::39"  "relquarter::40"  "relquarter::41" 
kylebutts commented 1 year ago

You don't need to drop -1 in the ref option. second_stage = ~i(relquarter, ref=c(999)) should do the trick

Randol340 commented 1 year ago

Thank you so much!

I encountered another problem when keep running the code: beta <- es_sample1$att.egt[-referencePeriodIndex] returns NULL as

> es_sample1$att.egt
NULL

I'm not sure if the att.egt is specific to the example using CSDID? Should I change it to beta <- es_sample1$coefficients[-referencePeriodIndex]?

And when I running the if statement when I choose type == "smoothness", I get the following error message:

} else if (type == "smoothness") {
  robust_ci <- createSensitivityResults(
    betahat = es_sample1$att.egt,
    sigma = V,
    numPrePeriods = npre,
    numPostPeriods = npost,
    method = method,
    l_vec = baseVec1,
    monotonicityDirection = monotonicityDirection,
    biasDirection = biasDirection,
    alpha = alpha,
    parallel = parallel
  )
}

Error in A_SD %*% prePeriod.coef : 
  requires numeric/complex matrix/vector arguments

if I change betahat = es_sample1$coefficients the error message shows Error in { : task 1 failed - "attempt to apply non-function"

Thank you again for your continuous help!!

kylebutts commented 1 year ago

Ooops, yeah you access the coefficients of a fixest object with coef(es_sample1). You need to drop the base-period, so coef(es_sample1)[-referencePeriodIndex]

Randol340 commented 1 year ago

Yes, I've changed this beta <- coef(es_sample1)[-referencePeriodIndex]. I think the problem now is the last step when I run this

robust_ci <- createSensitivityResults(
    betahat = beta,
    sigma = V,
    numPrePeriods = npre,
    numPostPeriods = npost,
    method = method,
    l_vec = baseVec1,
    monotonicityDirection = monotonicityDirection,
    biasDirection = biasDirection,
    alpha = alpha,
    parallel = parallel
  )

I got an error message Error in { : task 1 failed - "attempt to apply non-function". I wonder if you have any ideas? Thank you very much!

image
kylebutts commented 1 year ago

I'm not sure. I'm not the author of that package and don't know how it works. Sorry!

Randol340 commented 1 year ago

Got it, no worries! Thank you for your help throughout!!!