bcallaway11 / did

Difference in Differences with Multiple Periods, website: https://bcallaway11.github.io/did

ATTs with not-yet-treated controls and anticipation #144

Open vstegmaier opened 1 year ago

vstegmaier commented 1 year ago

Hi, I am using version 2.1.2 of did to estimate group-time ATTs with a control group of only "not yet treated" units and one period of anticipation. I have 7 periods and 7 groups that are all eventually treated. When I drop all never-treated units, I get NA values for all groups in the fifth period, and the coefficients differ from those I calculated manually. I also get a warning that no control units are available for periods 4 and 5 (although the package does estimate the ATT for period 4).

data.did.notyet <- subset(data.did.all, group != 0)

If I also manually drop the group first treated in period 1 (or the groups first treated in periods 1 and 2), the coefficients are "correct" (as far as I understand) and the estimates for the fifth period are computed without any warnings.

data.did.notyet <- subset(data.did.all, group > 1)
or
data.did.notyet <- subset(data.did.all, group > 2)

What is the difference between subsetting on group > 0 and on group > 1 (or group > 2)? Since groups 1 and 2 are not used in the estimation, the results should be the same as when only group 0 is omitted.

Code:
green_attgt_did <- att_gt(
  yname = "Y",
  tname = "period",
  idname = "ID",
  gname = "group",
  control_group = "notyettreated",
  data = data.did.notyet,
  allow_unbalanced_panel = TRUE,
  anticipation = 1,
  panel = TRUE,
  alp = 0.05
)
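For reference, here is a minimal sketch of how I count the comparison cohorts I would expect for each group-time cell. It assumes that groups are coded 1 to 7 by their first treatment period, that periods run from 1 to 7, and that a cohort counts as "not yet treated" at period t only if its first treatment period is later than t + anticipation (my reading of the not-yet-treated rule, which may not match what the package does internally). The helper `count_notyet_controls` is hypothetical and not part of did.

```r
# Hypothetical helper (NOT part of the did package).
# For every (group, period) cell, count the other cohorts whose first
# treatment period is strictly later than period + anticipation.
# Assumption: the group code equals the first treatment period.
count_notyet_controls <- function(groups, periods, anticipation = 0) {
  cells <- expand.grid(group = groups, period = periods)
  cells$n_control_groups <- mapply(
    function(g, t) sum(groups > t + anticipation & groups != g),
    cells$group, cells$period
  )
  cells
}

# Assumed coding: 7 eventually treated cohorts, 7 periods, no never-treated units
tab <- count_notyet_controls(groups = 1:7, periods = 1:7, anticipation = 1)

# Cells for which this rule leaves no comparison cohort at all
tab[tab$n_control_groups == 0, ]
```

Comparing the cells flagged here with the periods named in the package warning may help narrow down where the package and my manual calculation diverge.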

Thank you for your help, Vincent