chaisemartinPackages / did_multiplegt_dyn

|| Stata | R || Estimation of event-study Difference-in-Difference (DID) estimators in designs with multiple groups and periods, and with a potentially non-binary treatment that may increase or decrease multiple times.

Inquiry Regarding Control Variables Specification in did_multiplegt_dyn Command #43

Status: Closed (closed by LuJiangYun 3 weeks ago)

LuJiangYun commented 1 month ago

I am currently utilizing the did_multiplegt_dyn command you developed for a difference-in-differences analysis and have encountered a technical issue regarding how to handle the interaction effects of time-invariant variables with time dummies.

Specifically, I am analyzing the impact of certain policies and am attempting to control for the initial values of specific city indicators, such as per capita GDP and city size in the first year of my data period, within the controls() option. However, whether I include initial-value controls, initial values × year dummies, or variables that change yearly (such as annual per capita GDP), the estimates remain identical to those obtained without controls(). The results only differ when I generate $X_{g,t} = X_{\text{initial}} \times T$, that is, when I control for time-invariant controls × year.

I noted that the command documentation mentions the possibility of controlling for the interaction of time-invariant control variables with time trends, but it does not specify how to handle interactions with year dummies. Therefore, I kindly ask if you could help clarify a few points:

(1) Does the did_multiplegt_dyn command support including interaction terms between control variables and time dummies in the model?

(2) If so, how should I specify these interactions to ensure they are correctly recognized and used by the model?

(3) If not, do you have any recommended methods or alternative commands to address this limitation?

ElenaBarham15 commented 3 weeks ago

I would like to add to this question whether it is possible to get the R version of the did_multiplegt_dyn output to give coefficient estimates for controls as well as the treatment variable?

chaisemartinPackages commented 3 weeks ago

Dear all, This is Diego from Clément de Chaisemartin's RA Team. Thanks for your interest in did_multiplegt_dyn!

Let me start with the latest inquiry. The formal description of how did_multiplegt_dyn accounts for covariates is in Section 1.2 of the Web Appendix of de Chaisemartin and D'Haultfoeuille (2024). In short, the long difference of the outcome $Y_{g,t}$ is reduced by the corresponding long difference of the covariates $X_{g,t}$, multiplied by the coefficient on $\Delta X_{g,t}$ from an OLS regression of $\Delta Y_{g,t}$ on $\Delta X_{g,t}$ and time fixed effects, estimated on the sample of not-yet-treated groups with the same status quo treatment. These coefficients only serve to adjust the evolution of the outcome under Assumption 11 (parallel trends with covariates). As a result, the program does not report them among the results.
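
As a rough sketch of the adjustment described above (not a substitute for the formal statement in the Web Appendix), the covariate-adjusted long difference can be written as follows. The symbols $F_g$ (the first period in which group $g$'s treatment changes) and $d$ (the status quo treatment) are notation assumed here for illustration:

$$
Y_{g,t} - Y_{g,F_g-1} - \left(X_{g,t} - X_{g,F_g-1}\right)' \hat{\theta}_d
$$

where $\hat{\theta}_d$ is the coefficient on $\Delta X_{g,t}$ from the OLS regression of $\Delta Y_{g,t}$ on $\Delta X_{g,t}$ and time fixed effects, run on not-yet-treated groups with status quo treatment $d$. Since $\hat{\theta}_d$ only enters through this adjustment, it is a nuisance parameter rather than a reported estimate.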

As for the first inquiry, the way controls can be specified in did_multiplegt_dyn is described in Section 2.2 of the companion paper. In short, controlling for both time-varying and time-invariant covariates is supported. In the latter case, you should make the covariates time-varying, either by (a) controlling for the linear time trend $tX_g$ or (b) controlling for the interactions of the time-invariant covariates with $1\lbrace t \geq t'\rbrace$ for $t' \in \lbrace 2, ..., T\rbrace$. Note that solution (b) does not require the interaction with $1\lbrace t \geq 1\rbrace$: its first difference is always 0, so it would be dropped from the residualization.
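
To see why a time-invariant covariate $X_g$ has to be transformed, it may help to write out the first differences that enter the residualization. This is a short derivation using only the notation above, for $t \geq 2$:

$$
\begin{aligned}
\Delta X_g &= X_g - X_g = 0, \\
\Delta (t X_g) &= t X_g - (t-1) X_g = X_g, \\
\Delta \left(X_g 1\lbrace t \geq t'\rbrace\right) &= X_g \left(1\lbrace t \geq t'\rbrace - 1\lbrace t-1 \geq t'\rbrace\right) = X_g 1\lbrace t = t'\rbrace .
\end{aligned}
$$

The untransformed covariate has no variation in first differences, so it cannot affect the estimates. Under (a), $X_g$ enters every first difference with a single coefficient. Under (b), it enters with a separate coefficient for each period $t'$.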

The code below implements both methods with randomly generated data:

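```stata
* Simulate a balanced panel: 100 groups (G) observed over 10 periods (T)
clear
set seed 0
set obs 1000
gen G = mod(_n-1, 100) + 1
bys G: gen T = _n
sort G T
* D is 0 before period 5 and random (on/off) afterwards
gen D = runiform() > 0.5 & T >= 5
gen Y = runiform() * (1 + D)

* Time-invariant covariate X: constant within each group
gen X_temp = runiform()
bys G: egen X = total(X_temp)
drop X_temp

* Method (b): interact X with 1{T >= j} for j = 2, ..., max(T)
sum T
forv j = 2/`=r(max)' {
    gen T_dummy`j' = T >= `j'
}
foreach v of varlist T_dummy* {
    replace `v' = `v' * X
}
* Method (a): linear time trend T * X
gen T_trend = X * T

* No controls, then method (a), then method (b)
did_multiplegt_dyn Y G T D, graph_off
did_multiplegt_dyn Y G T D, controls(T_trend) graph_off
did_multiplegt_dyn Y G T D, controls(T_dummy*) graph_off
```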
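
As an optional sanity check on the simulated data above, the sketch below computes the within-group first differences of the raw covariate and of the two transformed versions. The variable names dX, dT_trend and dT_dummy3 are hypothetical and introduced only for this illustration; the check assumes the variables X, T_trend and T_dummy3 created above are still in memory.

```stata
* Hypothetical check: first differences within group, ordered by period
sort G T
by G: gen dX        = X - X[_n-1]                // raw time-invariant covariate
by G: gen dT_trend  = T_trend - T_trend[_n-1]    // method (a): X * T
by G: gen dT_dummy3 = T_dummy3 - T_dummy3[_n-1]  // method (b): X * 1{T >= 3}

* The raw covariate has no variation in first differences
assert dX == 0 if T > 1
* The trend's first difference equals X (tolerance for float rounding)
assert abs(dT_trend - X) < 1e-3 if T > 1
* The dummy interaction's first difference equals X in period 3 only
assert dT_dummy3 == cond(T == 3, X, 0) if T > 1

sum dX dT_trend dT_dummy3 if T > 1
```

If the asserts pass silently, the raw X contributes nothing in first differences, which is consistent with estimates that do not change when only untransformed time-invariant controls are supplied.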

I hope this helps! Best, Diego

lbiagini75 commented 6 days ago

Is the same option available in R?

chaisemartinPackages commented 3 days ago

Hi Luigi, Yes, all the options available in the Stata version can also be used in the R version. You can load the DIDmultiplegtDYN library and run "?did_multiplegt_dyn" to check the appropriate syntax. Best, Diego