bcallaway11 / did

Difference in Differences with Multiple Periods, website: https://bcallaway11.github.io/did

Spatial Spillover - Within Indicator Control #79

Closed. apodges closed this issue 3 years ago

apodges commented 3 years ago

The estimation of my effects is affected by spatial spillover. I read a recent article by Kyle Butts that focuses on how to address spatial spillover in the context of the traditional DID setup. In a nutshell, this involves first defining the spatial influence of treated units using a buffer or ring. A 'within indicator' can then be created to capture the position of each unit relative to these treated buffers.

For the sake of simplicity, the within indicator can take one of two forms: "For non-additive spillovers, I recommend a set of mutually-exclusive indicators which measures which distance bin that the closest treated unit falls within for a given observation. For additive spillovers, I recommend using the number of treated units within each distance bin for a given observation" (Butts, 2021, p. 3).
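The two forms quoted above can be sketched in a few lines. This is only an illustration in plain Python, not code from the paper or from any package: the ring edges, the coordinates, and the function names `bin_index` and `within_indicators` are all hypothetical, and distances are plain Euclidean distances.

```python
from math import hypot

# Hypothetical ring edges, in the same units as the coordinates:
# bins are [0, 10), [10, 20) and [20, 30).
BIN_EDGES = [0.0, 10.0, 20.0, 30.0]


def bin_index(distance, edges=BIN_EDGES):
    """Return the index of the bin containing `distance`, or None if it lies beyond the last ring."""
    for k in range(len(edges) - 1):
        if edges[k] <= distance < edges[k + 1]:
            return k
    return None


def within_indicators(obs_xy, treated_xy, edges=BIN_EDGES):
    """Build both forms of the within indicator for one observation.

    nearest: mutually exclusive 0/1 dummies marking the bin of the closest
             treated unit (non-additive spillovers).
    counts:  number of treated units falling in each bin (additive spillovers).
    """
    n_bins = len(edges) - 1
    distances = [hypot(obs_xy[0] - x, obs_xy[1] - y) for x, y in treated_xy]

    counts = [0] * n_bins
    for d in distances:
        k = bin_index(d, edges)
        if k is not None:
            counts[k] += 1

    nearest = [0] * n_bins
    if distances:
        k = bin_index(min(distances), edges)
        if k is not None:
            nearest[k] = 1
    return nearest, counts


# One observation at the origin and four units that are treated in this period.
treated = [(3.0, 4.0), (0.0, 7.0), (9.0, 12.0), (0.0, 25.0)]
nearest, counts = within_indicators((0.0, 0.0), treated)
print(nearest)  # [1, 0, 0]
print(counts)   # [2, 1, 1]
```

With staggered adoption the set of treated units changes from period to period, so both vectors would have to be recomputed for every unit-period observation.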

In the latter part of the article, he digs into an example of how to address spatial spillover in the context of staggered treatment adoption. Specifically, he discusses how to adapt Callaway and Sant'Anna's approach: "Estimation and different aggregation of the direct effects can be done using the ‘did’ package produced by C&S where the ‘Within’ indicator is included as a covariate" (Butts, 2021, p. 29).

I am wondering whether it is truly that simple. Judging from previous posts, there has been confusion about how controls, specifically time-varying controls, are actually handled by the package. If an additive within indicator is included, the 'did' package would take its value from before the treatment takes place. It is not clear to me how this would address spatial spillover. In addition, I get the following message when I run a dr model with an additive within indicator using "notyettreated" units as controls: "Not enough control units."
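The concern about pre-treatment values can be made concrete with a toy example. The sketch below does not call the 'did' package; it assumes, as described in the comment above, that a time-varying covariate is read once in the period just before a unit's first treatment period g. The counts and the helper `base_period_value` are hypothetical.

```python
# Hypothetical additive within counts for one unit, by period
# (number of treated neighbours inside a ring).
within_count = {2018: 0, 2019: 0, 2020: 2, 2021: 3}


def base_period_value(covariate_by_period, group):
    """Value of a time-varying covariate in the period just before first treatment (g - 1)."""
    return covariate_by_period[group - 1]


# For a unit first treated in 2020, only the 2019 value would be used.
print(base_period_value(within_count, 2020))  # 0
```

Under that assumption the later exposure values (2 and 3 here) never enter the estimation, which is why including the indicator as a covariate may not adjust for spillover that begins after the base period.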

Any insight on how to address spatial spillover using the 'did' package would be very much appreciated! As a workaround, I have isolated control units that are not affected by treated units, but this would only solve part of the problem since treated units can still affect one another. Perhaps this is the best that can be done using the 'did' package.

bcallaway11 commented 3 years ago

Yes, I know that paper, but let me make sure I understand something --- does "additive" spillover mean that the actual number of "bordering" units that are treated matters, while "non-additive" just means that only the closest treated unit matters --- is that right?

I think you are probably right that this seems tricky when treatment timing varies. Let me see if I can tag Kyle (@kylebutts) and see if he has any thoughts/suggestions for us.

kylebutts commented 3 years ago

@bcallaway11, you are correct on additive vs. non-additive spillovers.

"As a workaround, I have isolated control units that are not affected by treated units"

This certainly works, but it is quite inefficient, since the units affected by spillover can still help estimate the counterfactual trend if they are not yet affected for that ATT(g,t). Admittedly, when I wrote that section, I didn't think too much about the practical ability to use did to estimate it. I still think what I wrote is correct (they are good controls until they start experiencing spillover effects), but I don't know if did is the correct way to estimate it.

Sorry to plug my own package @bcallaway11 :-), but I think using two-stage difference-in-differences might work better (though it loses the doubly robust properties). I'm going to have a new draft of the paper in a few weeks that discusses this, but you could use a modified version of this method:

  1. Estimate time and unit fixed effects using only the unit-period observations (i, t) that are untreated and not experiencing spillovers.

  2. Include the spillover variables along with the treatment variable in the second stage.
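The two steps above can be sketched on a tiny simulated panel. This is a hand-rolled illustration in plain Python, not the did2s implementation: the data, the effect sizes (2.0 for treatment, 0.5 for spillover), and the functions `first_stage` and `second_stage` are all made up for the example. It returns point estimates only and does not correct the second-stage standard errors for the fact that the first stage is estimated.

```python
# Noise-free simulated panel: y = unit effect + period effect + 2.0*treat + 0.5*spill.
UNIT_FE = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
TIME_FE = [0.0, 0.5, 1.0, 1.5]
TREAT_START = {0: 2, 1: 3}  # unit -> first treated period
SPILL_START = {2: 2}        # unit -> first period exposed to spillover

rows = []
for i, a_i in enumerate(UNIT_FE):
    for t, b_t in enumerate(TIME_FE):
        treat = int(i in TREAT_START and t >= TREAT_START[i])
        spill = int(i in SPILL_START and t >= SPILL_START[i])
        rows.append({"i": i, "t": t, "treat": treat, "spill": spill,
                     "y": a_i + b_t + 2.0 * treat + 0.5 * spill})


def first_stage(rows, n_iter=500):
    """Step 1: fit unit and period effects on clean rows only.

    Clean rows are untreated and spillover-free. Every unit and every period
    needs at least one clean row, otherwise its effect cannot be estimated.
    """
    clean = [r for r in rows if r["treat"] == 0 and r["spill"] == 0]
    a = {r["i"]: 0.0 for r in rows}
    b = {r["t"]: 0.0 for r in rows}
    for _ in range(n_iter):  # alternate between unit means and period means
        for i in a:
            vals = [r["y"] - b[r["t"]] for r in clean if r["i"] == i]
            a[i] = sum(vals) / len(vals)
        for t in b:
            vals = [r["y"] - a[r["i"]] for r in clean if r["t"] == t]
            b[t] = sum(vals) / len(vals)
    return a, b


def second_stage(rows, a, b):
    """Step 2: OLS without intercept of the residualized outcome on treat and spill."""
    stt = sts = sss = sty = ssy = 0.0
    for r in rows:
        resid = r["y"] - a[r["i"]] - b[r["t"]]
        stt += r["treat"] * r["treat"]
        sts += r["treat"] * r["spill"]
        sss += r["spill"] * r["spill"]
        sty += r["treat"] * resid
        ssy += r["spill"] * resid
    det = stt * sss - sts * sts  # solve the 2x2 normal equations directly
    return (sss * sty - sts * ssy) / det, (stt * ssy - sts * sty) / det


a_hat, b_hat = first_stage(rows)
treat_effect, spill_effect = second_stage(rows, a_hat, b_hat)
print(round(treat_effect, 6), round(spill_effect, 6))  # approximately 2.0 and 0.5
```

Because the simulated data contain no noise, the second stage recovers the two effects used to generate them. With real data the point estimates would be noisy, and the naive second-stage standard errors would not be valid.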

"... but this would only solve part of the problem since treated units can still affect one another. Perhaps this is the best that can be done using the 'did' package."

In the new draft, I am a bit more careful on defining treatment effects of interest:

[image: definitions of the treatment effects of interest from the new draft]

So what you are estimating using the did package would be akin to the "total effect". I think that's still a policy-relevant parameter, but it is slightly different from the direct effect.

bcallaway11 commented 3 years ago

Thanks @kylebutts. I think you are right here. You and John are doing good work; it seems there is a lot to like about these kinds of imputation estimators. I’ll keep this open for a few days in case @apodges wants to follow up on any details.

apodges commented 3 years ago

@bcallaway11 thank you for reaching out to @kylebutts. I apologize for my delayed response. I wasn't familiar with Gardner's (2021) approach, so had to do some reading to catch up. I also don't have a background in econometrics, so it takes me a while to get through and comprehend such manuscripts.

@kylebutts the modified version of Gardner's (2021) approach that you propose makes sense to me. I could be overlooking something, but in reviewing the R and Stata packages it doesn't seem possible to amend the second stage to include time-varying controls, such as an additive within indicator. Am I off base here? I could manually estimate the first and second stages, as shown here by Scott Cunningham. I realize this would result in biased standard errors for the second-stage estimates. Unfortunately, I am largely bound to relying on packages given my limited background in econometrics; I wouldn't know where to begin to adjust the standard errors by hand.

I realize this discussion is going beyond Callaway and Sant'Anna's approach. If necessary, I can make a separate post on the did2s GitHub page.

Again, many thanks for generating such a great discussion. @kylebutts I look forward to reading the new draft of your manuscript.

kylebutts commented 3 years ago

For the second stage, you can just do

second_stage = ~ spill + treat

where spill is/are the spillover variable(s)

apodges commented 3 years ago

Thanks, @kylebutts! I will give it a try and will carry over any further related issues/discussions to the did2s page.