bcallaway11 / did

Difference in Differences with Multiple Periods, website: https://bcallaway11.github.io/did

Continuous Treatment? #195

Open lorenzoffner opened 2 months ago

lorenzoffner commented 2 months ago

Hi!

Thanks for the awesome package!

I am working on a research project in which I am trying to assess the impact of Renewable Portfolio Standards (RPS) on green innovation (measured with patent data) in US states. In my case, the treatments are not only staggered but also continuous, so the treatment intensity varies for pretty much every unit. For example, California has an RPS of roughly 40% in 2023, whereas states like Ohio require only 6.5% of their energy to be renewable. Still, given how the did package expects the data to be structured, both states simply count as treated.

I have already seen that you and Goodman-Bacon are working on a paper dealing with this precise issue (10.3386/w32117). I have also done a fair bit of browsing online, but have not yet found an R package that can deal with a continuous and staggered treatment. I have therefore resorted to using the did package and have the following questions:

  1. Are you aware of any package that could work with a staggered and continuous treatment? Is there any chance that this function will be added some day to the did package?

  2. Would it make sense to include my continuous treatment variable as a covariate? Or is there any other way in the did package to control for the varying treatment intensities?

  3. Finally, in my case, RPS targets extend far into the future (e.g. to 2050) and consequently already affect the treated units today. Do you have ideas on how to capture this as well? My only idea (again) would have been to somehow encode these values as covariates.

I hope I have made my issue clear enough. Any help would be greatly appreciated!

Best, Lorenz

bcallaway11 commented 2 months ago

Hello @lorenzoffner,

  1. I am not aware of any package with a continuous and staggered treatment. We are working on putting out some code on this front.
  2. In the meantime, I'd recommend doing something slightly different from what you mention here. My suggestion is to create subgroups based on the treatment intensity and run the existing code in did separately for each of those subgroups.
  3. I'm not sure that I understand this one. Are these anticipation effects?
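
The subgroup suggestion in item 2 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the package documentation: the panel is simulated, the column names (`state`, `year`, `patents`, `first_treat`, `rps_share`, `intensity_bin`) and the 20% cutoff between "low" and "high" intensity are hypothetical, and keeping the never-treated states as the comparison group in every subset is one reasonable choice rather than a prescribed one. The `did` calls are guarded so the data preparation runs even when the package is not installed.

```r
# Simulated state-year panel; every name and number here is an illustrative assumption.
set.seed(1)
panel <- expand.grid(state = 1:120, year = 2000:2010)

# Year the RPS takes effect per state (0 = never treated, the coding did expects)
first_treat <- c(rep(0, 40), rep(c(2003, 2005), each = 40))
# Eventual RPS requirement in percent (0 for never-treated states)
rps_share <- c(rep(0, 40), rep(c(10, 30), times = 40))

panel$first_treat <- first_treat[panel$state]
panel$rps_share   <- rps_share[panel$state]

# Outcome: common trend plus an effect that grows with intensity after treatment
post <- panel$first_treat > 0 & panel$year >= panel$first_treat
panel$patents <- 0.2 * (panel$year - 2000) +
  0.05 * panel$rps_share * post + rnorm(nrow(panel))

# Bin treated states by intensity; the 20% cutoff is arbitrary
panel$intensity_bin <- ifelse(panel$first_treat == 0, "never",
                              ifelse(panel$rps_share <= 20, "low", "high"))

# One subset per bin: that bin's treated states plus all never-treated states
subsets <- lapply(c(low = "low", high = "high"), function(b) {
  panel[panel$intensity_bin %in% c("never", b), ]
})

# Run the existing did code separately on each subset
if (requireNamespace("did", quietly = TRUE)) {
  results <- lapply(subsets, function(d) {
    did::att_gt(yname = "patents", tname = "year", idname = "state",
                gname = "first_treat", data = d,
                control_group = "nevertreated")
  })
  for (b in names(results)) {
    cat("Intensity bin:", b, "\n")
    summary(did::aggte(results[[b]], type = "simple"))
  }
}
```

Each subset compares one intensity bin against the same never-treated states, so the two sets of estimates can be read side by side. With real data the bins would need enough treated units in every adoption cohort for the group-time estimates to be informative.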

Hope this helps, Brant

lorenzoffner commented 2 months ago

Hi @bcallaway11!

Thanks for your swift response, I deeply appreciate it!

  1. Thanks for the clarification! Do you happen to know a rough timeline or an expected release date maybe?
  2. That's a nice idea, thank you! I also figured out that including my continuous treatment variable as a covariate produces an error. I assume this violates the parallel trends assumption, as the continuous treatment only kicks in as the treatment dummy turns to 1. Am I correct?
  3. Yes, these are basically anticipation effects, but they work a bit differently. All treated units in my period of interest (until 2020) also have an RPS goal that is (in most cases) increasing until 2050. That means that all units treated by 2020 will also be treated in 2050, but at a higher intensity (the continuous treatment). Consider California, where the current requirement sits at 40% and the 2050 requirement will be 100%. What I would be interested in is finding out whether units with a certain level of future treatment intensity have different outcomes, and whether more stringent legislation might lead to higher innovation (my outcome variable). Generally, I guess I am having a similar problem as in 2), just with the added complexity of looking 25 years into the future. I thought about including the future treatment variable as a lagged covariate, but this would probably cause similar issues as in 2), right?
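
For the plain anticipation case raised in item 3 (units responding a fixed number of periods before the policy takes effect), `att_gt` has an `anticipation` argument. The sketch below is only an illustration under assumptions: the data are simulated, the column names and the one-year anticipation window are hypothetical, and it does not address the harder question of long-run targets such as 2050 that differ in intensity across units.

```r
# Simulated panel in which treated states respond one year before adoption.
# All names and values are illustrative assumptions.
set.seed(2)
panel <- expand.grid(state = 1:60, year = 2000:2010)
panel$first_treat <- ifelse(panel$state <= 30, 0, 2005)  # 0 = never treated

anticipation_years <- 1

# Treated states start responding anticipation_years before first_treat
responds <- panel$first_treat > 0 &
  panel$year >= panel$first_treat - anticipation_years
panel$patents <- 0.2 * (panel$year - 2000) + 1.5 * responds + rnorm(nrow(panel))

# With anticipation, the last period assumed unaffected for the 2005 cohort
# moves back from 2004 to 2003
base_period <- 2005 - anticipation_years - 1

if (requireNamespace("did", quietly = TRUE)) {
  out <- did::att_gt(yname = "patents", tname = "year", idname = "state",
                     gname = "first_treat", data = panel,
                     control_group = "nevertreated",
                     anticipation = anticipation_years)
  summary(did::aggte(out, type = "dynamic"))
}
```

Setting `anticipation` too low would fold the early response into the comparison period, so the choice of window is an identifying assumption rather than something the data reveal on their own.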

I hope I could explain my issue more thoroughly now. Just let me know if you need further elaboration from my side. Thank you very much, Lorenz

simonschoe commented 2 months ago

> In the meantime, I'd recommend doing something slightly different from what you mention here. My suggestion is to create subgroups based on the treatment intensity and run the existing code in did separately for each of those subgroups.

Hi @bcallaway11, do you happen to know of a vignette or example that illustrates how this would work using did (in particular, how to partition the data and which groups to compare)?