adebiasi21 closed 3 years ago
This is a tricky question, and I think the answer depends somewhat on the viewpoint you take:
Mechanically, the answer is yes. You can just "ignore" that the outcome is a count variable and run the code in the did package. If you interpret parallel trends (the main assumption underlying DID) as a reduced-form assumption that might hold in the data (and which you can partially check in pre-treatment periods), then this would be reasonable. My sense is that this is very commonly done in applied work, both with count data and with other limited outcomes; the same sorts of issues would come up if you had, say, a binary outcome.
That being said, it is going to be very hard to write down a model for potential outcomes, such as a Poisson regression with fixed effects, that leads to parallel trends holding. I'm sensitive to this point (perhaps more than most), but I think that carefully checking pre-trends and otherwise ignoring issues related to nonlinear models is very common.
One last comment: I don't think there is going to be a "fix" for this, whether you use TWFE, our approach, or some other approach. This is a more general issue of fixed effects in nonlinear models.
Brant
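To make the point about nonlinear models concrete, here is a minimal sketch in Python. It assumes a hypothetical Poisson-style model in which the mean untreated count is exp(alpha + lambda); the parameter values and names are made up for illustration and are not taken from the did package or from any real data. Under that assumption, the two groups share the same proportional change over time, but their changes in levels differ, so parallel trends in levels does not hold.

```python
import math


def untreated_mean(alpha, lam):
    """Mean untreated count under the assumed model E[Y(0)] = exp(alpha + lam)."""
    return math.exp(alpha + lam)


# Hypothetical parameter values, chosen only for illustration.
alpha_treated, alpha_control = 1.0, 0.0  # group-level "fixed effects"
lam_pre, lam_post = 0.0, 0.5             # time effects shared by both groups

# Change in mean untreated outcomes in levels (post minus pre) for each group.
trend_treated = untreated_mean(alpha_treated, lam_post) - untreated_mean(alpha_treated, lam_pre)
trend_control = untreated_mean(alpha_control, lam_post) - untreated_mean(alpha_control, lam_pre)

# The same change expressed as a ratio (post divided by pre).
ratio_treated = untreated_mean(alpha_treated, lam_post) / untreated_mean(alpha_treated, lam_pre)
ratio_control = untreated_mean(alpha_control, lam_post) / untreated_mean(alpha_control, lam_pre)

print("change in levels, treated vs. control:", trend_treated, trend_control)
print("ratio post/pre, treated vs. control:  ", ratio_treated, ratio_control)
```

In this sketch the ratios agree while the level changes do not, which is one way to see why a multiplicative model for counts and parallel trends in levels are hard to reconcile unless the groups start from the same baseline.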
I greatly appreciate your thorough response, Brant!
I have one more related question, which I am ashamed to say is rather rudimentary.
In this context, how should treatment effects be interpreted for count outcomes? In the employment example provided by the authors, treatment effects are reported in terms of a percentage decrease/increase. I suspect this may have something to do with their logged outcome. Or, am I off base here?
I think you should just interpret it as how much outcomes increase on average for the treated group due to participating in the treatment. So I don't think there is anything different about interpreting the results for a count outcome relative to a continuous outcome. I think the bigger issue is the one we were talking about previously.
You are also right about the interpretation in our application being due to the logged outcome.
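As a rough illustration of the two interpretations, the sketch below computes a simple two-group, two-period DID by hand, once on an outcome in levels and once on a logged outcome. All numbers are hypothetical group-period means invented for this example, and the helper `did_2x2` is not part of the did package. The levels estimate reads as a change in the count itself, while the log estimate reads approximately as a proportional change.

```python
import math


def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period DID: change for treated minus change for control."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical group-period means of a count outcome (in levels).
att_levels = did_2x2(treated_pre=10.0, treated_post=9.0,
                     control_pre=8.0, control_post=8.5)

# Hypothetical group-period means of a logged outcome (a separate example).
att_logs = did_2x2(treated_pre=2.30, treated_post=2.20,
                   control_pre=2.08, control_post=2.13)

# Convert the log-point effect into an approximate percentage change.
pct_change = 100.0 * (math.exp(att_logs) - 1.0)

print("ATT in levels (units of the count):", att_levels)
print("ATT in logs (log points):", att_logs)
print("approximate percent change:", pct_change)
```

Note that a logged outcome is undefined when the count is zero, so the log version is only a sketch of how a percentage interpretation arises, not a recommendation for count data.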
Does the did package accommodate count outcomes? If so, how is this achieved? Or is this question moot? In the TWFE framework (which I am trying to get away from), I have used negative binomial or Poisson regression to model count outcomes.