JuliaDiffinDiffs / DiffinDiffs.jl

A suite of Julia packages for difference-in-differences
MIT License

agg(did_res,:rel) BoundsError #29

Open eohne opened 1 month ago

eohne commented 1 month ago

Hi, cool package! I just ran into an issue when trying to use the agg() function. The result of the @did call (using dynamic treatment) has 6965 treatweights and treatcounts but only 6942 coefficients. I'm not sure how that happened, but it leads to a BoundsError when trying to aggregate.
Any suggestions?

Thanks in advance

Edit: I set the cohort variable for the never treated group to the last t in my sample (same as in your example) and the data contains no missing values when passed to the @did function.

junyuan-chen commented 1 month ago

Somehow there is some collinearity going on with the dummy variables and some of them are dropped when doing the OLS. Did you include additional regressors?

eohne commented 1 month ago

Thanks for the rapid reply.

I tried both with and without additional regressors. With additional regressors, only 4 items are missing from the coef vector, rather than the 23 missing without them. I find this a bit strange: I would have thought more (or the same number of) coefficient estimates would be missing if the issue were collinearity.

I would compare against the R fixest package, but I run out of memory there. I will do some further testing and report back. I just thought you may have encountered this before.

junyuan-chen commented 1 month ago

It still sounds to me like something about collinearity. The way the collinear columns are dropped may not be so "exact" if there is any collinearity going on.

Some of the columns can be dropped in the following lines:

https://github.com/JuliaDiffinDiffs/DiffinDiffs.jl/blob/86288bb4e65e4ea6d2b765116c46c608b040c44f/lib/InteractionWeightedDIDs/src/procedures.jl#L469-L474

I would suggest inspecting exactly which columns are dropped and then taking a look at the sample to get a better idea of how this can happen. There is a keepall=true option for @did that could be helpful for preserving basiscols. You would add a bracket like this: @did [keepall = true] ....
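
To illustrate the general idea of finding which dummy columns are collinear, here is a minimal sketch in plain Julia. It is not the procedure used in the lines linked above; the function name collinear_columns, the tolerance value, and the toy matrix are all hypothetical and exist only for illustration. A column is flagged when adding it to the columns kept so far does not raise the numerical rank. Note that the result depends on the tolerance, which is one reason such a check may not be "exact".

```julia
using LinearAlgebra

# Hypothetical helper (not part of DiffinDiffs.jl): return the indices of
# columns that are numerically linear combinations of the columns before them.
function collinear_columns(X::AbstractMatrix; rtol::Real=1e-8)
    kept = Int[]
    dropped = Int[]
    for j in axes(X, 2)
        # Rank of the kept columns plus the candidate column j
        candidate = X[:, vcat(kept, j)]
        if rank(candidate; rtol=rtol) > length(kept)
            push!(kept, j)      # column j adds new information
        else
            push!(dropped, j)   # column j is (numerically) redundant
        end
    end
    return dropped
end

# Toy design matrix: three group dummies that sum to the intercept column
intercept = ones(6)
d1 = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
d2 = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
d3 = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
X = hcat(intercept, d1, d2, d3)
colnames = [:intercept, :d1, :d2, :d3]

dropped = collinear_columns(X)
println("Redundant columns: ", colnames[dropped])
```

This recomputes a rank for every column, so it is only practical on a small subsample; on a design with thousands of dummies one would inspect the columns reported as dropped by the package instead.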