naoki-egami / DIDdesign

R package DIDdesign: Analyzing Difference-in-Differences Design
GNU General Public License v2.0

Error in thres option #25

Closed by Ales-G 3 years ago

Ales-G commented 3 years ago

Dear Prof. Naoki, First of all, thank you for the excellent work! This is really an amazing package.

I am writing to you because I am trying to run the did_check with a staggered adoption design but I am getting an error.

My code is the following

check_sa <- did_check(
  formula = y ~ x,
  data    = df,
  id_unit = "id",
  id_time = "month",
  design  = "sa",
  option  = list(n_boot = 200, parallel = TRUE,  thres = 1, lag = 1:3)
)

# Error in if (sum(n_treated[n_treated >= thres]) == nrow(Gmat)) { :
#   missing value where TRUE/FALSE needed

Do you know what might be causing this issue and how I could address it?

I fear it may relate to the fact that my data is highly unbalanced. The reason is simple. I have monthly data and my units were observed and treated in different months. Could this be the problem? Do you have suggestions on how I could approach the issue?

Thanks a lot in advance for your help

best

sou412 commented 3 years ago

@Ales-G Thank you for reporting an issue! We're currently working on an extension to handle an unbalanced panel in the SA design, but the current version can only handle a balanced panel.

So, could you try with a modified dataset that only contains units that are observed for the entire period?
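
One way to build such a dataset is sketched below in base R. This is only an illustration, not part of DIDdesign: `keep_balanced_units()` is a hypothetical helper, the toy `df` is made up, and it assumes "observed for the entire period" means a unit has a row for every distinct value of the time variable.

```r
# Hypothetical helper (not part of DIDdesign): keep only the units that are
# observed in every time period that appears in the data.
keep_balanced_units <- function(data, id_unit, id_time) {
  all_periods <- unique(data[[id_time]])
  # Number of distinct periods in which each unit is observed.
  n_obs <- tapply(data[[id_time]], data[[id_unit]],
                  function(t) length(unique(t)))
  complete_ids <- names(n_obs)[n_obs == length(all_periods)]
  data[as.character(data[[id_unit]]) %in% complete_ids, , drop = FALSE]
}

# Toy data (made up): unit 3 is not observed in month 2.
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3),
  month = c(1, 2, 3, 1, 2, 3, 1, 3),
  y     = c(0.1, 0.4, 0.9, 0.2, 0.3, 0.5, 0.0, 0.7),
  x     = c(0, 1, 1, 0, 0, 1, 0, 1)
)

# Unit 3 is dropped; units 1 and 2 remain with all three months.
df_balanced <- keep_balanced_units(df, "id", "month")
```

The resulting `df_balanced` could then be passed as `data` in the `did_check()` call from the original post. With very irregular observation times, this filter may leave few or no units.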

Let me know if you encounter further issues! Soichiro

Ales-G commented 3 years ago

Dear Prof Naoki, thanks a lot for your kind reply. It is great to know you are working on an extension of the package.

Unfortunately, I am not sure how we could run our model for "units that are observed for the entire period".

Our data consist of a series of audits. Producers are audited at different times (over multiple years) and with different frequency. While some are audited multiple times within the same year, others are audited only twice in the space of 4 years. Producers are treated (at different times) in between audits. Hence, while we use month-year fixed effects (FE), the time dimension of our panel is extremely different for every unit. Some producers are observed in May 2013, others in June 2013, others in August 2014, etc. I fear that with this set-up it would be impossible to build a balanced panel. However, if you believe that this set-up is not an insurmountable obstacle, or you have any idea how to run your double DiD model under these circumstances, please do let me know; I would be immensely grateful.

Thanks a lot for your amazing work

Best regards

naoki-egami commented 3 years ago

Thanks for your question.

Our package focuses on the basic DID design (the treatment is assigned in a single time period) and the staggered adoption design (units receive the treatment at different times, but they remain exposed to the treatment once they are treated). Please see the examples on the front page and our paper. It seems that your data have no such clear pattern of treatment assignment. Unfortunately, our package does not support this type of data, as it would require much stronger assumptions than those we describe in our paper.
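
A rough way to check whether a dataset follows the staggered adoption pattern described above is sketched here in base R. It is an illustration only, not part of DIDdesign: `is_absorbing()` is a hypothetical helper, the toy `panel` is made up, and it assumes the treatment is a binary 0/1 column (here `x`) that should never switch from 1 back to 0 within a unit.

```r
# Hypothetical helper (not part of DIDdesign): for each unit, report whether
# the binary treatment stays on once it has been switched on.
is_absorbing <- function(data, id_unit, id_time, treatment) {
  # Sort by unit and time so that diff() compares consecutive observations.
  ord <- order(data[[id_unit]], data[[id_time]])
  d <- data[ord, , drop = FALSE]
  tapply(d[[treatment]], d[[id_unit]], function(x) all(diff(x) >= 0))
}

# Toy panel (made up): unit 3 is treated in month 2 and untreated in month 3.
panel <- data.frame(
  id    = rep(1:3, each = 3),
  month = rep(1:3, times = 3),
  x     = c(0, 1, 1,  0, 0, 1,  0, 1, 0)
)

# One logical value per unit, named by unit id.
absorbing <- is_absorbing(panel, "id", "month", "x")
```

Units flagged `FALSE` leave the treatment after receiving it, so they do not fit the staggered adoption design as described here.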