s3alfisc / fwildclusterboot

Fast Wild Cluster Bootstrap Inference for Regression Models / OLS in R. Additionally, R port to WildBootTests.jl via the JuliaConnectoR.
https://s3alfisc.github.io/fwildclusterboot/
GNU General Public License v3.0

Compatibility with did2s #136

Closed michaeltopper1 closed 1 year ago

michaeltopper1 commented 1 year ago

Hello!

I am trying to get the fwildclusterboot::boottest function to work with did2s. This used to work in earlier versions of the packages (I believe it works in 0.7), but the two have since become incompatible.

Here is a reproducible example using an example from the did2s vignette and the boottest function:

library(fwildclusterboot)
library(did2s)

static <- did2s(df_het,
                yname = "dep_var", first_stage = ~ 0 | state + year,
                second_stage = ~ i(treat, ref = FALSE), treatment = "treat",
                cluster_var = "state")

boottest(static, clustid = c("state"), B = 9999, param = "treat")

Unfortunately, I get the following error: Error in fixest::feols(fml = dep_var ~ 1, data = data, weights = weights_vector, : Argument 'data' must be either: i) a matrix, or ii) a data.frame. Problem: it is not a matrix nor a data.frame (instead it is a function).

I believe that this has to do with how boottest grabs the data from the fixest object, and in turn, how did2s stores the data. It looks like did2s names the data object data, which boottest then cannot find; judging by the error message, it picks up a function called data instead.
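The kind of scoping problem described above can be reproduced in base R. This is only a minimal sketch of the general mechanism, not a diagnosis of the actual did2s or fwildclusterboot internals: `fit_model()` and `wrapper()` are hypothetical helpers invented for illustration, and the example assumes no object named `data` exists in the global workspace.

```r
# Hypothetical fitting function: it stores the unevaluated call, much like
# model objects in R usually do.
fit_model <- function(data) {
  list(call = match.call(), coef = mean(data$y))
}

# Hypothetical wrapper whose local variable is also called `data`, so the
# stored call reads fit_model(data = data).
wrapper <- function(df) {
  data <- df
  fit_model(data = data)
}

fit <- wrapper(data.frame(y = c(1, 2, 3)))

# Re-evaluate the stored `data` argument outside the wrapper. The symbol
# `data` is no longer bound to the data frame, so the lookup falls through
# to the search path and finds utils::data(), which is a function.
recovered <- eval(fit$call$data, envir = globalenv())

is.function(recovered)
is.data.frame(recovered)
```

Any code that later tries to recover the data set by evaluating the stored call outside the function that created it would receive a function rather than a data frame, which matches the wording of the error message above.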

Any guidance on a possible workaround? I know I could downgrade fwildclusterboot to an earlier version, but the speedups in the newer versions make me less inclined to do this.

Thanks!

s3alfisc commented 1 year ago

Hi @michaeltopper1 - unfortunately, fwildclusterboot does not work with did2s, and I think that it might not work going forward either. The issue is that did2s conducts inference based on a GMM procedure, while the wild cluster bootstrap in fwildclusterboot only applies to the OLS case. Maybe I should add a dedicated error message for this use case? But you are definitely right that the error you encounter has something to do with the environments boottest() searches for data =)

If you want to make your inference robust to a small number of (treated) clusters, here are a few options. First, you could try did2s with CRV3 inference, as implemented in my summclust package or in sandwich::vcovJK(). According to MacKinnon, Nielsen & Webb (see their "fast and reliable" paper), this performs almost as well as the wild cluster bootstrap. I think you can pass a custom vcov matrix to did2s? Alternatively, fwildclusterboot does support sunab() (as this is nothing but 'fancy OLS') via its boot_aggregate() function. Yet another option: I have an open PR for etwfe that still needs some final touches, plus a PR that has to be merged into fixest. But these are of course only options if you do not have a particular reason to run did2s rather than the other methods.
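For the sunab() route, a rough sketch could look like the following. It is an illustration under assumptions rather than a tested recipe: it uses the base_stagg example data that ships with fixest instead of df_het, the choices B = 9999 and clustid = "id" are arbitrary, and the boot_aggregate() argument names are given as I understand them from the package documentation, so please check ?boot_aggregate for your installed version.

```r
library(fixest)
library(fwildclusterboot)

# Example staggered-adoption data shipped with fixest (an assumption made
# for illustration; the thread itself uses df_het from did2s).
data(base_stagg, package = "fixest")

set.seed(123)

# Sun & Abraham style event-study regression, estimated by OLS via feols().
res_sunab <- feols(
  y ~ x1 + sunab(year_treated, year) | id + year,
  data = base_stagg
)

# Conventional (non-bootstrap) aggregation to the ATT, for comparison.
aggregate(res_sunab, agg = "ATT")

# Wild cluster bootstrap inference on the aggregated ATT, clustering by id.
boot_att <- boot_aggregate(
  res_sunab,
  agg     = "ATT",
  B       = 9999,
  clustid = "id"
)

boot_att
```

Because the sunab() specification is still an OLS regression, the wild cluster bootstrap applies here, which is not the case for the GMM-based inference in did2s.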

I hope this helps!

michaeltopper1 commented 1 year ago

Hi @s3alfisc. No worries, and thank you very much for the alternative suggestions! I'll definitely check out your other package summclust.

Thanks!