This PR adds support for inference via a wild (cluster) bootstrap by adding a `bootstrap` argument to etwfe's `emfx()` (only for OLS). If `bootstrap = TRUE`, etwfe will compute marginal effects by calling `fwildclusterboot::boot_aggregate()`, which is a copy of `fixest::aggregate()`.
It currently depends on a fork of `fwildclusterboot`, which in turn depends on a fork of `fixest` by @kylebutts that introduces support for sparse model matrices. In other words, merging this PR will require another PR to be merged into `fixest` first.
At the moment, this PR simply
- adds a `bootstrap` argument to `emfx`. If `bootstrap = TRUE`, it will run a wild cluster bootstrap via the `fwildclusterboot` package
- in consequence, `fwildclusterboot` is added as a (soft) dependency in `Suggests`
- at the moment, only `type = "simple"` and the "clustered" bootstrap are supported
The PR still
- [ ] ... requires @kylebutts's PR to be merged into `fixest`
- [ ] ... and `fwildclusterboot` being updated afterwards
- [ ] ... lacks support for the heteroskedastic bootstrap
- [ ] ... lacks some defensive checks
- [ ] ... lacks unit tests
- [ ] ... lacks documentation in the vignette
- [ ] I will also have to revert all changes to `etwfe` (it's only white space changes, sorry about that).
It is also worth discussing how to unify the output: the default path through `marginaleffects` returns a `marginaleffects` object, while the bootstrap path simply returns a `data.frame`.
Here is some example code:
```r
library(devtools)
install_github("https://github.com/s3alfisc/fwildclusterboot/tree/etwfe-support")
# this should install kyle's fork of fixest, if not, do it manually
#install_github("https://github.com/kylebutts/fixest/tree/sparse-matrix")
library(etwfe)
library(fwildclusterboot)
data("mpdta", package="did")
mod = etwfe(
  fml = lemp ~ lpop,
  tvar = year,
  gvar = first.treat,
  data = mpdta,
  #se = "hetero",
  vcov = ~countyreal,
  ssc = fixest::ssc(adj = FALSE, cluster.adj = FALSE)
)
#names(coef(mod))
emfx(mod)
# Term Contrast .Dtreat Estimate Std. Error
# .Dtreat mean(TRUE) - mean(FALSE) TRUE -0.0506 0.0124
# z Pr(>|z|) S 2.5 % 97.5 %
# -4.08 <0.001 14.4 -0.075 -0.0263
emfx(mod, bootstrap = TRUE, B = 99999, nthreads = 2)
# Run the wild bootstrap: this might take some time...(but hopefully not too much time =) ).
# |======================================================| 100%
# Estimate t value Pr(>|t|) [0.025% 0.975%]
# [1,] -0.05062703 -4.078845 6.00006e-05 -0.07550813 -0.02580929
# Warning messages:
# 1: In emfx(mod, bootstrap = TRUE, B = 99999, nthreads = 2) :
# The bootstrap does not support the ssc() argument `fixef.K='none'`. Using `fixef.K='none' instead. This will lead to a slightly different non-bootstrapped t-statistic`, but will not affect bootstrapped p-values and CIs.
# 2: Matrix inversion failure: Using a generalized inverse instead.
# Check the produced t-statistic, does it match the one of your
# regression package (under the same small sample correction)? If
# yes, this is likely not something to worry about.
```
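On the output question raised above, one possible direction is to rename the bootstrap columns to the lower-case names that `marginaleffects` uses for its results (`estimate`, `statistic`, `p.value`, `conf.low`, `conf.high`). The sketch below is only an illustration: `tidy_boot()` is a hypothetical helper that does not exist in etwfe or fwildclusterboot, and it assumes the bootstrap result has exactly five columns in the order printed in the example. It aligns column names only and does not build a real `marginaleffects` object.

```r
# Hypothetical helper (not part of etwfe or fwildclusterboot).
# Assumes `boot_out` is a matrix or data.frame with five columns in the
# order: estimate, t value, p-value, lower CI bound, upper CI bound.
tidy_boot <- function(boot_out, term = ".Dtreat") {
  stopifnot(ncol(boot_out) == 5)
  data.frame(
    term      = term,
    estimate  = boot_out[, 1],
    statistic = boot_out[, 2],
    p.value   = boot_out[, 3],
    conf.low  = boot_out[, 4],
    conf.high = boot_out[, 5],
    row.names = NULL
  )
}

# Values copied from the bootstrap output in the example code
boot_out <- matrix(
  c(-0.05062703, -4.078845, 6.00006e-05, -0.07550813, -0.02580929),
  nrow = 1
)
res <- tidy_boot(boot_out)
print(res)
```

A standard error column is deliberately left out, since the bootstrap output in the example does not report one.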
@jtorcasso fyi