Closed michaeltopper1 closed 2 years ago
Hi Michael, thanks for reporting this! I am fairly certain that the first error is due to issues with the environment from which boottest() tries to fetch the data. I will take a closer look this evening! Best, Alex
Ok, it seems like I will have to take a deeper look at the section on environments in Advanced R for a "quick fix".
Alternatively, I might fully replace the call to model.frame
(which causes the troubles above) and work with model.matrix
instead. I thought about doing this anyways, as working with model.matrices
would allow to call boottest()
based on objects of type fixest with fixest formula sugar (e.g. i()
or ^
, which boottest()
currently does not support). It would also simplify the pre-processing code by a lot.
The main caveat is that its not straightforward to fetch the dependent variable without a model.frame
for models of type lm
- one workaround would be to re-create the dependent variable by using predict()
and residuals
methods as Y <- predict(obj) + residuals(obj)
. Unfortunately, there is no predict
method available for lfe::felm
, so I'd have to come up with another solution. For fixest, there is no problem - I can simply use fixest:::model.matrix.fixest(, type = "lhs")
. Also, it would be much easier to add support for the WRE after IV-estimation via feols()
.
Beyond, not relying on model.frame
would force me to change the way how cluster variables are passed to boottest()
- they either would have to be included in the model.matrix
, in which case users could specify the clusters as a character vector or formula. If not part of the model.matrix
, I would have to ask users to pass it to boottest()
as a vector of length N. Currently, if the cluster variable is not part of the model formula, I add it to the formula of the model call and evaluate the model.frame
on the updated formula.
Anyways, I am not sure if I managed to explain myself clearly 😄 - the main conclusion is that I'll have to think all of this through. If there's a quick fix, I will update the package during the week, but it is likelier that I will move the pre-processing from to model.matrices
over the weekend.
Ah I see. Thank you for the quick response and great explanation. In the meantime, I'll use my messy workaround. Looking forward to the changes!
Hi Michael,
Here's a quick workaround for your problem using rlang's !!
operator and lapply
- the main difference to your solution is that the code below updates the data part of the call to lm
before the call is evaluated, not after:
library(rlang)
library(modelsummary)
library(fwildclusterboot)
res <-
lapply(list(quote(mtcars), quote(mtcars)), function(x){
fit_uneval <- rlang::expr(lm(mpg ~ cyl, data = !!x))
fit <- eval(fit_uneval)
boottest(fit, param = "cyl", clustid = "gear", B = 333)
})
msummary(res, estimate = "{estimate} ({p.value})",
statistic = "[{conf.low}, {conf.high}]", output = "data.frame")
part term statistic Model 1 Model 2
1 estimates 1*cyl = 0 modelsummary_tmp1 -2.876 (0.000) -2.876 (0.000)
2 estimates 1*cyl = 0 modelsummary_tmp2 [-3.188, -2.307] [-3.188, -2.307]
3 gof Num.Obs. 32 32
4 gof R2 0.726 0.726
5 gof R2 Adj. 0.717 0.717
6 gof AIC 169.3 169.3
7 gof BIC 173.7 173.7
8 gof Log.Lik. -81.653 -81.653
Best, Alex
Thanks for this work around Alex! However, while the boottest works great using this method, I can't seem to get the goodness-of-fit statistics to work when I copy your code. My packages (except fwildclusterboot
) are all up to date. Is this something that could be fixed with fwildclusterboot
version 0.8 rather than 0.7? While I would gladly update to version 0.8, my computer (Mac) won't let me as I get the classic "There is a binary source available but the source version is later" error.
I don't think that your problem will be fixed by updating to fwildclusterboot 0.8, unfortunately. 0.8 is mostly about integrating WildBootTests.jl
and reorganizing some code. Nevertheless, I think the binaries for Mac should be on CRAN by now? If installation from CRAN does not work, you can try to install a compiled version (for mac) from r-universe by running
install.packages('fwildclusterboot', repos ='https://s3alfisc.r-universe.dev')
By the way, I am almost done moving from model.frame()
to model.matrix()
methods, and by now I am less optimistic that just doing that will let your map()
example run out of the box. I'll let you know once I manage to fix this :)
See this issue: https://github.com/kosukeimai/MatchIt/issues/53
Bug seems to be specific to fixest::feols()
, which evaluates in custom environment:
iter1 <-
lapply(
list(mtcars, mtcars), function(x){
fit <- feols(mpg ~ cyl, data = x)
}
)
iter2 <-
lapply(iter1, function(x){
boottest(x, param = "cyl", clustid = "gear", B = 333)
})
# Error in eval(model$call$data, envir) : object 'x' not found
iter2 <-
lapply(iter1, function(x){
sandwich::vcovCL(x, cluster = ~gear)
})
# Error in eval(model$call$data, envir) : object 'x' not found
but everything runs fine if feols()
is replaced with lm()
or felm()
iter1 <-
lapply(
list(mtcars, mtcars), function(x){
fit <- lm(mpg ~ cyl, data = x)
}
)
fixest::feols()
evaluates in a custom environment:
library(fixest)
library(lfe)
feols_fit <- feols(mpg ~ cyl, data = mtcars)
lm_fit <- lm(mpg ~ cyl, data = mtcars)
felm_fit <- felm(mpg ~ cyl, data = mtcars)
environment(terms(feols_fit))
# <environment: 0x0000029cd30926a8>
environment(terms(lm_fit))
# <environment: R_GlobalEnv>
environment(terms(felm_fit))
# <environment: R_GlobalEnv>
Solution is described here: the issue was - as expected - an incorrect environment used with expand.model.frame
. I have fixed the issue already in the code and will (hopefully) release a new version of fwildclusterboot
by tomorrow evening (EU time) 😃
With the latest commit ace208ebf5265a06f2d6a64bf7e020165ac9c827, it is possible to use boottest()
with fixest::feols()
and purrr::map()
/ lapply()
. You should be able to install the latest version from r-universe within the next hour (given all unit tests pass, they run for a while).
Unfortunately, the resulting list of objects of type boottest
is still not working with modelsummary::msummary()
:
library(tidyverse)
library(fixest)
library(fwildclusterboot)
library(modelsummary)
## creating a list of data
cars_list <- list(mtcars, mtcars)
## iterating through regressions
cars_output <- lapply(cars_list, \(x) feols(mpg ~ cyl, data = x))
## this cannot be evaluated - it believes object `.` is the data to be used
boot_list = lapply(cars_output, \(x) boottest(x, param = "cyl", clustid = "gear", B = 333))
msummary(model = boot_list,
estimate = "{estimate} ({p.value})",
statistic = "[{conf.low}, {conf.high}]")
Instead, you can simply add desired columns produced via fwildclusterboot::boottest()
to the modelsummary()
output to cars_output
:
library(tibble)
rows <- tribble(~term, ~OLS, ~Logit,
'cyl, boottest p-value', boot_list[[1]]$p_val, boot_list[[1]]$p_val,
'cyl, boottest conf.int', paste(round(boot_list[[1]]$conf_int,3)), paste(round(boot_list[[2]]$conf_int),3))
attr(rows, 'position') <- c(5,6)
modelsummary(cars_output, add_rows = rows)
Beautiful! Thanks for this update. Although it cannot work with modelsummary::msummary
, your work around is a nice and simple solution that I frequently use to customize tables. Thank you again and very happy that I will now be able to wildclusterboot my standard errors with lapply
and fixest
!
Hi, 'msummary' is actually just a shortcut for 'modelsummary", as far as I know :)
I've been trying to implement the bootstrapped standard errors on my regressions. However, to save code, I frequently use
lapply
orpurrr::map
to do this iteratively. However, I've noticed that this causes problems with theboottest
function. Specifically, it cannot grab the correct model objects when usinglapply
orpurrr::map
. See the below example:There is a workaround to this. I can do the following:
While this workaround DOES produce the desired results, exporting them to
modelsummary
gives errors when it comes to goodness-of-fit statistics. See here:I am wondering if this function can be updated so that this sort of framework can be supported.