[Closed] zeileis closed this issue 10 months ago
Hi @zeileis , thanks for reporting!
I think I know what's going on: in order to invert the bootstrap test, I evaluate a function that produces p-values under different nulls. This gives me a set of p-values, which you can inspect via the plot() method. E.g. for a null of beta = 1.01, boottest produces results and the plot below:
data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)
boot <- boottest(m, param = "x", B = 199, clustid = "firm", r = 1.01)
plot(boot)
What happens in the example that you describe is that the algorithm likely does not find proper starting values (as marked by the blue lines) - or, in other words, there is only one computed bootstrapped p-value above the significance level of 0.05.
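The inversion idea can be sketched in a few lines. This is an illustrative stand-in, not the actual fwildclusterboot internals: p_val() below is a toy placeholder (a normal approximation) for the bootstrap p-value function the package evaluates.

```r
# Toy stand-in for the bootstrap p-value function evaluated under
# the null H0: beta = r (normal approximation, for illustration only).
p_val <- function(r) 2 * pnorm(-abs((1.0348 - r) / 0.05))

# Evaluate p-values on a grid of candidate nulls around the estimate...
grid  <- seq(0.8, 1.3, length.out = 51)
pvals <- sapply(grid, p_val)

# ...and locate where p_val(r) - alpha changes sign: each sign change
# brackets one CI bound, which uniroot() then refines.
alpha     <- 0.05
crossings <- which(diff(sign(pvals - alpha)) != 0)
ci <- sapply(crossings, function(i) {
  uniroot(function(r) p_val(r) - alpha, c(grid[i], grid[i + 1]))$root
})
ci  # roughly 1.0348 +/- 1.96 * 0.05
```

The CI bounds are exactly the two values of r where the p-value curve crosses the significance level, which is why the procedure needs at least two sign changes on the grid.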
I'll try to tackle this early next week - reworking the p-value inversion has been on my to-do list for a while =) I also have a discussion of how to obtain p-values by CI inversion here - maybe it helps if by "we" you are referring to one of your students =)
If you do not need CIs for your simulations, you can set the function argument conf_int = FALSE, and then you'll get rid of this error.
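Concretely, that workaround looks like this (same model as above; only the CI computation is skipped, so the hypothesis test itself should still run):

```r
library(fwildclusterboot)
data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)

# conf_int = FALSE skips the CI (and hence the p-value inversion
# that triggers the uniroot() error); the test itself still runs.
boot <- boottest(m, param = "x", B = 199, clustid = "firm",
                 r = 1.03, conf_int = FALSE)
summary(boot)
```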
Btw, I'm looking forward to learning more about your simulation results! =)
Best, Alex
Hi, I have put a fix in the dev branch. In a nutshell, I added a user option to override the default starting value for finding a starting value (😅). This is how you get around the error:
library(fwildclusterboot)
data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)
coef(m) # 0.02967972 1.03483344
dqrng::dqset.seed(3)
# throws an error
boot <- boottest(m, param = "x", B = 199, clustid = "firm", r = 1.03)
# runs with better starting value
boot <- boottest(m, param = "x", B = 1999, clustid = "firm", r = 1.03, se_guess = 0.001)
plot(boot)
summary(boot)
# boottest.lm(object = m, param = "x", B = 1999, clustid = "firm",
# r = 1.03, se_guess = 0.001)
#
# Hypothesis: 1*x = 1.03
# Observations: 5000
# Bootstr. Type: rademacher
# Clustering: 1-way
# Confidence Sets: 95%
# Number of Clusters: 500
#
# term estimate statistic p.value conf.low conf.high
# 1 1*x = 1.03 1.035 0.096 0.925 0.934 1.134
I still have to fix some issues in the dev branch before merging, but this feature should be good to use.
Thanks for looking into this!
In fact, we are only using boottest() for computing the confidence interval (because we want to compare it to Wald-type confidence intervals with different bootstrap standard errors). As the confidence interval is almost invariant to the value of r, we simply make sure that the value of r we use is not too close to the empirical estimate.
This made me wonder whether it is not possible to find a better workaround that does not require the user to change a rather technical argument. Wouldn't it be possible to just shift or extend the search grid if needed?
Yes, you're right - it should not even be that hard. I was just somewhat lazy yesterday, and so I went for the lazy solution 😅 . I'll try to get to it tomorrow evening!
This should be fixed with #142. Out of the box, you can now impose r = 1.0348 and get sensible results:
library(fwildclusterboot)
data("PetersenCL", package = "sandwich")
m <- lm(y ~ x, data = PetersenCL)
summary(m)
boot <- boottest(m, param = "x", B = 199, clustid = "firm", r = 1.0348)
summary(boot)
# boottest.lm(object = m, param = "x", B = 199, clustid = "firm",
# r = 1.0348)
#
# Hypothesis: 1*x = 1.0348
# Observations: 5000
# Bootstr. Type: rademacher
# Clustering: 1-way
# Confidence Sets: 95%
# Number of Clusters: 500
#
# term estimate statistic p.value conf.low conf.high
# 1 1*x = 1.0348 1.035 0.001 1 0.951 1.117
plot(boot)
You should be able to install a compiled version from r-universe by running
install.packages('fwildclusterboot', repos ='https://s3alfisc.r-universe.dev')
in a bit =) It will be a while before I submit to CRAN - there are a couple of changes in the dev branch that I still need to finalize.
Nice, thanks for your quick fix!
When running some simulations we stumbled upon the following problem: when the value r under the null hypothesis is too close to the estimated coefficient param to be tested, then a uniroot() call in boottest() fails. A minimal example using the PetersenCL data from sandwich with R 4.3.1 and fwildclusterboot 0.13.0 is:

When digging into the code, I observed that the error occurs in invert_p_val(). This sets up grid_vals around point_estimate and computes p_val_null2_x() for each of these. But as all of these return -0.05 here, there are no crossings, which leads to NAs being passed to uniroot(). Maybe this is helpful for you to see where this comes from.