Open mattansb opened 3 years ago
As fa as I know, there is no real "non-parametric" bootstrapping for merMod, but just "semi-parametric". Not sure if this will bring us closer to consistency, or if it is preferred to stick to the defaults?
In the docs the options at parametric of semiparametric - there is no non-parametric.
https://www.rdocumentation.org/packages/lme4/versions/1.1-27/topics/bootMer
Yeah, parametric is really the only stable option for bootMer. Semiparametric is pretty unstable; it gives weird results fairly often in my experience and is applicable to only a limited set of models. Resampling cases for mixed effects models is tricky (basically has to happen at the cluster level) and is pretty intractable for crossed models.
There is this package, but I've not investigated it https://cran.r-project.org/web/packages/lmeresampler/lmeresampler.pdf
Do we accept and pass along the use.u option to bootMer? If not, we should.
I'd also like to support other bootstrap intervals besides the percentile (eg, BCa). Opened an issue for this earlier today.
I might have triggered this issue with a query that I sent yesterday to a few of you (apologies for the redundancy).
I feel that there should be consistency in the call of model_parameters and perhaps a much clearer wording in the help and the vignettes.
For the "typical" sizes (a few dozens, < 100?) of datasets analyzed by scientists, I think that it does matter whether non-parametric or parametric bootstrap is used to estimate the 95%CIs and p-values. Non-parametric bootstrapping of a small sample until doomsday will never generate values on the tails of the real distribution of the response variable that are key to 95%CI and p-values. Regardless of the prob distribution of the response variable of a model, values in the upper/lower 10% are by definition unlikely to be represented in a sample of moderate size. Hence, the 95%CIs and p-values obtained by non-parametric bootstrap are bound to be somewhat biased and imprecise.
Provided that one has a statistical model with a reasonable goodness of fit to data, parametric bootstrap should almost always provide better 95%CI and p-values than non-parametric one. The reason is that generating pseudosample values in the upper/lower 10% from a prob distribution reasonably fitted by a model depends on the number of bootstrap replications (i.e. computer brute force) and not on the size of the original data.
Almost certainly, the differences in CIs and p-value between nonparametric vs parametric bootstrap disappear for large (>250?) sample sizes, but in these cases one should probably rely on the asymptotics of Wald statistics to get the CIs and p-values.
Other R libraries (e.g. glmmboot) doing bootstrap on these models employ such "weasel wording" in the description of their methods that one must query the authors to know what they actually do. This is why I feel that the excellent set of libraries of your "ecosystem" deserve a better and clerer description for the benefit of their users (like me). Cheers Pablo
I feel that there should be consistency in the call of model_parameters and perhaps a much clearer wording in the help and the vignettes
In which way, which part could probably be improved?
If I might be bold for a minute, I would suggest three things; a) Use systematically parametric bootstrap to generate 95%CI and p-values for any model (lm, glm, glmer, lmer, glmmTMB, and the other supported in your package parameters). A user can currently "trick" bootstrap_model into doing parametric bootstrap by fitting a lm, glm, glmer, lmer and others using glmmTMB because the latter encompasses all other models as "special cases". But imposing parametric boosttrap for any model should be done more elegantly by suitable programming. b) Explain clearly in the help of bootstrap_model and bootstrap_parameters what method (parametric, non-parametric) is actually used, and why, pointing to a fuller explanation in the vignette. c) Explain in the vignette why it is nearly always preferable to use parametric bootstrap. If the text I posted before is useful, edit and use it, or ask me to rewrite it (without any strings attached; the package and the hard work is yours, not mine). If useful for your purposes, I can send you an example of binary GLM with 200+ data points (that fits well) showing that pvalues estimated by Wald stats and paramboots may differ by orders of magnitude, and likewise that 95%CI limits by 200% at times. Nonparametric bootstrap may be even worse and is much harder to implement when data is structured by groups (as in random effects) or has a temporal or a spatial structure and these are the types of data that scientists and engineers increasingly have to analyze.
hm, option "parallel" for boot()
doesn't really return sensible results?
model <- lm(mpg ~ wt + factor(cyl), data = mtcars)
boot_function <- function(model, data, indices) {
d <- mtcars[indices, ] # allows boot to select sample
fit <- stats::update(model, data = d)
params <- insight::get_parameters(fit)
n_params <- insight::n_parameters(model)
if (nrow(params) != n_params) {
params <- stats::setNames(rep.int(NA, n_params), params$Parameter)
} else {
params <- stats::setNames(params$Estimate, params$Parameter) # Transform to named vector
}
return(params)
}
set.seed(123)
results <- boot::boot(
data = mtcars,
statistic = boot_function,
R = 100,
sim = "ordinary",
model = model
)
as.data.frame(results$t) |> head()
#> V1 V2 V3 V4
#> 1 33.40302 -3.247039 -3.697935 -5.852254
#> 2 35.10031 -3.802603 -3.918454 -4.430873
#> 3 33.47632 -2.927751 -4.632131 -5.792161
#> 4 33.55433 -2.763675 -4.787897 -7.392401
#> 5 29.48630 -2.436482 -2.511552 -4.973958
#> 6 34.77243 -3.755184 -3.219585 -4.911665
set.seed(123)
results <- boot::boot(
data = mtcars,
statistic = boot_function,
R = 100,
sim = "parametric",
model = model
)
as.data.frame(results$t) |> head()
#> V1 V2 V3 V4
#> 1 33.99079 -3.205613 -4.255582 -6.07086
#> 2 33.99079 -3.205613 -4.255582 -6.07086
#> 3 33.99079 -3.205613 -4.255582 -6.07086
#> 4 33.99079 -3.205613 -4.255582 -6.07086
#> 5 33.99079 -3.205613 -4.255582 -6.07086
#> 6 33.99079 -3.205613 -4.255582 -6.07086
Created on 2021-05-27 by the reprex package (v2.0.0)
library(parameters)
model <- lm(mpg ~ wt + factor(cyl), data = mtcars)
set.seed(123)
model_parameters(model, bootstrap = TRUE, iterations = 100)
#> Parameter | Coefficient | 95% CI | p
#> --------------------------------------------------
#> (Intercept) | 34.21 | [29.50, 39.25] | 0.010
#> wt | -3.09 | [-5.02, -1.98] | 0.010
#> cyl [6] | -4.33 | [-6.49, -1.61] | 0.010
#> cyl [8] | -5.99 | [-9.17, -2.71] | 0.010
set.seed(123)
model_parameters(model, bootstrap = TRUE, iterations = 100, type = "parametric")
#> Parameter | Coefficient | 95% CI | p
#> --------------------------------------------------
#> (Intercept) | 33.99 | [33.99, 33.99] | 0.010
#> wt | -3.21 | [-3.21, -3.21] | 0.010
#> cyl [6] | -4.26 | [-4.26, -4.26] | 0.010
#> cyl [8] | -6.07 | [-6.07, -6.07] | 0.010
Created on 2021-05-27 by the reprex package (v2.0.0)
@strengejacke I think it's a little weird that the reported point estimates are changing when bootstrap approaches are used. I think most users requesting bootstrap intervals would be expecting the maximum likelihood estimates along with the bootstrap intervals.
If model_parameters(model, bootstrap = TRUE, iterations = 100, type = "parametric") does parametric bootstrap for all common models (lm, glm, glmer, lmer, glmmTMB, etc), I might have provoked a storm on a tea cup. In my view, it is still strange that the user has to reach to "See 'Examples' in model_parameters.default" in the help of model_parameters to know that he/she can carry out parametric bootstrap, and that nothing in this regard is said in the help of bootstrap_model and bootstrap_parameters. If so, I should think that some editorial work in the help of the latter functions and in the vignette would solve my original query. I still stand by the relative merits of parametric vs nonparametric bootstrap from before. If agree that users expect to see their ML estimates in the summary rather than the bootstrapped (median, mean) estimates. The curious results of the last table (estimates equal the limits of the 95%CI) can only be wrong in my view.
@strengejacke I think it's a little weird that the reported point estimates are changing when bootstrap approaches are used. I think most users requesting bootstrap intervals would be expecting the maximum likelihood estimates along with the bootstrap intervals.
The more curious thing here is that the bootstrapped estimates are all the same when type = "parametric"
, see previous answer from me, or the CI, which bounds are the same as the point estimate.
But for skewed data, isn't a bootstrapped estimate more useful? Else, one could just request robust standard errors.
Oh, yeah, that’s not right. Can you link to where you specify the parametric call to boot? You need to also give the ran.gen function (eg, simulate) and mle estimate.
My view is that if you’re going to make a parametric assumption with a single level model, you may as well construct intervals using an analytic method such as pivot or profile likelihood. These methods are tractable for such models and computationally more efficient. I don’t see all that much value in parametric bootstrap in those cases—in those cases, the value of the bootstrap is to avoid parametric assumptions when the sampling distribution isn’t likely to follow those assumptions.
The code is in bootstrap_model()
:
https://github.com/easystats/parameters/blob/75e5a0b784b82c890df99a425140fe20cd1e06f3/R/bootstrap_model.R#L68
My view is that ...
So you would suggest we stick to the default "ordinary"?
I believe that if one sets type="ordinary", the command boot performs a nonparametric bootstrap by default. The parametric bootstrap option in the command boot requires providing a ran.gen function to generate new R pseudovalues of the response variable using the parameters of the originally fitted model, and then refit the model on the R pseudosamples to obtain the bootstrapped sampling distribution for each model parameter, to calculate the p-values and CI. I think that the point of parametric bootstrap in GLM(M) type of models is to have better (less biased, more precise) p.values and CI for moderate-sized data sets for which the asymptotic Wald statistics and quadratic approximations of (multidimensional) likelihood functions do not necessarily apply, or render poor results. Even for most good fitting models, with more than 200(?; I am unsure what the threshold is since it also depends on model complexity) datapoints, we may happily live in asymptotics, but with less data as most scientists typically have, I would trust more the p-values and CI obtained after parametric bootstrap.
This
results <- boot::boot(
data = data,
statistic = boot_function,
R = iterations,
sim = type,
parallel = parallel,
ncpus = n_cpus,
model = model
)
needs to be something like:
f <- function(x, mle) {
out <- insight::get_data(x)
resp <- simulate(x, nsim = 1)
out[[insight::find_response(x)]] <- resp
return(out)
}
results <- boot::boot(
data = data,
statistic = boot_function,
R = iterations,
sim = type,
parallel = parallel,
ncpus = n_cpus,
model = model,
ran.gen = f
)
I would argue that the primary motivation for parametric bootstrap is the same as MCMC or other computational methods--to conduct inference when analytic methods are intractable (e.g., due to model complexity or convergence issues due to shallow likelihood surfaces or parameters near boundaries).
Based on my reading of literatures I'm familiar with, parametric bootstrap intervals generally have similar coverage as profile intervals. When a pivot is available (e.g., for linear regression coefficients), these usually perform better even in small samples than either profile or parametric bootstrap (this is the whole basis on fiducial inference). For a variety of problems, parametric bootstrap can have somewhat or severely erratic performance (e.g., https://www.genetics.org/content/174/1/481; https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.2514; https://www.tandfonline.com/doi/pdf/10.1080/10705511.2011.607072).
Of course, you may disagree, but in my view, the assumption that differences in deviance values are χ2 distributed becomes reasonably correct even at fairly small sample sizes (e.g., N = 20–30; some bootstrap texts jokingly suggest N = 42). In my experiences in social sciences, most analysts adopt a bootstrap approach because they wish to avoid parametric assumptions, rather than due to coverage concerns. So, I think the current default basic bootstrap approach is likely best.
To summarize:
So can we close this issue?
I would say that it is reasonable to close the issue. Many thanks for all your effort.
I think I found a minor sign inconsistency on the overdispersion parameter of the negative binomial models fitted with glmmTMB. The fitted parameter must be positive (0.566) and it comes as negative when bootstrapped. It might be useful to change the label in the bootstrap table from "(Intercept).1" to "overdispersion param." Also, when the fitted model is Zero Truncated Negative Binomial, the overdispersion parameter does not appear when bootstrapped. The latter may not be not a great loss for most users, and I just wonder if for overall consistency it might be better to just delete it from the glm.NB fitted with glmmTMB. The choice and the work is yours, of course.
library(glmmTMB)
glm.NB <- glmmTMB(count ~ mined, family=nbinom2, data=Salamanders)
summary(glm.NB)
# Family: nbinom2 ( log )
# Formula: count ~ mined
# Data: Salamanders
# AIC BIC logLik deviance df.resid
# 1762.3 1775.7 -878.2 1756.3 641
# Overdispersion parameter for nbinom2 family (): **0.566**
#
# Conditional model:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -1.2192 0.1293 -9.427 <2e-16 ***
# minedno 2.0368 0.1527 13.343 <2e-16 ***
#
bootstrap_parameters(model=glm.NB, iterations=100)
# # Fixed Effects
#
# Parameter | Coefficient | 95% CI | p
# ----------------------------------------------------
# (Intercept) | -1.26 | [-1.54, -0.87] | 0.010
# minedno | 2.07 | [ 1.67, 2.37] | 0.010
# (Intercept).1 | **-0.58** | [-0.79, -0.28] | 0.010
@PabloInchausti Internally, the overdispersion parameter is formulated as negative by glmmTMB:
glm.NB <- glmmTMB::glmmTMB(count ~ mined, family=glmmTMB::nbinom2, data=glmmTMB::Salamanders)
#> Warning in Matrix::sparseMatrix(dims = c(0, 0), i = integer(0), j =
#> integer(0), : 'giveCsparse' has been deprecated; setting 'repr = "T"' for you
#> Warning in Matrix::sparseMatrix(dims = c(0, 0), i = integer(0), j =
#> integer(0), : 'giveCsparse' has been deprecated; setting 'repr = "T"' for you
#> Warning in Matrix::sparseMatrix(dims = c(0, 0), i = integer(0), j =
#> integer(0), : 'giveCsparse' has been deprecated; setting 'repr = "T"' for you
summary(glm.NB)
#> Family: nbinom2 ( log )
#> Formula: count ~ mined
#> Data: glmmTMB::Salamanders
#>
#> AIC BIC logLik deviance df.resid
#> 1762.3 1775.7 -878.2 1756.3 641
#>
#>
#> Overdispersion parameter for nbinom2 family (): 0.566
#>
#> Conditional model:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -1.2192 0.1293 -9.427 <2e-16 ***
#> minedno 2.0368 0.1527 13.343 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
glm.NB$fit
#> $par
#> beta beta betad
#> -1.2192401 2.0367623 -0.5696049
#>
#> $objective
#> [1] 878.1677
#>
#> $convergence
#> [1] 0
#>
#> $iterations
#> [1] 11
#>
#> $evaluations
#> function gradient
#> 15 12
#>
#> $message
#> [1] "relative convergence (4)"
#>
#> $parfull
#> beta beta betad
#> -1.2192401 2.0367623 -0.5696049
Created on 2021-06-07 by the reprex package (v2.0.0)
(If you post more code examples, please format them using reprex::reprex()
--it makes it much easier to read.)
The overdispersion parameter is indeed reported as negative in $par and $parfull, but not in the summary seen and interpreted by all users. glmmTMB probable changes the sign given that this parameter in the NB2 parameterization of the NegBinom distribution must be a real positive number, and perhaps your package parameters ought to do likewise.
Specifically, it's applying the inverse link function for the sigma model. For gaussian, that's exp(.5*betad)
. For Gamma, that's exp(-.5*betad)
. For all others, that's exp(betad)
.
These only apply if there is a single betad parameter, not multiple terms in the dispersion model.
@strengejacke After we bootstrap glmmTMB models, we should split out the parameters into cond, zi, and disp lists separately.
In
bootstrap_model
:lme4::bootMer()
defaults totype = "parametric"
.boot::boot()
defaults tosim = "ordinary"
which is non-parametric.Should we be consistent? Or does the diff between the models somehow justify this?
@bwiernik thoughts?