`glmmTMB` solutions - Githubissues

vincentarelbundock / marginaleffects

R package to compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and ML models. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference

https://marginaleffects.com

Other

440 stars 46 forks source link

`glmmTMB` solutions #1064

Closed vincentarelbundock closed 5 months ago

vincentarelbundock commented 6 months ago

This consolidates many discussions about glmmTMB models in marginaleffects.

TLDR

The standard errors reported by marginaleffects are often much too small.
This problem is critical.
I am considering removing support for glmmTMB altogether.
The error/warning in the next version will be more aggressive.
I have spent a lot of time trying various solutions to this (ex: using newparams= in predict() instead of set_coef() to take the derivatives with respect to model parameters. None of my attempts worked. Unless an expert can supply a solution, I am unlikely to spend more time on this.

This thread consolidates links, info, and problem statements from several threads so I can close the various other issues and keep a record.

Main problem

In many cases, the standard errors for glmmTMB models are much too small. This is dangerous, and I have not been able to find a solution.

Therefore, I strongly recommend against reporting standard errors computed by marginaleffects with glmmTMB objects.

Minimal example:

library(marginaleffects)
library(glmmTMB)
dat <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/causaldata/thornton_hiv.csv")
m <- glmmTMB(got ~ age * hiv2004 + (1 | villnum), data = dat)

predict(m, se.fit = TRUE) |> data.frame() |> head(2)

            fit     se.fit
    1 0.6247427 0.07016562
    2 0.6177011 0.07035182

predictions(m, vcov = insight::get_varcov(m)) |> head(2)

     Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
        0.625    0.00839 74.5   <0.001 Inf 0.608  0.641
        0.618    0.00983 62.9   <0.001 Inf 0.598  0.637

Notes, process, and attempted solutions

Contributors who want to help should be familiar with how standard errors are computed in marginaleffects. See:
- https://marginaleffects.com/vignettes/uncertainty.html
REML=TRUE not supported
- https://github.com/vincentarelbundock/marginaleffects/issues/970
- The most relevant suggestion and explanation is here: https://github.com/vincentarelbundock/marginaleffects/issues/970#issuecomment-1840857440
- I attempted to implement a solution with newparams, but my attempt failed. The standard errors were clearly much too small in models with random effects.
- Useful info on glmmTMB: https://github.com/vincentarelbundock/marginaleffects/issues/810#issuecomment-1737610539
Standard errors are much too small with random effects
- Long discussion: https://github.com/vincentarelbundock/marginaleffects/issues/810
- Reproducible examples: https://github.com/vincentarelbundock/marginaleffects/issues/810#issuecomment-1740118019
- Proposed solution, but I think this only applies to predictions. So I'm not sure how to make this work with the other quantities of interest estimated by marginaleffects. https://github.com/vincentarelbundock/marginaleffects/issues/810#issuecomment-1752137929
- Relevant discussion: https://github.com/glmmTMB/glmmTMB/issues/915

vincentarelbundock commented 6 months ago

library(marginaleffects)
library(glmmTMB)
dat <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/causaldata/thornton_hiv.csv")
m <- glmmTMB(got ~ age * hiv2004 + (1 | villnum), data = dat)

get_se_manual <- function(m, random = FALSE) {
    s <- TMB::sdreport(m$obj, getJointPrecision = TRUE)
    Sigma <- solve(s$jointPrecision)
    if (isTRUE(random)) {
        XZ <- as.matrix(cbind(getME(m, "X"), getME(m, "Z")))
        keepvars <- c("beta", "b")
    } else {
        XZ <- as.matrix(getME(m, "X"))
        keepvars <- "beta"
    }
    keepvars <- rownames(Sigma) %in% keepvars
    Sigma <- Sigma[keepvars, keepvars]
    se <- sqrt(diag(XZ %*% (Sigma %*% t(XZ))))
    return(se)
}

se1 <- get_se_manual(m, random = TRUE)
se2 <- predict(m, se.fit = TRUE)$se.fit

V <- vcov(m)$cond
J <- model.matrix(m)
se3 <- sqrt(diag(M %*% (V %*% t(M))))
se4 <- get_se_manual(m, random = FALSE)

all.equal(unname(se1), unname(se2))
all.equal(unname(se3), unname(se4))

# mismatch
se5 <- predictions(m, vcov=vcov(m))$std.error

bwiernik commented 6 months ago

I'm happy to help diagnose and figure out these issues--these are two of my most-used packages, so I'd love for them to work well together

ASKurz commented 6 months ago

That would be an A+ RStats collaboration, right there.

bbolker commented 6 months ago

I will try to take a look at this shortly. Possibly already known/obvious, but the differences have to do with how the uncertainty in the conditional modes ("BLUPs"), and their covariances with fixed-effect and top-level random-effect (i.e. variance/covariance) parameters, are taken into account. In particular, AFAICT methods that are built on insight::get_varcov() take only the uncertainty in the fixed-effect parameters into account. Depending on the magnitude of the other uncertainties (in conditional modes and RE parameters), and the correlation between the other uncertainties and the FE uncertainties, one may get very different answers for the SEs. I will admit I tended to discount the initial example given in https://github.com/glmmTMB/glmmTMB/issues/915 because it is artificially small (i.e., there are only three levels of the RE grouping variable (Species), which further covary strongly with the FE predictor variable (Sepal.Length), so it's not surprising that the RE/FE covariances would have a big effect on the result. It's interesting that this issue also significantly affects SE estimates for a realistic example.

To my mind, a gold standard for concluding that a particular way of estimating SEs is flat-out wrong is to (1) specify exactly what components of uncertainty you intend to incorporate and (2) show that confidence intervals based on those SEs give incorrect coverage in a simulation example ...

If we can't agree on an appropriate definition it may be possible to add some flexibility to glmmTMB to allow SEs to be computed based on different subsets of the uncertainty, at user request.

Also see a long related thread/discussion in the context of lme4 here ...

strengejacke commented 6 months ago

Not sure if this helps, but I just saw that, no matter what kind of varcov we supply, the result doesn't change - or am I missing something? Might be a precision issue (i.e. vcov values too small anyway?)

library(glmmTMB)
library(marginaleffects)
dat <- datawizard::data_read("https://vincentarelbundock.github.io/Rdatasets/csv/causaldata/thornton_hiv.csv")
dat$villnum <- as.factor(dat$villnum)
m <- glmmTMB(got ~ age * hiv2004 + (1 | villnum), data = dat)
m2 <- lm(got ~ age * hiv2004, data = dat)

ms1 <- matrix(
  c(
    sigma(m), 0, 0, 0,
    0, sigma(m) + 1, 0, 0,
    0, 0, sigma(m) + 2, 0,
    0, 0, 0, sigma(m) + 3
  ),
  4
)

ms2 <- matrix(
  c(
    sigma(m), 0, 0, 0,
    0, sigma(m), 0, 0,
    0, 0, sigma(m), 0,
    0, 0, 0, sigma(m)
  ),
  4
)

ms3 <- matrix(
  c(
    sigma(m), 1, 0, 0,
    2, sigma(m) + 1, 0, 0,
    0, 3, sigma(m) + 2, 4,
    5, 6, 0, sigma(m) + 3
  ),
  4
)

predictions(m, vcov = insight::get_varcov(m)) |> head(2)
#> 
#>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
#>     0.625    0.00839 74.5   <0.001 Inf 0.608  0.641
#>     0.618    0.00983 62.9   <0.001 Inf 0.598  0.637
#> 
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, got, age, hiv2004, villnum 
#> Type:  response

predictions(m, vcov = insight::get_varcov(m) + ms1) |> head(2)
#> 
#>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
#>     0.625    0.00839 74.5   <0.001 Inf 0.608  0.641
#>     0.618    0.00983 62.9   <0.001 Inf 0.598  0.637
#> 
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, got, age, hiv2004, villnum 
#> Type:  response

predictions(m, vcov = insight::get_varcov(m) + ms2) |> head(2)
#> 
#>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
#>     0.625    0.00839 74.5   <0.001 Inf 0.608  0.641
#>     0.618    0.00983 62.9   <0.001 Inf 0.598  0.637
#> 
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, got, age, hiv2004, villnum 
#> Type:  response

predictions(m, vcov = insight::get_varcov(m) + ms3) |> head(2)
#> 
#>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
#>     0.625    0.00839 74.5   <0.001 Inf 0.608  0.641
#>     0.618    0.00983 62.9   <0.001 Inf 0.598  0.637
#> 
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, got, age, hiv2004, villnum 
#> Type:  response

predictions(m, vcov = insight::get_varcov(m2)) |> head(2)
#> 
#>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
#>     0.625    0.00839 74.5   <0.001 Inf 0.608  0.641
#>     0.618    0.00983 62.9   <0.001 Inf 0.598  0.637
#> 
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, got, age, hiv2004, villnum 
#> Type:  response

^{Created on 2024-03-29 with reprex v2.1.0}

strengejacke commented 6 months ago

@bbolker Would it be possible to modify get_varcov.glmmTMB() to return a variance-covariance matrix that takes top-level random-effect varcov into account? If I understand right, that would possibly help.

strengejacke commented 6 months ago

FWIW, for non-mixed models, computation of SEs looks good:

library(glmmTMB)
library(marginaleffects)
dat <- datawizard::data_read("https://vincentarelbundock.github.io/Rdatasets/csv/causaldata/thornton_hiv.csv")
dat$villnum <- as.factor(dat$villnum)
m <- glmmTMB(got ~ age * hiv2004, data = dat)

predict(m, se.fit = TRUE) |>
  data.frame() |>
  head(2)
#>         fit     se.fit
#> 1 0.6712538 0.01132587
#> 2 0.6639950 0.01261952

predictions(m, vcov = insight::get_varcov(m)) |> head(2)
#> 
#>  Estimate Std. Error    z Pr(>|z|)   S 2.5 % 97.5 %
#>     0.671     0.0113 59.3   <0.001 Inf 0.649  0.693
#>     0.664     0.0126 52.6   <0.001 Inf 0.639  0.689
#> 
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, got, age, hiv2004 
#> Type:  response

^{Created on 2024-03-29 with reprex v2.1.0}

vincentarelbundock commented 5 months ago

Thanks all for the encouragement and offers of help.

To get things started, I thought I would supply the most stripped down example possible of the way in which marginaleffects computes standard errors for predictions. As you will see, this works well for a glmmTMB with only fixed effects.

The idea is to create a J matrix where each column holds the (numerical) derivative of the quantity of interest (predictions, slopes, etc.) with respect to one of the coefficients in the model.

In this predictions example, we loop over coefficients, modify each coefficient by a small eps amount, and then make new predictions that we compare to baseline predictions:

library(glmmTMB)
model <- glmmTMB(mpg ~ hp + am, data = mtcars)

J <- matrix(NA, nrow = nrow(mtcars), ncol = length(fixef(model)))

eps <- 1e-5
pred <- predict(model, newdata = mtcars)

for (col in 1:ncol(J)) {
    model_new <- model

    # tweak coefficients
    model_new$fit$par[col] <- model_new$fit$par[col] + eps

    # make new predictions with predicted model
    pred_new <- predict(model_new, newdata = mtcars)

    # forward difference
    J[, col] <- (pred_new - pred) / eps
}

se1 <- sqrt(diag(J %*% vcov(model)$cond %*% t(J)))
se2 <- predict(model, se.fit = TRUE)$se.fit

head(data.frame(se1, se2))

            se1       se2
    1 0.7783791 0.7783791
    2 0.7783791 0.7783791
    3 0.8087538 0.8087538
    4 0.7382497 0.7382497
    5 0.6448505 0.6448505
    6 0.7579769 0.7579769

bbolker commented 5 months ago

It would be interesting to compare the lmer and glmmTMB versions of this, for a model that does have fixed effects. I want to dig around in the code and see what options are being specified to predict.merMod in the analogous code ... I can imagine that there is a hack involving map (i.e., to prevent BLUPs from being re-estimated with the modified parameters) that may help ...

vincentarelbundock commented 5 months ago

For lme4, we can use the same strategy as above by tweaking the entries in the @beta slot one after the other. This allows us to replicate results from predict.merMod with the re.form=NA argument.

library(lme4)
library(glmmTMB)
library(marginaleffects)

l <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

predict(l, se.fit = TRUE, re.form = NA)$se.fit |> head()
           1        2        3        4        5        6 
    6.824597 6.786929 7.094267 7.705438 8.555575 9.581277 

predictions(l)$std.error |> head()
    [1] 6.824597 6.786929 7.094267 7.705438 8.555575 9.581277

The same J matrix strategy seems to produce SEs that are too small for glmmTMB:

g <- glmmTMB(Reaction ~ Days + (Days | Subject), sleepstudy)

predict(g, se.fit = TRUE, re.form = NA)$se.fit |> head()
    eta_predict eta_predict eta_predict eta_predict eta_predict eta_predict 
       6.632171    6.595551    6.894211    7.488140    8.314301    9.311079 

predictions(g, vcov = vcov(g))$std.error |> head()
    [1] 5.1584464 4.1879661 3.2301607 2.3009761 1.4562492 0.9551621

bbolker commented 5 months ago

As suspected, setting the map element to prevent the b vector from being re-optimized with the altered coefficients can work. Notes:

there is some screwy stuff going on with mutable state of the TMB objects, which needs a little more thought and testing; we must need to copy the fitted b vector from somewhere?
it might be worth a longer conversation about when you use package-provided methods and when you use marginaleffects:
- for responses that are actually linear, the finite-difference calculation done here is an expensive way of re-deriving the model matrix (you probably knew that ...)
passing newparams to predict does work, if we use map appropriately
there's a longer conversation to be had about what the standard errors condition on (the $\sqrt{\mathrm{diag}(X V X^\top)}$ approach disregards uncertainty in the the $b$ and $\theta$ values ...

library(glmmTMB)
library(emulator)

pred_ex <- function(model, data, map_b = TRUE) {
    J <- matrix(NA, nrow = nrow(data), ncol = length(fixef(model)$cond))
    eps <- 1e-5
    pred <- predict(model, newdata = data)

    if (map_b) {
        b_vec <- model$obj$env$parList()$b
        if (length(b_vec)>0) {
            model$modelInfo$map[["b"]] <- factor(rep(NA,length(b_vec)))
        }
    }

    for (col in 1:ncol(J)) {
        model_new <- model

        model
        ## tweak coefficients
        model_new$fit$par[col] <- model_new$fit$par[col] + eps

        ## make new predictions with predicted model
        pred_new <- predict(model_new, newdata = mtcars)

        ## forward difference
        J[, col] <- (pred_new - pred) / eps
    }

    se1 <- sqrt(diag(J %*% vcov(model)$cond %*% t(J)))

    return(se1)
}

pred_ex2 <- function(model, data) {
    J <- matrix(NA, nrow = nrow(data), ncol = length(fixef(model)$cond))
    eps <- 1e-5
    pred <- predict(model, newdata = data)

    b_vec <- model$obj$env$parList()$b
    if (length(b_vec)>0) {
        model$modelInfo$map$b <- factor(rep(NA,length(b_vec)))
    }

    pars <- model$fit$par

    for (col in 1:ncol(J)) {
        pars_new <- pars
        pars_new[col] <- pars_new[col] + eps

        ## make new predictions with predicted model
        pred_new <- predict(model, newdata = mtcars, newparams = pars_new)

        ## forward difference
        J[, col] <- (pred_new - pred) / eps
    }

    se1 <- sqrt(diag(J %*% vcov(model)$cond %*% t(J)))

    return(se1)
}

m1 <- glmmTMB(mpg ~ hp + am + (1|cyl), data = mtcars)
head(pred_ex(m1, mtcars)) ## BAD if run immediately
xx <- predict(m1, se.fit = TRUE)
head(pred_ex(m1, mtcars)) ## WORKS if run after predict(., se.fit = TRUE)
## (b values need to be copied from somewhere?)

X <- model.matrix(m1)
V <- vcov(m1)$cond

m <- cbind(
    ME1 = pred_ex(m1, mtcars),
    ME2 = pred_ex2(m1, mtcars),
    glmmTMB = unname(predict(m1, se.fit = TRUE)$se.fit),
    quadform = sqrt(emulator::quad.tdiag(V, X))
)
head(m)
pairs(m, gap = FALSE)

bbolker commented 5 months ago

Any thoughts on this? Digging in a bit to why this is sensitive to calling predict(m1, se.fit = TRUE), which is a bit of a rabbit hole ...

vincentarelbundock commented 5 months ago

Sorry for the delay. Crazy week.

Thanks @bbolker it looks overwriting $map$b does the trick! See example below.

Using newparams was very easy to implement. Thanks for the suggestion and investigation!

Example with draft PR https://github.com/vincentarelbundock/marginaleffects/pull/1078

remotes::install_github("vincentarelbundock/marginaleffects@issue1064")

library(glmmTMB)
library(marginaleffects)

m <- glmmTMB(mpg ~ hp + am + (1|cyl), data = mtcars)
p1 <- predictions(m, newdata = mtcars, re.form = NA)
p2 <- predict(m, se.fit = TRUE, re.form = NA)

tinytest::expect_equivalent(p1$estimate, p2$fit)

    ----- PASSED      : <-->
     call| tinytest::expect_equivalent(p1$estimate, p2$fit) 

tinytest::expect_equivalent(p1$std.error, p2$se.fit)

    ----- PASSED      : <-->
     call| tinytest::expect_equivalent(p1$std.error, p2$se.fit)

there's a longer conversation to be had about what the standard errors condition on (the $\sqrt{\mathrm{diag}(X V X^\top)}$ approach disregards uncertainty in the the $b$ and $\theta$ values ...

Here you mean that this is equivalent to re.form=NA, correct?

for responses that are actually linear, the finite-difference calculation done here is an expensive way of re-deriving the model matrix (you probably knew that …)

Yes, of course. But the set of quantities where J==model.matrix(model) trivially holds is a very small subset of all the quantities computed by marginaleffects (think: slopes, complex contrasts, average predictions by subgroup in GAMs, etc.). Currently, everything uses the same machinery, which simplifies the code base a lot. But you’re right that there are major performance gains available for a few quantities of interest in linear models.

bbolker commented 5 months ago

Here you mean that this is equivalent to re.form=NA, correct?

Yes. There are lots of "solutions" out there to computing standard errors on predictions, including the examples in the GLMM FAQ, that include only the fixed-effect uncertainty because until recently that was all that was available ... As long as these SEs are explicitly for population-level/re.form=NA predictions/effects, I have no problem with this.

Your last point about computing J makes sense (and is about what I expected). (Specifically in TMB-based models, or other models using automatic differentiation, a wider class of SEs can be computed efficiently via a delta-method approximation using AD gradients ...)

I'm still nervous about the path-dependence of getting correct predictions (i.e. the need to call predict(..., se.fit = TRUE) first in the example above) — arguably that's a problem for glmmTMB not for you.

PS Now that I'm warmed up I could have a hack at making the REML=TRUE case (ideally by making predict.glmmTMB smarter, which is something that should be done anyway ... that is, the map-fixing stuff implemented here should really be the default in glmmTMB)

vincentarelbundock commented 5 months ago

REML would definitely be cool.

What's your view on the level of "alarm" I should raise about re.form=NA? I'm afraid this might get lost in the documentation. Would it be appropriate to raise a warning once per session to inform users that every CI only takes fixed effect uncertainty into account? Users could silence it by using the re.form argument explicitly.

bbolker commented 5 months ago

Hmm. If marginaleffects only/always does population-level prediction, and that's already reasonably clear to users, then I think the X V X^T calculation is proper and appropriate and no warning is needed. If it sometimes does group-level prediction, then at least a once-per-session warning, and possibly a message every time, would be appropriate. (In that case I'm not sure how re.form = NA makes sense as a silencing mechanism ...)

vincentarelbundock commented 5 months ago

Thanks, that's helpful.

We can do group-level predictions with brms, for example. I just don't have a good way to document differences for every single package, so raising a warning may the only way people are going to read this.

The idea is that if a user sets re.form=NA explicitly, then we don't need to warn them about this.

vincentarelbundock commented 5 months ago

Thanks to everyone for your insights, and especially to @bbolker. The current version on Github should be improved.