easystats / insight

:crystal_ball: Easy access to model information for various model objects
https://easystats.github.io/insight/
GNU General Public License v3.0

`get_predicted` confidence intervals in mixed-effects models #677

Open vincentarelbundock opened 1 year ago

vincentarelbundock commented 1 year ago

At the moment, confidence intervals around predicted values for lmerMod only take into account the fixed effects.

For example, in a model with Chick-level random intercepts and coefficients, each Chick will have intervals of the exact same width at every time point:

library(lme4)
library(insight)

# Random intercept and random Time slope for each Chick
fit1 <- lmer(
  weight ~ 1 + Time + (1 + Time | Chick),
  data = ChickWeight)

# Tabulate the interval widths across chicks at Time == 4
get_predicted(fit1, ci = .95) |>
    cbind(ChickWeight) |>
    within({CI_width = CI_high - CI_low}) |>
    subset(Time == 4, select = "CI_width") |>
    table()
#> CI_width
#>  4.1810057846375 4.18100578463751 
#>                5               44

I was chatting with @ASKurz and we were wondering if this is what most users would expect/want. In particular, it seems like many would want the random components to be accounted for in the computation. Should we supply CIs at all, issue a warning, or is the status quo just fine?

Curious what everyone thinks (and maybe especially @bwiernik )

ASKurz commented 1 year ago

Speaking for myself, the current behavior is not what I would have hoped for and I like the options Vincent suggested.

bwiernik commented 1 year ago

Definitely would prefer for the uncertainty due to random effects to be included. But I think this might be a limitation of lme4? I think maybe if we did bootstrap intervals it might work?

bwiernik commented 1 year ago

Because of the clever linear algebra tricks that lme4 uses, I don't think it's feasible to get correlations between fixed-effect and random-effect estimates in this package. (They could be obtained in glmmTMB, but currently aren't provided because they are computationally expensive.)

We should probably add a message saying that these default intervals don't include uncertainty due to random effects. And then we can recommend using ci_method = "boot" as an alternative?

ASKurz commented 1 year ago

How computationally expensive are we talking in the glmmTMB case?

strengejacke commented 1 year ago

One possible way is following Ben's suggestion of adding sigma/random effect variances before computing SE for predictions:

https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#predictions-andor-confidence-or-prediction-intervals-on-predictions
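
For concreteness, here is a rough sketch of that variance-adding idea applied to the ChickWeight model from the top of the thread. It is my adaptation to a random-slope model, not code taken from the FAQ: the `newdata` grid and the variable names are made up for illustration, and the approach ignores both the covariance between fixed- and random-effect estimates and the uncertainty in the variance parameters themselves.

```r
library(lme4)

fit1 <- lmer(
  weight ~ 1 + Time + (1 + Time | Chick),
  data = ChickWeight)

# Hypothetical grid of time points to predict at
newdata <- data.frame(Time = c(0, 4, 10))

# Fixed-effects design matrix and population-level predictions
X <- model.matrix(~ 1 + Time, newdata)
pred <- as.vector(X %*% fixef(fit1))

# Variance from the fixed-effect estimates only: diag(X V X')
V <- as.matrix(vcov(fit1))
var_fixed <- diag(X %*% tcrossprod(V, X))

# Variance from the Chick-level intercept and slope: diag(Z Sigma Z')
# Z equals X here because the random part has the same terms as the fixed part
Sigma <- as.matrix(VarCorr(fit1)$Chick)
var_random <- diag(X %*% tcrossprod(Sigma, X))

se_fixed <- sqrt(var_fixed)
se_total <- sqrt(var_fixed + var_random)
# For a prediction interval, also add the residual variance sigma(fit1)^2

data.frame(newdata, pred, se_fixed, se_total)
```

Because the random slope is part of `Sigma`, the added variance grows with Time, so the intervals are no longer the same width at every time point.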

strengejacke commented 1 year ago

Or we look at merTools::predictInterval(), but I think it's limited to few families only?

vincentarelbundock commented 1 year ago

Re: bootstrap

I played around with some ideas and didn't come out reassured. My guess is that a good handling of the nested structure of the data will require something fancy. We probably don't want to give the user the impression that a plain-vanilla approach will give great results by giving them a ci_method.

Re: Bolker

He sounds lukewarm about all of these solutions, and ends up recommending Bayes. Seems like a lot of work to implement for a suboptimal result... might be best to just remove CIs or issue a warning.

Re: predictInterval

Those are prediction intervals, not confidence, right?

bwiernik commented 1 year ago

@ASKurz see https://github.com/glmmTMB/glmmTMB/issues/691

strengejacke commented 1 year ago

> Re: bootstrap
>
> I played around with some ideas and didn't come out reassured. My guess is that a good handling of the nested structure of the data will require something fancy. We probably don't want to give the user the impression that a plain-vanilla approach will give great results by giving them a ci_method.

I think lme4::bootMer() takes RE structures into account.

> Re: Bolker
>
> He sounds lukewarm about all of these solutions, and ends up recommending Bayes. Seems like a lot of work to implement for a suboptimal result... might be best to just remove CIs or issue a warning.
>
> Re: predictInterval
>
> Those are prediction intervals, not confidence, right?

Ok, you were thinking about some bias adjustment?

mattansb commented 1 year ago

> I think lme4::bootMer() takes RE structures into account.

use.u = FALSE and type = "parametric" are the defaults.

bwiernik commented 1 year ago

Yep, you're right. lme4::bootMer() should give what we are looking for
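
A minimal sketch of what that could look like for the ChickWeight example, assuming percentile intervals are acceptable. The `newdata` subset, the `pred_fun` helper, and `nsim = 100` are illustrative choices only (a real analysis would want many more replicates), and whether `use.u = FALSE` or `use.u = TRUE` is the better choice for chick-specific predictions is not something this settles.

```r
library(lme4)

fit1 <- lmer(
  weight ~ 1 + Time + (1 + Time | Chick),
  data = ChickWeight)

# Hypothetical target: the observed rows at Time == 4
newdata <- subset(ChickWeight, Time == 4)

# Statistic recomputed on every refit; re.form = NULL keeps the Chick-level terms
pred_fun <- function(m) predict(m, newdata = newdata, re.form = NULL)

# Parametric bootstrap; with use.u = FALSE the random effects are
# re-simulated in every replicate before the model is refitted
boot <- bootMer(fit1, FUN = pred_fun, nsim = 100, seed = 123,
                use.u = FALSE, type = "parametric")

# Percentile intervals: one row per prediction, columns are lower and upper bounds
ci <- t(apply(boot$t, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE))
ci_width <- ci[, 2] - ci[, 1]

summary(ci_width)
```

Some refits may emit singular-fit or convergence warnings with this random-slope structure; those replicates are still included in the quantiles above.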

bbolker commented 9 months ago

See https://github.com/lme4/lme4/issues/739