easystats / modelbased

:chart_with_upwards_trend: Estimate effects, contrasts and means based on statistical models
https://easystats.github.io/modelbased/
GNU General Public License v3.0
234 stars 19 forks source link

Support for glmmTMB count component conditioned on the zi-component #70

Open strengejacke opened 4 years ago

strengejacke commented 4 years ago

One thing that is rather difficult is the "overal" estimated means from a model, i.e. the count component conditioned on the zi-component:

library(glmmTMB)
library(modelbased)
library(ggeffects)

model <- glmmTMB(
  count ~ mined + (1 | site),
  ziformula = ~mined,
  family = poisson,
  data = Salamanders
)

# count model
ggpredict(model, "mined")
#> 
#> # Predicted counts of count
#> # x = mined
#> 
#> x   | Predicted |   SE |       95% CI
#> -------------------------------------
#> yes |      1.09 | 0.23 | [0.69, 1.72]
#> no  |      3.42 | 0.09 | [2.86, 4.09]
#> 
#> Adjusted for:
#> * site = NA (population-level)
#> Standard errors are on link-scale (untransformed).
estimate_means(model)
#> mined | rate |   SE |       95% CI
#> ----------------------------------
#> no    | 3.42 | 0.31 | [2.86, 4.09]
#> yes   | 1.09 | 0.25 | [0.69, 1.72]

# zero-inflated model
ggpredict(model, "mined", type = "zi_prob")
#> 
#> # Predicted zero-inflation probabilities of count
#> # x = mined
#> 
#> x   | Predicted |   SE |       95% CI
#> -------------------------------------
#> yes |      0.76 | 0.24 | [0.66, 0.83]
#> no  |      0.36 | 0.12 | [0.30, 0.41]
#> 
#> Adjusted for:
#> * site = NA (population-level)
#> Standard errors are on link-scale (untransformed).
estimate_means(model, component = "zi")
#> mined | Mean |   SE |       95% CI
#> ----------------------------------
#> no    | 0.36 | 0.03 | [0.30, 0.41]
#> yes   | 0.76 | 0.04 | [0.66, 0.83]

# model conditioning both on count- and zero-inflated model
ggpredict(model, "mined", type = "zero_inflated")
#> 
#> # Predicted counts of count
#> # x = mined
#> 
#> x   | Predicted |   SE |       95% CI
#> -------------------------------------
#> yes |      0.26 | 0.08 | [0.11, 0.42]
#> no  |      2.21 | 0.22 | [1.75, 2.66]
#> 
#> Adjusted for:
#> * site = NA (population-level)
#> Standard errors are on link-scale (untransformed).

Created on 2020-06-22 by the reprex package (v0.3.0)

If model is of class glmmTMB, hurdle, zeroinfl or zerotrunc, simulations from a multivariate normal distribution (see ?MASS::mvrnorm) are drawn to calculate mu*(1-p). Confidence intervals are then based on quantiles of these results. For type = "re.zi", prediction intervals also take the uncertainty in the random-effect paramters into account (see also Brooks et al. 2017, pp.391-392 for details).

Brooks ME, Kristensen K, Benthem KJ van, Magnusson A, Berg CW, Nielsen A, et al. glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal. 2017;9: 378-400.

Originally posted by @strengejacke in https://github.com/easystats/modelbased/issues/66#issuecomment-647532231

DominiqueMakowski commented 4 years ago

What would be necessary to support it correctly? Does emmeans deals with it alright?

strengejacke commented 4 years ago

No, I don't think this is supported by emmeans yet (resp. glmmTMB, which implements emmeans functionality).

strengejacke commented 4 years ago

It's not that hard, I'd say...

1) simulate predictions: prdat.sim <- .simulate_predictions(model, newdata, nsim, terms, value_adjustment, condition) (https://github.com/strengejacke/ggeffects/blob/f5f3056103940ecb53b8a44496e1804b29de7d39/R/predict_zero_inflation.R#L161) 2) calculate predictions for "overall" model: sims <- exp(prdat.sim$cond) * (1 - stats::plogis(prdat.sim$zi)) 3) compute quantiles (SD, 95% quantiles for CI...): prediction_data <- .join_simulations(data_grid, newdata, prdat, sims, ci, cleaned_terms) (https://github.com/strengejacke/ggeffects/blob/f5f3056103940ecb53b8a44496e1804b29de7d39/R/predict_zero_inflation.R#L2).

The code I used was "evolving" over time, so could possibly be more clean, but it is stable and works with edge cases ;-)

strengejacke commented 4 years ago

What you see under 2), is what you already get. However, the CIs/SEs are wrong, that's why all the simulation stuff is done here.

DominiqueMakowski commented 3 years ago

is this issue still valid? Does it have anything to do with get_predicted?

strengejacke commented 3 years ago

Yes, if you want to have the "correct" CI, we still haven't addressed this, I think.