strengejacke / sjPlot

sjPlot - Data Visualization for Statistics in Social Science
https://strengejacke.github.io/sjPlot
603 stars 91 forks source link

glitches in glmmTMB methods #818

Open bbolker opened 2 years ago

bbolker commented 2 years ago

See here

MRE for the first two cases:

library(glmmTMB)
library(sjPlot)
set.seed(101)
## rbeta() function parameterized by mean and shape
my_rbeta <- function(n, mu, shape0) {
  rbeta(n, shape1 = mu*shape0, shape2 = (1-mu)*shape0)
}
n <- 100; ng <- 10
dd <- data.frame(x = rnorm(n),
                 f = factor(rep(1:(n/ng), ng)))
dd <- transform(dd,
                y = my_rbeta(n,
                             mu = plogis(-1 + 2*x + rnorm(ng)[f]),
                             shape0 = 5))

m1 <- glmmTMB(y ~ x + (1|f), family = "beta_family", dd)
tab_model(m1)

Screenshot from 2022-04-19 17-13-24

strengejacke commented 2 years ago

sjPlot uses parameters::model_parameters() to extract coefficients. That included a related bug, which I fixed:

library(glmmTMB)
library(parameters)

set.seed(101)
## rbeta() function parameterized by mean and shape
my_rbeta <- function(n, mu, shape0) {
  rbeta(n, shape1 = mu * shape0, shape2 = (1 - mu) * shape0)
}
n <- 100
ng <- 10
dd <- data.frame(x = rnorm(n), f = factor(rep(1:(n / ng), ng)))
dd <- transform(dd, y = my_rbeta(n, mu = plogis(-1 + 2 * x + rnorm(ng)[f]), shape0 = 5))

m1 <- glmmTMB(y ~ x + (1|f), family = "beta_family", dd)

model_parameters(m1, exponentiate = TRUE)
#> # Fixed Effects
#> 
#> Parameter   | Coefficient |   SE |       95% CI |     z |      p
#> ----------------------------------------------------------------
#> (Intercept) |        0.49 | 0.18 | [0.24, 1.02] | -1.89 | 0.058 
#> x           |        6.76 | 0.88 | [5.24, 8.72] | 14.71 | < .001
#> 
#> # Random Effects
#> 
#> Parameter         | Coefficient |       95% CI
#> ----------------------------------------------
#> SD (Intercept: f) |        1.15 | [0.71, 1.84]
#> SD (Residual)     |        5.56 | [4.07, 7.61]
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed using a
#>   Wald z-distribution approximation.

model_parameters(m1, exponentiate = FALSE)
#> # Fixed Effects
#> 
#> Parameter   | Coefficient |   SE |        95% CI |     z |      p
#> -----------------------------------------------------------------
#> (Intercept) |       -0.71 | 0.37 | [-1.44, 0.02] | -1.89 | 0.058 
#> x           |        1.91 | 0.13 | [ 1.66, 2.17] | 14.71 | < .001
#> 
#> # Random Effects
#> 
#> Parameter         | Coefficient |       95% CI
#> ----------------------------------------------
#> SD (Intercept: f) |        1.15 | [0.71, 1.84]
#> SD (Residual)     |        5.56 | [4.07, 7.61]
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed using a
#>   Wald z-distribution approximation.

However, the creation of the HTML table in tab_model() works a bit different, so the initial issues you posted here are not resolved yet.

A bit confusing is that in glmmTMB(), sometimes sigma() refers to the dispersion parameter and sometimes fixef()$disp - and these are not necessarily identical (I think they differ when the model has a dispersion-formula as well), So what we did in model_parameters() is to check whether sigma and fixef()$disp are identical, and if so, remove dispersion. Else, if not identical, the models seems to have a dispersion component, not only a dispersion parameter - and then the additional model component is shown in the print(). I'm not quite sure why this is not happening here with tab_model(), as the data frame returned by model_parameters() does not contain the row with the dispersion parameter. Will look into it.

bbolker commented 6 months ago

Coming back to this after someone upvoted the question that provoked it in the first place. Is there anything else that can be done on the glmmTMB side?

strengejacke commented 6 months ago

What would be an appropriate name for the dispersion parameters when the dispformula is not specified? Currently, it's (Intercept).

library(glmmTMB)
library(sjPlot)
#> Learn more about sjPlot with 'browseVignettes("sjPlot")'.
set.seed(101)
## rbeta() function parameterized by mean and shape
my_rbeta <- function(n, mu, shape0) {
  rbeta(n, shape1 = mu * shape0, shape2 = (1 - mu) * shape0)
}
n <- 100
ng <- 10
dd <- data.frame(
  x = rnorm(n),
  f = factor(rep(1:(n / ng), ng))
)
dd <- transform(dd,
  y = my_rbeta(n,
    mu = plogis(-1 + 2 * x + rnorm(ng)[f]),
    shape0 = 5
  )
)

m1 <- glmmTMB(y ~ x + (1 | f), family = "beta_family", dd)
m2 <- glmmTMB(y ~ x + (1 | f), dispformula = ~x, family = "beta_family", dd)

parameters::model_parameters(m1)
#> # Fixed Effects
#> 
#> Parameter   | Coefficient |   SE |        95% CI |     z |      p
#> -----------------------------------------------------------------
#> (Intercept) |       -0.71 | 0.37 | [-1.44, 0.02] | -1.89 | 0.058 
#> x           |        1.91 | 0.13 | [ 1.66, 2.17] | 14.71 | < .001
#> 
#> # Dispersion
#> 
#> Parameter   | Coefficient |       95% CI
#> ----------------------------------------
#> (Intercept) |        5.56 | [4.07, 7.61]
#> 
#> # Random Effects Variances
#> 
#> Parameter         | Coefficient |       95% CI
#> ----------------------------------------------
#> SD (Intercept: f) |        1.15 | [0.71, 1.84]
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald z-distribution approximation.
#> 
#> The model has a log- or logit-link. Consider using `exponentiate =
#>   TRUE` to interpret coefficients as ratios.

parameters::model_parameters(m2)
#> # Fixed Effects
#> 
#> Parameter   | Coefficient |   SE |        95% CI |     z |      p
#> -----------------------------------------------------------------
#> (Intercept) |       -0.68 | 0.38 | [-1.42, 0.06] | -1.80 | 0.071 
#> x           |        1.88 | 0.14 | [ 1.61, 2.15] | 13.75 | < .001
#> 
#> # Dispersion
#> 
#> Parameter   | Coefficient |   SE |        95% CI |     z |      p
#> -----------------------------------------------------------------
#> (Intercept) |        1.71 | 0.16 | [ 1.39, 2.02] | 10.63 | < .001
#> x           |        0.11 | 0.14 | [-0.16, 0.38] |  0.77 | 0.444 
#> 
#> # Random Effects Variances
#> 
#> Parameter         | Coefficient |       95% CI
#> ----------------------------------------------
#> SD (Intercept: f) |        1.15 | [0.72, 1.84]
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald z-distribution approximation.

summary(m1)
#>  Family: beta  ( logit )
#> Formula:          y ~ x + (1 | f)
#> Data: dd
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   -429.5   -419.1    218.7   -437.5       96 
#> 
#> Random effects:
#> 
#> Conditional model:
#>  Groups Name        Variance Std.Dev.
#>  f      (Intercept) 1.312    1.145   
#> Number of obs: 100, groups:  f, 10
#> 
#> Dispersion parameter for beta family (): 5.56 
#> 
#> Conditional model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  -0.7078     0.3735  -1.895   0.0581 .  
#> x             1.9108     0.1299  14.707   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(m2)
#>  Family: beta  ( logit )
#> Formula:          y ~ x + (1 | f)
#> Dispersion:         ~x
#> Data: dd
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   -428.1   -415.0    219.0   -438.1       95 
#> 
#> Random effects:
#> 
#> Conditional model:
#>  Groups Name        Variance Std.Dev.
#>  f      (Intercept) 1.316    1.147   
#> Number of obs: 100, groups:  f, 10
#> 
#> Conditional model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  -0.6782     0.3762  -1.803   0.0714 .  
#> x             1.8794     0.1367  13.753   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Dispersion model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)   1.7095     0.1608  10.633   <2e-16 ***
#> x             0.1054     0.1376   0.766    0.444    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Created on 2024-01-08 with reprex v2.0.2

bbolker commented 6 months ago

"(Intercept)" seems fine to me ... ??