drizopoulos / GLMMadaptive

GLMMs with adaptive Gaussian quadrature
https://drizopoulos.github.io/GLMMadaptive/

post_vars Matrix Names #54

Closed: olbeck closed this issue 5 months ago

olbeck commented 5 months ago

I am running a zero-inflated mixed model and am interested in the estimated covariance matrix of the random effects, as well as the posterior variances/covariances of the random effects for both the zero and count parts. However, the items inside the post_vars and D elements of the model output are unnamed. I have made a reproducible example that closely follows the example in the GLMMadaptive article here.

In this example, I assign letters as the subject identifiers used for the random-effects grouping (not necessarily in alphabetical order).

library(GLMMadaptive)
set.seed(1234)
n <- 10 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 5 # maximum follow-up time

# assign letters to subjects 
subject.letters <- sample(LETTERS, n, replace = FALSE)
# subject.letters
# > [1] "P" "V" "E" "L" "O" "I" "X" "F" "Z" "D"

# we construct a data frame with the design: 
# everyone has a baseline measurement, and then measurements at random follow-up times
DF <- data.frame(id = rep(seq_len(n), each = K),
                 letter = rep(subject.letters, each = K),
                 time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
                 sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))

# design matrices for the fixed and random effects non-zero part
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ 1, data = DF)
# design matrices for the fixed and random effects zero part
X_zi <- model.matrix(~ sex, data = DF)
Z_zi <- model.matrix(~ 1, data = DF)

betas <- c(1.5, 0.05, 0.05, -0.03) # fixed effects coefficients non-zero part
shape <- 2 # shape/size parameter of the negative binomial distribution
gammas <- c(-1.5, 0.5) # fixed effects coefficients zero part
D11 <- 0.5 # variance of random intercepts non-zero part
D22 <- 0.4 # variance of random intercepts zero part

# we simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor non-zero part
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF$id, 1, drop = FALSE]))
# linear predictor zero part
eta_zi <- as.vector(X_zi %*% gammas + rowSums(Z_zi * b[DF$id, 2, drop = FALSE]))
# we simulate negative binomial longitudinal data
DF$y <- rnbinom(n * K, size = shape, mu = exp(eta_y))
# we set the extra zeros
DF$y[as.logical(rbinom(n * K, size = 1, prob = plogis(eta_zi)))] <- 0

fm1 <- mixed_model(fixed = y ~ sex * time, 
                   random = ~ 1 | letter, 
                   zi_fixed = ~ sex,
                   zi_random = ~ 1 | letter,
                   data = DF,
                   family = zi.poisson())

# model output
fm1

# empirical Bayes estimates of the random effects
fm1$post_modes

# posterior variances of the random effects
fm1$post_vars

# estimated covariance matrix of the random effects
fm1$D

Here is the output for fm1$post_modes:

> fm1$post_modes
  (Intercept) zi_(Intercept)
D  0.07615018     1.39299475
E  0.30849425     0.32795906
F  0.09570052     0.52672233
I -0.03957913     0.06637528
L -0.06851107     0.72911426
O -0.04993933    -0.18837267
P -0.02981896     0.31556031
V -0.07792803    -0.23217774
X -0.08021746    -0.66122563
Z -0.03100355    -1.33117900

Output for fm1$post_vars:

> fm1$post_vars
$`1:10`
               (Intercept) zi_(Intercept)
(Intercept)     0.03686052      0.0322533
zi_(Intercept)  0.03225330      0.5452850

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.01896314     0.02129061
zi_(Intercept)  0.02129061     0.58522565

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02672958     0.01769086
zi_(Intercept)  0.01769086     0.39954507

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02500208     0.01649307
zi_(Intercept)  0.01649307     0.36862519

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02885761     0.05326237
zi_(Intercept)  0.05326237     0.70739531

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02314635     0.04003159
zi_(Intercept)  0.04003159     0.82575753

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02573534     0.04874942
zi_(Intercept)  0.04874942     0.77908042

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02330782     0.04109256
zi_(Intercept)  0.04109256     0.84424244

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)     0.02153340     0.01530533
zi_(Intercept)  0.01530533     0.37018834

$<NA>
               (Intercept) zi_(Intercept)
(Intercept)      0.0172394      0.0117271
zi_(Intercept)   0.0117271      0.4073225

And output for fm1$D:

              (Intercept) zi_(Intercept)
(Intercept)     0.03844453     0.06686024
zi_(Intercept)  0.06686024     1.18660506
attr(,"L")
[1] 0.1960728 0.3409971 1.0345656

In this model the random-effects group names are "P", "V", "E", "L", "O", "I", "X", "F", "Z", "D", and they appear in fm1$post_modes in alphabetical order. However, fm1$post_vars contains ten matrices, with 1:10 as the name of the first matrix and NA as the names of the remaining matrices, so I cannot tell which matrix refers to which random effect. Are they in order of appearance in DF? Or in alphabetical order?

Additionally, in the output of fm1$D, what do the values of

attr(,"L")
[1] 0.1960728 0.3409971 1.0345656

refer to? I can calculate that sqrt(0.03844453) = 0.1960728 and sqrt(1.18660506) = 1.0345656, which makes me assume the first and third terms are the standard deviations of the counts and zeros random effects, respectively. But what is the 0.3409971 value? Is it the estimated correlation coefficient between the counts and zeros random effects? If so, why does 0.06686024 appear on the off-diagonal of fm1$D, given that 0.1960728 * 0.3409971 = 0.06686024? Using the reference from the GLMMadaptive Basics page here, I had assumed the model's covariance structure for the random effects would be

$$\begin{bmatrix} \sigma^2_{i,\text{counts}} & \rho\, \sigma_{i,\text{counts}}\, \sigma_{i,\text{zeros}} \\ \rho\, \sigma_{i,\text{counts}}\, \sigma_{i,\text{zeros}} & \sigma^2_{i,\text{zeros}} \end{bmatrix}$$

Is this a correct assumption for the structure of the covariance matrix when using the random and zi_random arguments?

Any clarification on the naming inside the fm1$post_vars and fm1$D outputs would be very helpful! Thank you!

drizopoulos commented 5 months ago

fm1$post_vars contains the posterior variance-covariance matrices in the same order as the rows of fm1$post_modes. I will assign the correct names to it to make this clear. Thanks for reporting.
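Until that change lands, a minimal workaround sketch (my own code, not from the package) is to reuse the row names of fm1$post_modes, since the list elements follow the same subject order:

# copy the subject names from post_modes onto the unnamed list of matrices
names(fm1$post_vars) <- rownames(fm1$post_modes)
fm1$post_vars[["P"]]  # posterior covariance matrix for subject "P"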

The attr(, "L") in fm1$D denotes the elements of the Choleski factor of the prior covariance matrix of the random effects.
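As an illustration of that relationship (my own check, assuming the attribute stores the non-zero entries of the upper-triangular factor returned by chol()), the three printed values reconstruct fm1$D, which is why 0.1960728 * 0.3409971 equals the off-diagonal element 0.06686024 even though 0.3409971 is not itself the correlation coefficient:

U <- chol(fm1$D)              # upper-triangular factor, so crossprod(U) equals fm1$D
U[upper.tri(U, diag = TRUE)]  # approx. 0.1960728 0.3409971 1.0345656, matching attr(fm1$D, "L")
crossprod(U)                  # t(U) %*% U reproduces fm1$D
cov2cor(fm1$D)                # the implied correlation between the two random intercepts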