drizopoulos / GLMMadaptive

GLMMs with adaptive Gaussian quadrature
https://drizopoulos.github.io/GLMMadaptive/
59 stars 14 forks source link

Confusion about the values returned by vcov(MixMod) #49

Closed shafayetShafee closed 1 year ago

shafayetShafee commented 1 year ago

Consider the following reprex from the package article Methods for MixMod Objects

library(GLMMadaptive)

set.seed(1234)
n <- 100 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 15 # maximum follow-up time

# we construct a data frame with the design: 
# everyone has a baseline measurement, and then measurements at random follow-up times
DF <- data.frame(id = rep(seq_len(n), each = K),
                 time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
                 sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))

# design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)

betas <- c(-2.13, -0.25, 0.24, -0.05) # fixed effects coefficients
D11 <- 0.48 # variance of random intercepts
D22 <- 0.1 # variance of random slopes

# we simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF$id, ]))
# we simulate binary longitudinal data
DF$y <- rbinom(n * K, 1, plogis(eta_y))

fm <- mixed_model(fixed = y ~ sex * time, random = ~ time | id, data = DF,
                  family = binomial())

Now the estimated variance-covariance matrix of the maximum likelihood estimates of random effects is returned using the vcov() method,

vcov(fm, parm = "var-cov")

#>             D_11        D_12       D_22
#> D_11  0.42942062 -0.09963969  0.5065884
#> D_12 -0.09963969  0.03701847 -0.2117451
#> D_22  0.50658839 -0.21174511  1.3651870

Then that package article says about the returned value of vcov,

The elements of this covariance matrix that correspond to the elements of the covariance matrix of the random effects (i.e., the elements D_xx) are on the log-Cholesky scale.

I am really confused by the above line. What does it mean by The elements of this covariance matrix are on the log-Cholesky scale.?

Please Note that my end goal is to get the standard errors of the estimated random effects that is, $Var(D{11})$, $Var(D{12})$, $Var(D_{22})$. So do I need to apply any transformation on the resulting matrix to get these?

Would you please help me to clear up the confusion and help me to reach my goal? Thanks.

drizopoulos commented 1 year ago

What do you want to do with the standard errors? If you only want confidence intervals, you can get them from the confint() method.

Otherwise, the std. errors are not computed in the original scale - only for the log Cholesky elements.

—— Professor of Biostatistics Erasmus Medical Center Rotterdam The Netherlands


Από: Shafayet Khan Shafee @.> Στάλθηκε: Saturday, August 12, 2023 9:29:50 AM Προς: drizopoulos/GLMMadaptive @.> Κοιν.: Subscribed @.***> Θέμα: [drizopoulos/GLMMadaptive] Confusion about the values returned by vcov(MixMod) (Issue #49)

Waarschuwing: Deze e-mail is afkomstig van buiten de organisatie. Klik niet op links en open geen bijlagen, tenzij u de afzender herkent en weet dat de inhoud veilig is. Caution: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender and know the content is safe.

Consider the following reprex from the package article Methods for MixMod Objects https://drizopoulos.github.io/GLMMadaptive/articles/Methods.html,

library(GLMMadaptive)

set.seed(1234) n <- 100 # number of subjects K <- 8 # number of measurements per subject t_max <- 15 # maximum follow-up time

we construct a data frame with the design:

everyone has a baseline measurement, and then measurements at random follow-up times

DF <- data.frame(id = rep(seq_len(n), each = K), time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))), sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))

design matrices for the fixed and random effects

X <- model.matrix(~ sex * time, data = DF) Z <- model.matrix(~ time, data = DF)

betas <- c(-2.13, -0.25, 0.24, -0.05) # fixed effects coefficients D11 <- 0.48 # variance of random intercepts D22 <- 0.1 # variance of random slopes

we simulate random effects

b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))

linear predictor

eta_y <- as.vector(X %% betas + rowSums(Z b[DF$id, ]))

we simulate binary longitudinal data

DF$y <- rbinom(n * K, 1, plogis(eta_y))

fm <- mixed_model(fixed = y ~ sex * time, random = ~ time | id, data = DF, family = binomial())

Now the estimated variance-covariance matrix of the maximum likelihood estimates of random effects is returned using the vcov()https://rdrr.io/r/stats/vcov.html method,

vcov(fm, parm = "var-cov")

> D_11 D_12 D_22

> D_11 0.42942062 -0.09963969 0.5065884

> D_12 -0.09963969 0.03701847 -0.2117451

> D_22 0.50658839 -0.21174511 1.3651870

Then that package article says about the returned value of vcov,

The elements of this covariance matrix that correspond to the elements of the covariance matrix of the random effects (i.e., the elements D_xx) are on the log-Cholesky scale.

I am really confused by the above line. What does it mean by The elements of this covariance matrix are on the log-Cholesky scale.?

Please Note that my end goal is to get the standard errors of the estimated random effects that is, $Var(D{11})$, $Var(D{12})$, $Var(D_{22})$. So do I need to apply any transformation on the resulting matrix to get these?

Would you please help me to clear up the confusion and help me to reach my goal? Thanks.

— Reply to this email directly, view it on GitHubhttps://github.com/drizopoulos/GLMMadaptive/issues/49, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ADE7TTZNVXZGCG4VAPLQSPTXU4WG5ANCNFSM6AAAAAA3NYRYJY. You are receiving this because you are subscribed to this thread.Message ID: @.***>

shafayetShafee commented 1 year ago

No I actually need the standard errors in the original scale (A bit of context: my professor uses stata and stata shows standard errors, But I am doing the assignment in R and I need to show him the standard error).

Would you please dictate a way to transform the output (that is on log-cholesky scale) to the original scale?

shafayetShafee commented 1 year ago

Here's a solution to this issue from Cross Validated using delta method to get variance-covariance matrix original scale from the log-cholesky scale.

library("numDeriv")

# estimated covariance matrix of random effects
D <- fm$D

# transform from covariance matrix to entries of cholesky factor with 
# log-transformed main diagonal
D_chol_entries <- GLMMadaptive:::chol_transf(D)

D_chol_to_D <- function(x) {

  # transform from entries of cholesky factor with log-transformed main diagonal
  # to covariance matrix
  D <- GLMMadaptive:::chol_transf(x)

  D[upper.tri(D, diag = TRUE)]
}

J <- jacobian(D_chol_to_D, D_chol_entries)

# estimated covariance matrix of D_chol_entries
V_chol <- vcov(fm, parm = "var-cov")

# estimated covariance matrix of entries of D
V <- J %*% V_chol %*% t(J)

se <- sqrt(diag(V))
drizopoulos commented 1 year ago

Great that you found a way to do this.