Closed moosterwegel closed 8 months ago
I cannot reproduce this. The following works for me:
set.seed(1234)
n <- 200 # number of subjects
K <- 12 # number of measurements per subject
t_max <- 14 # maximum follow-up time
# We construct a data frame with the design:
# Everyone has a baseline measurement, and then measurements at random
# follow-up times up to t_max
DF <- data.frame(id = rep(seq_len(n), each = K),
time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))
# Design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)
betas <- c(-2.2, -0.25, 0.24, -0.05) # fixed effects coefficients
sigma <- 0.5 # errors' standard deviation
D11 <- 1.0 # variance of random intercepts
D22 <- 0.5 # variance of random slopes
# We simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF$id, ]))
# We simulate normal longitudinal data
DF$y <- rnorm(n * K, mean = eta_y, sd = sigma)
# We assume that values below -4 are not observed and set equal to -4
DF$ind <- as.numeric(DF$y < -4)
DF$y <- pmax(DF$y, -4)
# Fit the model
fm <- mixed_model(cbind(y, ind) ~ sex * time, random = ~ time | id, data = DF,
family = censored.normal())
summary(fm)
# Calculate predictions for a subject
newDF <- DF[DF$id == 1, ]
predict(fm, type = 'subject_specific', newdata = newDF)
# predictions at a future time point
newDF2 <- newDF[nrow(newDF), ]
newDF2$time <- 13.5
predict(fm, type = 'subject_specific', newdata = newDF, newdata2 = newDF2)
Thanks for taking a look. After diving into it a bit deeper I think it's unrelated to the Hessian, but instead caused by imbalanced data were some subjects only have one observation. Below (based on your code) reproduces the issue for me:
set.seed(1234)
n <- 200 # number of subjects
K <- 12 # number of measurements per subject
t_max <- 14 # maximum follow-up time
# We construct a data frame with the design:
# Everyone has a baseline measurement, and then measurements at random
# follow-up times up to t_max
DF <- data.frame(id = rep(seq_len(n), each = K),
time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))
# Design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)
betas <- c(-2.2, -0.25, 0.24, -0.05) # fixed effects coefficients
sigma <- 0.5 # errors' standard deviation
D11 <- 1.0 # variance of random intercepts
D22 <- 0.5 # variance of random slopes
# We simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF$id, ]))
# We simulate normal longitudinal data
DF$y <- rnorm(n * K, mean = eta_y, sd = sigma)
# We assume that values below -4 are not observed and set equal to -4
DF$ind <- as.numeric(DF$y < -4)
DF$y <- pmax(DF$y, -4)
# sample such that we get imbalanced data (i.e. not same nr of observations per id)
DF <- DF[sample(nrow(DF), round(n*0.7)), ]
# Fit the model
fm <- mixed_model(cbind(y, ind) ~ sex * time, random = ~ time | id, data = DF,
family = censored.normal())
summary(fm)
# predict on same data, all subjects, doesn't work with the imbalanced data
predict(fm, type = 'subject_specific', newdata = DF)
# yields Error in y[, 2L] : incorrect number of dimensions
# this yields the same error if an id is selected with only one sample
# Calculate predictions for a subject
newDF <- DF[DF$id == sample(unique(DF$id), 1), ]
predict(fm, type = 'subject_specific', newdata = newDF)
# this yields the same error if an id is selected with only one sample
# predictions at a future time point
newDF2 <- newDF[nrow(newDF), ]
newDF2$time <- 13.5
predict(fm, type = 'subject_specific', newdata = newDF, newdata2 = newDF2)
I'm running version 0.9.1 of GLMMadaptive on R 4.3.1.
It should work in the new version, which is on GitHub now.
Hi Dimitris,
First of all thank you for this great package. I think it offers the best trade-off between speed, flexibility and ease of use for fitting multilevel left censored models. Much appreciated.
I ran in a problem today though when I wanted to predict values from a model with a not positive definite Hessian matrix. I got the following warning:
And when trying to use predict on the same data
I got
The fixed effect prediction (
predict(m, type = 'mean_subject')
works just fine.Can this be solved or is this intended behaviour? My own hacked together solution that just uses ranef() and fixef() for predictions works just fine FWIW.
Let me know if I need to create a repex with a model that does not converge.
Thank you!