drizopoulos / GLMMadaptive

GLMMs with adaptive Gaussian quadrature
https://drizopoulos.github.io/GLMMadaptive/
60 stars 14 forks source link

predict() fails for models with random-slope-intercept? #5

Closed strengejacke closed 5 years ago

strengejacke commented 5 years ago

I'm not sure whether the random slope, or the model fit itself is the cause of this error?

library(GLMMadaptive)

set.seed(1234)
n <- 100 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 15 # maximum follow-up time

# we constuct a data frame with the design: 
# everyone has a baseline measurment, and then measurements at random follow-up times
DF <- data.frame(id = rep(seq_len(n), each = K),
                 time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
                 sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))

# design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)

betas <- c(-2.13, -0.25, 0.24, -0.05) # fixed effects coefficients
D11 <- 0.48 # variance of random intercepts
D22 <- 0.1 # variance of random slopes

# we simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- drop(X %*% betas + rowSums(Z * b[DF$id, ]))
# we simulate binary longitudinal data
DF$y <- rbinom(n * K, 1, plogis(eta_y))

fm2 <- mixed_model(fixed = y ~ sex * time, random = ~ time || id, data = DF,
                   family = binomial())

mt <- mean(DF$time)
sdt <- sd(DF$time)

nd <- data.frame(
  y = mean(DF$y),
  sex = c(rep("male", 3), rep("female", 3)),
  time = rep(c(mt - sdt, mt, mt + sdt), 2),
  id = NA
)

predict(fm2, newdata = nd, type = "subject_specific", type_pred = "link", se.fit = TRUE)
#> Warning in dbinom(y, 1, mu_y, TRUE): non-integer x = 0.353750
#> Warning in dbinom(y, 1, mu_y, TRUE): non-integer x = 0.353750
#> Warning in dbinom(y, 1, mu_y, TRUE): non-integer x = 0.353750
#> Warning in dbinom(y, 1, mu_y, TRUE): non-integer x = 0.353750
#> Warning in dbinom(y, 1, mu_y, TRUE): non-integer x = 0.353750
#> Warning in dbinom(y, 1, mu_y, TRUE): non-integer x = 0.353750
#> Error in optim(par = b_i, fn = log_post_b, gr = score_log_post_b, method = "BFGS", : initial value in 'vmmin' is not finite

Created on 2019-01-17 by the reprex package (v0.2.1)

drizopoulos commented 5 years ago

It is not clear what predictions are you exactly looking for, but the id variable should be set. In addition, you fit a model with a binary outcome, but the y you provide in nd is not a 0 or a 1 but the proportion of ones in DF. This is why it complains for non-integer values in `dbinom().

When you select subject-specific predictions, predict() will estimate the random effects for each subject in nd given their observed outcome measurements y, and based on these estimated random effects and the other covariates in nd, it will produce predictions.

strengejacke commented 5 years ago

Ah, ok. y needs to be 0 or 1, than it works (id can still be NA though).