openpharma / mmrm

Mixed Models for Repeated Measures (MMRM) in R.
https://openpharma.github.io/mmrm/
Other
101 stars 17 forks source link

mmrm does not converge without specifying optimizer nlminb #410

Closed tobiasmuetze closed 5 months ago

tobiasmuetze commented 5 months ago

Summary

The help for mmrm() states: "When optimizer is not set, first the default optimizer (L-BFGS-B) is used to fit the model. If that converges, this is returned. If not, the other available optimizers from h_get_optimizers(), including BFGS, CG and nlminb are tried." However, I am experiencing the case where if no optimizer is specified, the convergence status of 1 is returned, but when the optimizer nlminb is specified, the convergence status is 0. It is unclear how this behavior fits to what is stated in the help.

Minimal example

A data set is simulated that has 300 subjects in the treatment group and 150 subjects in the control group. Subjects have one baseline and 12 post-baseline visits. The control group mean is set to zero for all visits. The treatment group mean is set to 0 for the baseline visit and 0.4 for the post-baseline visits. The standard deviation is set to 1 and the correlation matrix of a subject’s outcome across visit has an AR(1) structure There is no missing data.

The data set includes one row per subject and per post-baseline visit.

library(tidyverse)
library(mmrm)

# Define number of visits, sample sizes, and means
nVisit <- 13
nPla <- 150
nTrt <- 300
plaMean <- rep(0, times = nVisit)
trtMean <- c(0, rep(0.4, times = nVisit-1))

# Correlation matrix AR(1) 
rho      <-  0.9
visits <- 1:nVisit
exponent <- abs(matrix(visits, nrow = nVisit, ncol = nVisit, byrow = TRUE) - visits)
corr_mat <- rho^exponent

# Set seed to reproduce the issue
# Issue also occurs for other seeds, e.g., 10, 11, 12, 17, 18, 31
set.seed(3)

# Simulate control group data
plaSimMat <- mvtnorm::rmvnorm(n = nPla, mean = plaMean, sigma = corr_mat) 
colnames(plaSimMat) <- paste0("V", visits)
plaSimTib <- as_tibble(plaSimMat, .name_repair = 'check_unique') %>%
  mutate(group="pla")

# Simulate treatment group data
trtSimMat <- mvtnorm::rmvnorm(n = nTrt, mean = trtMean, sigma = corr_mat) 
colnames(trtSimMat) <- paste0("V", visits)
trtSimTib <- as_tibble(trtSimMat, .name_repair = 'check_unique') %>%
  mutate(group="trt")

# Create tibble with single row per subject and visit
simTib <- bind_rows(plaSimTib, trtSimTib) %>% 
  mutate(id = as_factor(paste0("id", 1:n()))) %>% 
  pivot_longer(cols = starts_with("V"), 
               names_to = "visit", 
               values_to = "obs") %>% 
  group_by(id) %>% 
  mutate(obsBl = obs[visit=='V1']) %>% 
  filter(visit!="V1") %>% 
  mutate(obs = round(obs, 3),
         obsBl = round(obsBl, 3), 
         visit = as_factor(visit))

# MMRM model that runs into error 
fit1 <- mmrm::mmrm(
  formula = obs ~ group + obsBl*visit + group*visit + us(visit|id),
  data = simTib,
  reml = TRUE)
fit1$opt_details

# MMRM model that results in relative convergence
fit2 <- mmrm::mmrm(
  formula = obs ~ group + obsBl*visit + group*visit + us(visit|id),
  data = simTib,
  control = mmrm_control(optimizers = mmrm:::h_get_optimizers(optimizer="nlminb")),
  reml = TRUE)
fit2$opt_details

Session info

clarkliming commented 5 months ago

hi @tobiasmuetze this issue should be fixed now, previously we report the fit incorrectly (we did not report the best successful one but some random one) and lead to your observed issue. now should be fine

gowerc commented 5 months ago

@clarkliming - Does this only affect the error code or does it affect the coefficients as well ? We have been seeing issues with rbmi where mmrm occasionally returns different coefficients and I am wondering if that is related to this ?

danielinteractive commented 5 months ago

@gowerc It also affects coefficients. But as I understand this behavior should still have been deterministic

clarkliming commented 5 months ago

@clarkliming - Does this only affect the error code or does it affect the coefficients as well ? We have been seeing issues with rbmi where mmrm occasionally returns different coefficients and I am wondering if that is related to this ?

yes @gowerc the coefficients got changed, and also all other things, vcov, theta, likelihood, etc. the fit object got changed. the termrandom is not correct but it previously obtain the index among the successful fits and use that index to obtain the fit among all fits. However, if you see unstable coefficients on the same data, best if it can be reproduced and we can then investigate. if under rbmi, then might be true that we have some edge cases in bootstrapping?