CecileProust-Lima / lcmm

R package lcmm
https://CecileProust-Lima.github.io/lcmm/
55 stars 13 forks source link

maxiter fixed at 100 when using lcmm within gridsearch #73

Closed Lrgnmllr closed 3 years ago

Lrgnmllr commented 3 years ago

I have noticed that when using gridsearch(), the lcmm call within gridsearch() will run for 100 iteration regardless of what is specified through maxiter in the lcmm call. For example:

gridsearch(lcmm(fixed = Yvar ~ fixedvar, mixture = ~ mixturevar, random = ~ randomvar, link = "link", 
                subject = 'id', ng = 5, data = data, maxiter = 1000, returndata = TRUE),
           minit = mod1, maxiter = 60, rep = 30)​

This will perform the gridsearch as requested, but the final lcmm call within will only run for 100 iterations, instead of 1000 as specified by maxiter.

If others have this issue, and untill this is fixed, I have two possible workarounds: 1 - run gridsearch for a higher amount of iterations and repetitions, and hope that one gets close enough to the global maximum

2 - For those with access to somekind of server with plenty of cores, parallelize the lcmm call. Here is a code that works for me (perhaps not the most efficient R programming though :) ):

First, load the lcmm package and the doParallel package

library(lcmm) 
library(doParallel)

Second, run a model with one class. Here the model is called mod1

mod1 <- lcmm(fixed = Yvar ~ fixedvar, random = ~ randomvar, link = "link", 
             subject = 'id', ng = 1, data = data, maxiter = 1000, returndata = TRUE)

Third, parallelization stuff (here I used 15 cores, and it will run twice on each)

cl <- makeCluster(15)

registerDoParallel(cl) # important, otherwise it wil not run on several cores!

# ---- Now, the homemade gridsearch:
DIYgridsearch <- foreach(i = 1:30, 
                         .combine = rbind,
                         .packages = "lcmm", 
                         .export = c("data", "mod1")
) %dopar% {
  .GlobalEnv$mod1 <- mod1 # this, 
  .GlobalEnv$data <- data # and this were necessary on my machine. 

  # The 5 class model
  mod5 <- lcmm(fixed = Yvar ~ fixedvar, mixture = ~ mixturevar, random = ~ randomvar, link = "link", 
               subject = 'id', ng = 5, data = data, maxiter = 60, returndata = TRUE,
               B = random(mod1))

  # Extract the loglik and the estimated parameters, which will then combined into a matrix with rbind above
  c(mod5$loglik, mod5$best) 
}
stopCluster(cl) # important stuff

Fourth, extracting, then using the parameters that gave the best loglik:

bestpar <- DIYgridsearch[which.max(DIYgridsearch[, 1]), -1]
mod5_DIY <- lcmm(fixed = Yvar ~ fixedvar, mixture = ~ mixturevar, random = ~ randomvar, link = "link", 
                 subject = 'id', ng = 5, data = data, maxiter = 1000, returndata = TRUE,
                 B = bestpar)

This should work. However, you may get a Warning after the parallelization stuff is finished, that reads:

Warning message:
In e$fun(obj, substitute(ex), parent.frame(), e$data) :
  already exporting variable(s): data, mod1

I do not know how to avoid this warning, but it is of no consequence for the user :)

I hope this helps whoever encounters the same problem

VivianePhilipps commented 3 years ago

Hello,

thank you for your suggestion. The estimation of the final model uses effectively the default number of iteration, which is 100. If not sufficient, you can you could run again the model with more iterations :


m <- gridsearch(lcmm(fixed = Yvar ~ fixedvar, mixture = ~ mixturevar, random = ~ randomvar, link = "link", 
                                    subject = 'id', ng = 5, data = data, returndata = TRUE),
                           minit = mod1, maxiter = 60, rep = 30)​  # will do 100 iterations

mm <- lcmm(fixed = Yvar ~ fixedvar, mixture = ~ mixturevar, random = ~ randomvar, link = "link", 
                subject = 'id', ng = 5, data = data, maxiter = 1000, returndata = TRUE, B = m$best)

The estimation of mm starts at the final point of model m (because B=m$best is specified), so it adds 1000 more iterations in order to achieve convergence.

Best,

Viviane

Lrgnmllr commented 3 years ago

Ah, what a nice and simple solution!

Thank you for your answer :)

VivianePhilipps commented 3 years ago

Since it is easy to fix, I did the correction. The maxiter in the model call is now used to fit the final model. ;)

Viviane