CecileProust-Lima / lcmm

R package lcmm
https://CecileProust-Lima.github.io/lcmm/

multlcmm "object 'c_loglikmultlcmm' not found" #251

Closed chrisraynerr closed 3 months ago

chrisraynerr commented 6 months ago

Hi Viviane,

When I try to use multiple cores (nproc) to run multlcmm, I get the error:

Error in {: task1 failed - "object 'c_loglikmultlcmm' not found"

I looked for documentation on c_loglikmultlcmm, but can't find anything. Do you have a solution?

Many thanks, Chris

SPINFALLING commented 6 months ago

I had the same problem when running on multiple cores. I think it may be related to how list objects are passed: you can't pass list elements such as object[[1]].

VivianePhilipps commented 6 months ago

Hi,

Do you get the error within a gridsearch, or with a single multlcmm model? The c_loglikmultlcmm function is the Fortran code that is called in the multlcmm function. I've never had such trouble in parallel mode. In my experience, I only get such an error when the function is sourced (that is, used outside the package).

Viviane

SPINFALLING commented 6 months ago

Hi Viviane, I am running a multi-trajectory model using the mpjlcmm function with three indices (X1, X2, and X3), grouping them into 1 to 5 latent trajectory classes. I have more than 50 thousand records, so it takes a very long time. …

man_cubic_X1_3 <- hlme(X1~age60+I(age60^2)+I(age60^3)+hypoglycemia,
                   random=~age60,
                   subject="ParticipantID",
                   data=JinLin_man,ng=3,
                   mixture=~age60+I(age60^2)+I(age60^3),
                   B=random(man_cubic_FBG1),maxiter=0)

man_cubic_Three_3 <- mpjlcmm(longitudinal=list(man_cubic_X1_3,man_cubic_X2_3,man_cubic_X3_3),
                         subject="ParticipantID",
                         ng=3,
                         data=JinLin_man,
                         maxiter=199,
                         B=man_cubic_Three_1

) …

Here is my code. I'm trying to use multiple nodes to speed up the calculation:

fit_trajectory_models <- function(data, ng, initial_models) {
models <- list()
if (ng == 1) {
models$X1 <- hlme(X1~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        maxiter = 100)

     models$X2 <- hlme(X2~ age60 + I(age60^2) + I(age60^3),
                         random = ~age60,
                         subject = "ParticipantID",
                         data = data,
                         maxiter = 100)

     models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        maxiter = 100)
   } else {
     models$X1 <- hlme(X1 ~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        ng = ng,
                        mixture = ~age60 + I(age60^2) + I(age60^3),
                        B = random(initial_models$FBG), maxiter = 0)

     models$X2 <- hlme(X2 ~ age60 + I(age60^2) + I(age60^3),
                         random = ~age60,
                         subject = "ParticipantID",
                         data = data,
                         ng = ng,
                         mixture = ~age60 + I(age60^2) + I(age60^3),
                         B = random(initial_models$LDLc), maxiter = 0)

    models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        ng = ng,
                        mixture = ~age60 + I(age60^2) + I(age60^3),
                        B = random(initial_models$SBP), maxiter = 0)
   }

   return(models)
}
initial_models <- fit_trajectory_models(JinLin_man, 1, NULL)
ng_list <- 2:5

trajectory_models <- foreach(ng = ng_list, .combine = 'c', .packages = 'lcmm', .export = 'initial_models') %dopar% {
fit_trajectory_models(JinLin_man, ng, initial_models)
}

I received an "object not found" error, even though I have my 'initial models' ready.

Error in { : task 1 failed - "object 'initial_models' not found"
In addition: Warning message:
In e$fun(obj, substitute(ex), parent.frame(), e$data) :
  already exporting variable(s): initial_models

I'm also wondering whether there is another way to speed up the calculation.

Many thanks, Jerry

chrisraynerr commented 6 months ago

Hello,

I am running a single multlcmm model.

  Mod0 <-
    multlcmm(
      ng      = 1,
      fixed   =  SCL1 + SCL2 + SCL3 + SCL4 ~ poly(Age,3) + CurrPreg + Postnatal + factor(Parity) + Plurality,
      random  = ~ 1 + Age,
      data    = DfR,
      subject = "ID",
      link    = "thresholds",
      methInteg = "QMC",
      nMC       = 1000,
      returndata = TRUE,
      nproc   = NCORES
    )

cat > MultLcmm_IrtModel.sh << EOT
#!/bin/bash
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
source /cluster/bin/jobsetup
cd $working_dir
module purge
module load R/4.2.0-foss-2021b

Rscript --vanilla MultLcmm_IrtModel.R
EOT

sbatch --time=5-00:00:00 --cpus-per-task=10 --mem-per-cpu=20G --job-name=MultLcmm_IrtModel --output=logs/%x.%j.out \
MultLcmm_IrtModel.sh

Are there any obvious errors here? Are all the required functions exported to the different nodes? (I saw this suggested on Stack Overflow.)

Thanks in advance! Chris

VivianePhilipps commented 6 months ago

Hi Chris,

I don't see any error here. I do it in the same way and it works correctly. Sorry, I don't know how to help you.

Viviane

VivianePhilipps commented 5 months ago

For Jerry's question :

I'm aware of that problem but I don't know how to fix it. The only solution I have is to use a global variable :


fit_trajectory_models <- function(data, ng, initial_models) {
models <- list()
if (ng == 1) {
models$X1 <- hlme(X1~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        maxiter = 100)

     models$X2 <- hlme(X2~ age60 + I(age60^2) + I(age60^3),
                         random = ~age60,
                         subject = "ParticipantID",
                         data = data,
                         maxiter = 100)

     models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        maxiter = 100)
   } else {
 m1 <<- initial_models$FBG
     models$X1 <- hlme(X1 ~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        ng = ng,
                        mixture = ~age60 + I(age60^2) + I(age60^3),
                        B = random(m1), maxiter = 0)
 m2 <<- initial_models$LDLc
     models$X2 <- hlme(X2 ~ age60 + I(age60^2) + I(age60^3),
                         random = ~age60,
                         subject = "ParticipantID",
                         data = data,
                         ng = ng,
                         mixture = ~age60 + I(age60^2) + I(age60^3),
                         B = random(m2), maxiter = 0)
     m3 <<- initial_models$SBP
    models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
                        random = ~age60,
                        subject = "ParticipantID",
                        data = data,
                        ng = ng,
                        mixture = ~age60 + I(age60^2) + I(age60^3),
                        B = random(m3), maxiter = 0)
   }

   return(models)
}

If someone has a better solution, feel free to share it!

Viviane

chrisraynerr commented 4 months ago

Hi Viviane,

I'm still unable to run multlcmm with nproc > 1. The same error occurs, "object 'c_loglikmultlcmm' not found", and (as a novice) it looks to me as if the Fortran library is not exported to, or accessible from, the parallel processes.

Can I just check, are you able to run multlcmm with nproc>1 as a batch job on a HPC?

If so:
Do you include any exports in the bash script?
Do you create a cluster in the R script?
Do you export any functions / packages / objects within the R script?

I have tried:
parallel::clusterEvalQ(cl,library(lcmm))
parallel::clusterEvalQ(cl,dyn.load('library/lcmm/libs/lcmm.so'))

Many thanks! Chris
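
One way to check what the parallel workers actually see is to ask each worker for its R version, its library paths, and its loaded shared libraries. The following is a minimal base-R sketch, assuming a local PSOCK cluster created with parallel::makeCluster. probe_worker is a hypothetical helper, not part of lcmm, and "stats" stands in for "lcmm" so that the example runs anywhere. Whether the workers started internally by multlcmm behave the same way is an assumption.

```r
library(parallel)

# Hypothetical helper, run on each worker: report the R version, the library
# paths, and whether package `p` and its shared library are available there.
probe_worker <- function(p) {
  available <- suppressWarnings(requireNamespace(p, quietly = TRUE))
  list(
    r_version   = R.version.string,
    lib_paths   = .libPaths(),
    pkg_found   = available,
    pkg_version = if (available) as.character(packageVersion(p)) else NA_character_,
    dlls        = names(getLoadedDLLs())  # shared libraries loaded on the worker
  )
}

cl <- makeCluster(2)                         # two local workers
info <- clusterCall(cl, probe_worker, "stats")  # use "lcmm" on the real system
stopCluster(cl)

# Compare each worker with the main session
same_r <- vapply(info, function(w) identical(w$r_version, R.version.string), logical(1))
has_dll <- vapply(info, function(w) "stats" %in% w$dlls, logical(1))
print(same_r)
print(has_dll)
```

If a worker reports a different R version, a different library path, or no shared library for the package, that mismatch would be a natural place to start looking.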

VivianePhilipps commented 4 months ago

Hi Chris,

Yes, I am able to run multlcmm on an HPC. Sorry, I don't understand why you are having trouble. Below are the R script and the submission script I use.

Viviane

## The R script (test.R) : 
library(lcmm)

nn <- 3
m0 <- multlcmm(HIER ~ age, random = ~ 1, subject = "ID", data = paquid, 
link = "thresholds", nMC = 1000, maxiter = 10, nproc = nn, verbose = TRUE)
save(m0, file = "m0.RData")

q("no")
## end of R script

## The submission script (job.sh) : 
#!/bin/sh

# walltime (hh:mm:ss)
#PBS -l walltime=3:00:00

# Specify the number of nodes(nodes=) and the number of cores per nodes(ppn=) to be used
#PBS -W x=PARTITION:DEFAULT 
#PBS -l nodes=3:ppn=1
#PBS -N simul

module purge
module load R

Rscript --vanilla test.R
## end of submission script

And I submit with : sbatch job.sh

chrisraynerr commented 4 months ago

Hi Viviane,

Thanks very much for your response. I have tried many combinations of nodes, cores and memory, but whenever I have nproc>1 I get this same error. I work on a SLURM-based system and am required to specify --mem-per-cpu. I have contacted the admin of our HPC to see if they can shed any light on this. Can I ask which part of the function is processed in parallel? How does it speed the process up?

Many thanks -- and sorry to drag this issue on! I am hoping to run several more models and currently the model takes 12 days.

VivianePhilipps commented 4 months ago

Hi,

The optimization algorithm is parallelized. This is not done directly in the lcmm package, but in the marqLevAlg package, which all our functions use to optimize the parameters. You can find details about marqLevAlg here: https://journal.r-project.org/archive/2021/RJ-2021-089/index.html

Viviane
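
To illustrate why parallelizing an optimizer can help, here is a base-R sketch of the general idea, assuming (as the linked article appears to describe) that the parallel work is the numerical derivative computation. It is not marqLevAlg's code: the objective function and the partial_derivative helper are made up for illustration. Each partial derivative needs its own evaluation of the objective, and those evaluations are independent, so they can be spread across workers.

```r
library(parallel)

# Toy objective (an assumption for illustration): squared distance to (1, 2, 3)
objective <- function(b) sum((b - c(1, 2, 3))^2)

# Forward finite-difference estimate of the j-th partial derivative
partial_derivative <- function(j, point, objective, step = 1e-6) {
  shifted <- point
  shifted[j] <- shifted[j] + step
  (objective(shifted) - objective(point)) / step
}

b0 <- c(0, 0, 0)
cl <- makeCluster(2)
# One task per parameter; the workers evaluate them independently
grad <- unlist(parLapply(cl, seq_along(b0), partial_derivative,
                         point = b0, objective = objective))
stopCluster(cl)

print(grad)  # close to the analytic gradient c(-2, -4, -6)
```

With a cheap objective like this one, the communication overhead outweighs any gain. The benefit appears when a single likelihood evaluation is expensive and there are many parameters.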

chrisraynerr commented 3 months ago

Hi Viviane, Just an update with this one. After the R module available on the cluster was updated from 4.2.0 to 4.3.2, the parallelisation step now works. Apologies for all the confusion -- and many thanks for your responses! Best, Chris