Closed: chrisraynerr closed this issue 3 months ago
I had the same problem when running on multiple cores, and I think it may be related to how list elements are passed: list elements such as object[[1]] cannot be passed directly.
Hi,
do you get the error within a gridsearch, or with a single multlcmm model? The c_loglikmultlcmm function is the Fortran code that is called in the multlcmm function. I've never had such trouble in parallel mode. In my experience, I only get such an error when the function is sourced (so used outside the package).
Viviane
Hi Viviane, I am running a multi-trajectory model using the mpjlcmm function with three indices (X1, X2, and X3), fitting from 1 to 5 latent trajectory classes. I have more than 50 thousand records, so it takes a lot of time. …
man_cubic_X1_3 <- hlme(X1~age60+I(age60^2)+I(age60^3)+hypoglycemia,
random=~age60,
subject="ParticipantID",
data=JinLin_man,ng=3,
mixture=~age60+I(age60^2)+I(age60^3),
B=random(man_cubic_FBG1),maxiter=0)
…
man_cubic_Three_3 <- mpjlcmm(longitudinal=list(man_cubic_X1_3,man_cubic_X2_3,man_cubic_X3_3),
subject="ParticipantID",
ng=3,
data=JinLin_man,
maxiter=199,
B=man_cubic_Three_1
) …
Here is my code. I'm trying to use multiple nodes to speed up the calculation:
fit_trajectory_models <- function(data, ng, initial_models) {
models <- list()
if (ng == 1) {
models$X1 <- hlme(X1~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
random = ~age60,
subject = "ParticipantID",
data = data,
maxiter = 100)
models$X2 <- hlme(X2~ age60 + I(age60^2) + I(age60^3),
random = ~age60,
subject = "ParticipantID",
data = data,
maxiter = 100)
models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
random = ~age60,
subject = "ParticipantID",
data = data,
maxiter = 100)
} else {
models$X1 <- hlme(X1 ~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
random = ~age60,
subject = "ParticipantID",
data = data,
ng = ng,
mixture = ~age60 + I(age60^2) + I(age60^3),
B = random(initial_models$FBG), maxiter = 0)
models$X2 <- hlme(X2 ~ age60 + I(age60^2) + I(age60^3),
random = ~age60,
subject = "ParticipantID",
data = data,
ng = ng,
mixture = ~age60 + I(age60^2) + I(age60^3),
B = random(initial_models$LDLc), maxiter = 0)
models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
random = ~age60,
subject = "ParticipantID",
data = data,
ng = ng,
mixture = ~age60 + I(age60^2) + I(age60^3),
B = random(initial_models$SBP), maxiter = 0)
}
return(models)
}
initial_models <- fit_trajectory_models(JinLin_man, 1, NULL)
ng_list <- 2:5
trajectory_models <- foreach(ng = ng_list, .combine = 'c', .packages = 'lcmm', .export = 'initial_models') %dopar% {
fit_trajectory_models(JinLin_man, ng, initial_models)
}
I received an "object not found" error, even though initial_models had already been created:
Error in { : task 1 failed - "object 'initial_models' not found"
In addition: Warning message:
In e$fun(obj, substitute(ex), parent.frame(), e$data) :
  already exporting variable(s): initial_models
I'm also wondering whether there are other ways to speed up the calculation.
Many thanks, Jerry
Hello,
I am running a single multlcmm model.
Mod0 <-
multlcmm(
ng = 1,
fixed = SCL1 + SCL2 + SCL3 + SCL4 ~ poly(Age,3) + CurrPreg + Postnatal + factor(Parity) + Plurality,
random = ~ 1 + Age,
data = DfR,
subject = "ID",
link = "thresholds",
methInteg = "QMC",
nMC = 1000,
returndata = TRUE,
nproc = NCORES
)
cat > MultLcmm_IrtModel.sh << EOT
#!/bin/bash
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
source /cluster/bin/jobsetup
cd $working_dir
module purge
module load R/4.2.0-foss-2021b
Rscript --vanilla MultLcmm_IrtModel.R
EOT
sbatch --time=5-00:00:00 --cpus-per-task=10 --mem-per-cpu=20G --job-name=MultLcmm_IrtModel --output=logs/%x.%j.out \
MultLcmm_IrtModel.sh
Are there any obvious errors here? Are all the required functions exported to the different nodes? (I saw this suggested on Stack Overflow.)
Thanks in advance! Chris
Hi Chris,
I don't see any error here. I do it in the same way and it works correctly. Sorry, I don't know how to help you.
Viviane
For Jerry's question :
I'm aware of that problem but I don't know how to fix it. The only solution I have is to use a global variable :
fit_trajectory_models <- function(data, ng, initial_models) {
models <- list()
if (ng == 1) {
models$X1 <- hlme(X1~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
random = ~age60,
subject = "ParticipantID",
data = data,
maxiter = 100)
models$X2 <- hlme(X2~ age60 + I(age60^2) + I(age60^3),
random = ~age60,
subject = "ParticipantID",
data = data,
maxiter = 100)
models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
random = ~age60,
subject = "ParticipantID",
data = data,
maxiter = 100)
} else {
m1 <<- initial_models$FBG
models$X1 <- hlme(X1 ~ age60 + I(age60^2) + I(age60^3) + hypoglycemia,
random = ~age60,
subject = "ParticipantID",
data = data,
ng = ng,
mixture = ~age60 + I(age60^2) + I(age60^3),
B = random(m1), maxiter = 0)
m2 <<- initial_models$LDLc
models$X2 <- hlme(X2 ~ age60 + I(age60^2) + I(age60^3),
random = ~age60,
subject = "ParticipantID",
data = data,
ng = ng,
mixture = ~age60 + I(age60^2) + I(age60^3),
B = random(m2), maxiter = 0)
m3 <<- initial_models$SBP
models$X3 <- hlme(X3 ~ age60 + I(age60^2) + I(age60^3) + hypotensor,
random = ~age60,
subject = "ParticipantID",
data = data,
ng = ng,
mixture = ~age60 + I(age60^2) + I(age60^3),
B = random(m3), maxiter = 0)
}
return(models)
}
If someone has a better solution, feel free to share it!
Viviane
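The global-variable workaround above is consistent with the B argument being looked up by name somewhere other than the calling function's frame. The toy example below only illustrates that scoping mechanism in base R. The function toy_fit and the two wrappers are hypothetical names, and this is not the lcmm implementation, whose internals I have not verified.

```r
# Toy stand-in for a fitting function that captures its B argument
# unevaluated and looks it up from the global environment.
# toy_fit() is hypothetical; it is NOT the lcmm code.
toy_fit <- function(B) {
  expr <- substitute(B)      # the unevaluated call, e.g. random(m1)
  inner <- expr[[2]]         # the expression inside random(...)
  eval(inner, envir = globalenv())
}

wrapper_local <- function(models) {
  # 'models' exists only in this function's frame, so the lookup fails
  tryCatch(toy_fit(random(models$FBG)),
           error = function(e) conditionMessage(e))
}

wrapper_global <- function(models) {
  m1 <<- models$FBG          # publish the object under a global name first
  toy_fit(random(m1))
}

init <- list(FBG = "initial fit")
res_local  <- wrapper_local(init)   # an error message about 'models'
res_global <- wrapper_global(init)  # the stored initial fit
print(res_local)
print(res_global)
```

Inside a foreach worker, the global assignment lands in that worker's own global environment, which would explain why the workaround also helps in parallel runs.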
Hi Viviane,
I'm still unable to run multlcmm with nproc > 1. The same error occurs ("object 'c_loglikmultlcmm' not found") and, as a novice, it looks to me as if the Fortran library is not exported to or accessible from the parallel processes.
Can I just check, are you able to run multlcmm with nproc>1 as a batch job on a HPC?
If so: do you include any exports in the bash script? do you create a cluster in the Rscript? do you export any functions / packages / objects within the Rscript?
I have tried:
parallel::clusterEvalQ(cl,library(lcmm))
parallel::clusterEvalQ(cl,dyn.load('library/lcmm/libs/lcmm.so'))
Many thanks! Chris
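One way to compare the master R session with the worker processes is to ask each worker which R build and library paths it sees. This is only a diagnostic sketch using the base parallel package. The two-worker local cluster and the helper name describe_session are illustrative assumptions, and the check does not by itself fix the 'c_loglikmultlcmm' error.

```r
library(parallel)

# Hypothetical helper: summarise the R build, library paths and lcmm version
# visible to the current process (NA if lcmm is not installed there).
describe_session <- function() {
  list(
    r_version = R.version.string,
    lib_paths = .libPaths(),
    lcmm_version = tryCatch(as.character(utils::packageVersion("lcmm")),
                            error = function(e) NA_character_)
  )
}

cl <- makeCluster(2)                  # two local worker processes
master_info <- describe_session()
worker_info <- clusterCall(cl, describe_session)
stopCluster(cl)

# TRUE for each worker that runs the same R build as the master session
same_build <- vapply(worker_info,
                     function(w) identical(w$r_version, master_info$r_version),
                     logical(1))
print(same_build)
```

If a worker reports a different R build or library path than the master session, the workers may be loading a different installation of the package than the one expected.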
Hi Chris,
yes, I am able to run multlcmm on an HPC. Sorry, I don't understand why you are having trouble. Below are the R script and submission script I use.
Viviane
## The R script (test.R) :
library(lcmm)
nn <- 3
m0 <- multlcmm(HIER ~ age, random = ~ 1, subject = "ID", data = paquid,
link = "thresholds", nMC = 1000, maxiter = 10, nproc = nn, verbose = TRUE)
save(m0, file = "m0.RData")
q("no")
## end of R script
## The submission script (job.sh) :
#!/bin/sh
# walltime (hh:mm:ss)
#PBS -l walltime=3:00:00
# Specify the number of nodes(nodes=) and the number of cores per nodes(ppn=) to be used
#PBS -W x=PARTITION:DEFAULT
#PBS -l nodes=3:ppn=1
#PBS -N simul
module purge
module load R
Rscript --vanilla test.R
## end of submission script
And I submit with : sbatch job.sh
Hi Viviane,
Thanks very much for your response. I have tried many combinations of nodes, cores and memory, but whenever I have nproc>1 I get this same error. I work on a SLURM based system and am required to specify --mem-per-cpu. I have contacted the admin of our HPC to see if they can shed any light on this. Can I ask which part of the function is processed in parallel? How does it speed the process up?
Many thanks -- and sorry to drag this issue on! I am hoping to run several more models and currently the model takes 12 days.
Hi,
the optimization algorithm is parallelized. This is not done directly in the lcmm package, but in the marqLevAlg package used in all our functions to optimize the parameters. You can find details about marqLevAlg here : https://journal.r-project.org/archive/2021/RJ-2021-089/index.html
Viviane
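To see the nproc argument in isolation, the optimizer can be called directly on a toy problem. The sketch below assumes the marqLevAlg package is installed. The objective function toy_objective is a made-up quadratic, not an lcmm likelihood, and my reading of the linked article is that nproc distributes the finite-difference derivative evaluations across processes.

```r
library(marqLevAlg)

# Toy objective (hypothetical, not an lcmm likelihood):
# a quadratic with its minimum at (1, -2).
toy_objective <- function(b) {
  (b[1] - 1)^2 + (b[2] + 2)^2
}

# nproc > 1 asks marqLevAlg to use several processes during optimization.
fit <- marqLevAlg(b = c(0, 0), fn = toy_objective, nproc = 2)

print(fit$b)      # estimated parameters, expected to be close to c(1, -2)
print(fit$istop)  # convergence status reported by marqLevAlg
```

On a function this cheap the parallel overhead will dominate; any gain should only appear when a single evaluation of the objective is expensive, as with a large multlcmm likelihood.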
Hi Viviane, Just an update with this one. After the R module available on the cluster was updated from 4.2.0 to 4.3.2, the parallelisation step now works. Apologies for all the confusion -- and many thanks for your responses! Best, Chris
Hi Viviane,
When I try to use multiple cores (nproc) to run multlcmm, I get the error:
Error in { : task 1 failed - "object 'c_loglikmultlcmm' not found"
I looked for documentation on c_loglikmultlcmm, but can't find anything. Do you have a solution?
Many thanks, Chris