amices / mice

Multivariate Imputation by Chained Equations
https://amices.org/mice/
GNU General Public License v2.0

Futuremice with future.plan='cluster' #554

Closed samuelsaari closed 1 year ago

samuelsaari commented 1 year ago

I am trying to parallelize mice with futuremice using MPI.

Similar settings work for me with future_map and a parallelized foreach, but not here.

Here is an MWE that works with multisession but not with cluster parallelization:

# file name: z1det_futuremice.R
rm(list=ls())
.libPaths(c("/projappl/project_2003758/project_rpackages", .libPaths())) # change your libpath here
libpath <- .libPaths()[1]

# packages
libraries_cran <- c("mice",
                    "emmeans",
                    "lavaan",
                    "tidyverse",
                    "future",
                    "jtools",
                    "extrafont",
                    "doSNOW",
                    "glue",
                    "showtext")

# Install packages if not yet installed
installed_libraries_cran <- libraries_cran %in% rownames(installed.packages())

if (any(installed_libraries_cran == FALSE)) {
  install.packages(libraries_cran[!installed_libraries_cran],lib = libpath)
}

# loading libraries
lapply(libraries_cran, require, character.only = TRUE)

# loading in the data
t_nhanes <- tibble(nhanes)

# regular multiple imputation
mice_data <- mice(t_nhanes,m=3,maxit=1) # adjust methods

# futuremice with multisession # works fine (interactively)
# mice_multisession <- mice::futuremice(t_nhanes,
#                                   m=4,
#                                   maxit=2,
#                                   parallelseed=42,
#                                   future.plan='multisession',
#                                   n.core=3)
# print(mice_multisession)

########################################################
#futuremice with cluster

# setting up clusters
options(future.availableCores.methods = "Slurm")
cl <- parallel::makeCluster(workers=4, type = "MPI")
registerDoSNOW(cl)

##..Displaying info..
print('..........Worker process allocation among nodes............')
worker_allocation <- clusterCall(cl, function() Sys.info()[c("nodename", "machine")])
print(worker_allocation)
print(glue(".....Available cores: {availableCores()}....."))
print(glue(".....nbrOfWorkers: {future::nbrOfWorkers()}....."))
print('.....................')

mice_cluster <- mice::futuremice(t_nhanes,
                                  m=4,
                                  maxit=2,
                                  parallelseed=42,
                                  future.plan='cluster', # does not work with 'cluster'
                                  n.core=NULL)  
####### NB!########################
# will stop with error message: slurmstepd: error: *** STEP 16573892.0 ON r07c01 CANCELLED AT 2023-05-12T09:49:33 DUE TO TIME LIMIT ***

stopCluster(cl)

print(mice_cluster)

And the batch job script:

#!/bin/bash -l
#SBATCH --job-name=r_multicore_flh
#SBATCH --account=project_1234
#SBATCH --output=x_output_all_%j.R
#SBATCH --error=x_errors_%j.R

#SBATCH --time=00:14:59

#SBATCH --partition=test
#SBATCH --mem=24GB
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=5 

# Load r-env-singularity
module load r-env-singularity

# Clean up .Renviron file in home directory
if test -f ~/.Renviron; then
    sed -i '/TMPDIR/d' ~/.Renviron
    sed -i '/OMP_NUM_THREADS/d' ~/.Renviron
fi

# Specify a temp folder path
echo "TMPDIR=/scratch/project_1234" >> ~/.Renviron

## working directory
mkdir /scratch/project_1234/username/out_${SLURM_JOB_ID}
cd /scratch/project_1234/username/out_${SLURM_JOB_ID}

#################################################################################
# Run the R script

srun singularity_wrapper exec RMPISNOW --no-save --quiet < /users/username/flh/z1det_futuremice.R 2>&1 | tee x1det_futuremice_${SLURM_JOB_ID}.R

thomvolker commented 1 year ago

Have you tried running futuremice() with n.core = cl?

samuelsaari commented 1 year ago

That could be a step in the right direction, but I get the following error with n.core = cl:

Error in check.cores(n.core, available, m) : 
  'list' object cannot be coerced to type 'integer'
Calls: <Anonymous> -> check.cores
Execution halted
srun: error: r07c01: task 0: Exited with exit code 1
srun: launch/slurm: _step_signal: Terminating StepId=16578568.0
slurmstepd: error: *** STEP 16578568.0 ON r07c01 CANCELLED AT 2023-05-12T12:38:29 ***
srun: error: r07c01: tasks 1-4: Terminated
srun: Force Terminated StepId=16578568.0
'/users/makimiik/flh/x_errors_16578568.R' -> './x_errors_16578568.R'

thomvolker commented 1 year ago

I see. Then there is currently no way to set up a cluster environment within futuremice(). We might develop this in the future (no pun intended). For now, the easiest approach is to fall back to furrr::future_map(), execute your mice() call within that function, and stitch the resulting list of imputations together using ibind(). This is also what futuremice() does, so you can check out its code to see what happens.
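
A minimal sketch of that fallback, under a few assumptions that are not from the thread: a local two-worker cluster from parallel::makeCluster() stands in for the MPI cluster in the question, and the names data, m_per_worker, imps and imp are purely illustrative. It has not been tested against an MPI cluster; in principle the cl object created in the original script would be passed to plan() instead.

```r
library(mice)
library(future)
library(furrr)

# Assumption: a local PSOCK cluster as a stand-in for the MPI cluster
cl <- parallel::makeCluster(2)
plan(cluster, workers = cl)

data <- mice::nhanes

# Number of imputations each future should produce (2 + 2 = 4 in total)
m_per_worker <- c(2, 2)

# Run one mice() call per future; furrr handles parallel-safe seeding
imps <- future_map(
  m_per_worker,
  function(m_i) mice(data, m = m_i, maxit = 2, printFlag = FALSE),
  .options = furrr_options(seed = 42L, packages = "mice")
)

# Stitch the separate mids objects into one with ibind()
imp <- Reduce(ibind, imps)

# Restore the default plan and release the workers
plan(sequential)
parallel::stopCluster(cl)

print(imp$m)
```

The combined object can then be used like any other mids object, for example with with() and pool().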

stefvanbuuren commented 1 year ago

Thanks. I assume this answers the question, so I'm closing.