mllg / batchtools

Tools for computation on batch systems
https://mllg.github.io/batchtools/
GNU Lesser General Public License v3.0

makeClusterFunctionsMulticore seems to waste a lot of memory #39

Closed by fabian-s 8 years ago

fabian-s commented 8 years ago
make_data <- function(data, scale, job=NULL) {
  gamSim(eg = 1, n = 4000, dist = "normal", scale = scale, 
    verbose = FALSE)
}

fit_model <- function(data, job=NULL, instance) {
  m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = instance)
  m$coefficients
}

library(batchtools)
file.dir <- paste0("testtest_", Sys.Date())
reg <- makeExperimentRegistry(file.dir = file.dir, packages = "mgcv", 
  seed = 1)
reg$cluster.functions <- makeClusterFunctionsMulticore(25)
saveRegistry()

# Add problem and algorithms to the registry
addProblem(name = "make_data", data = NULL, fun = make_data, seed = 1)
addAlgorithm(name = "fit_model", fun = fit_model)

# Add experiments
problems <- list(make_data = data.frame(scale = 2 ^ (-4 : 4)))
addExperiments(problems, repls = 50)

# Dump the call frames to a file if an error occurs, then submit all jobs
options(error = function() dump.frames("batchtools.dump", to.file = TRUE))
# testJob(1)
submitJobs()

This results in

# Submitting 450 jobs in 450 chunks using cluster functions 'Parallel' ...
# Submitting [===================================----------------]  70% eta: 14s
# Error in mcfork(detached) : 
#   unable to fork, possible reason: Cannot allocate memory

Closing the session, opening a fresh one, loading the registry and calling submitJobs() again triggers the same error immediately after "x files synced" is written to the console.

Exactly the same experiment setup works fine with reg$cluster.functions <- makeClusterFunctionsSocket(25) instead of reg$cluster.functions <- makeClusterFunctionsMulticore(25).

Until the R session that throws the "unable to fork" error is closed, nothing else works properly: some browser tabs crash (?), and all other open R sessions fail with "Cannot allocate memory".

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mgcv_1.8-13      nlme_3.1-128     batchtools_0.1   data.table_1.9.6

loaded via a namespace (and not attached):
 [1] lattice_0.20-33   snow_0.4-1        prettyunits_1.0.2 digest_0.6.10    
 [5] assertthat_0.1    chron_2.3-47      grid_3.3.1        R6_2.1.3         
 [9] backports_1.0.3   magrittr_1.5      progress_1.0.2    stringi_1.1.1    
[13] Matrix_1.2-6      checkmate_1.8.1   tools_3.3.1       parallel_3.3.1   

> packageDescription("batchtools")
Package: batchtools
Title: Tools for Computation on Batch Systems
Version: 0.1
[...]
Built: R 3.3.0; x86_64-pc-linux-gnu; 2016-08-22 15:16:02 UTC; unix
[...]
RemoteSha: e34e069ce00e2d9e727cfedaf7e2278751f0cfad
[...]
mllg commented 8 years ago

Yep, the finished threads were not collected. Should be fixed in a0edae40dc30141d4215367f3971169cae68e093. Can you confirm?

fabian-s commented 8 years ago

Awesome, thanks, that does it!
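
For readers who land here with the same symptom: the diagnosis above is that finished workers were not collected. The error message shows the multicore backend forking its workers via mcfork, so the general pattern can be sketched with the parallel package alone. This is a minimal illustration, not the code from the referenced commit (which I have not reproduced here); the names jobs and res are made up for the example, and it only runs on Unix-like systems where forking is available.

```r
library(parallel)

# Fork four short-lived child processes (not available on Windows).
# Each child evaluates its expression and then waits to be collected.
jobs <- lapply(1:4, function(i) mcparallel(i^2))

# Collect the results. Besides returning the values, this lets R clean up
# the finished children instead of leaving them around for the lifetime
# of the parent session.
res <- mccollect(jobs)

print(unlist(res, use.names = FALSE))
```

If a long-running parent forks hundreds of children without ever collecting them, the leftover processes can accumulate until the operating system refuses further forks, which matches the "unable to fork" error reported in this issue.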