Closed by giantwhale 5 years ago
@richcalaway might be able to answer your question. Let me see if I can get him on this thread.
I'm not sure exactly what is in your "#time consuming processes", but the error messages you're seeing could mean that some of your workers died during the run. This is usually harmless, as the tasks get farmed back out to living workers until all the tasks are completed. But I'd need an example that shows the behavior; I'm unable to reproduce it from the above.
Here is a reproducible example, on MacBook Air Sierra:
library(parallel)
N <- 1e6
X <- matrix(rnorm(N), ncol = 2)
cl <- makeForkCluster(nnodes = 2)
f <- function(i) X %*% matrix(c(1, 1), ncol = 1)
Y <- parLapply(cl, 1:3, f)
For N about 5000, the code works fine on my computer. But N = 10000 already creates the reported error.
This last example does not use foreach/doParallel; parLapply is a straight call into the parallel package, so it does not reproduce the original complaint. For what it's worth, it also does not reproduce on my CentOS 7 development machine as written. Again, unless the return value is not what is expected, the existence of such messages in the log does not necessarily indicate a problem.
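One way to check whether the return value is what is expected is to compare the parallel result against a serial run of the same function. The following is only a sketch: `results_match` is a hypothetical helper, not part of the parallel package, and it uses a default `makeCluster(2)` socket cluster for portability rather than the `makeForkCluster` call from the example above. The matrix is passed to the task function explicitly as `m`, so the workers do not depend on a global `X`.

```r
library(parallel)

# Hypothetical helper: run `f` over `xs` serially and on cluster `cl`,
# then report whether the two result lists agree.
results_match <- function(cl, xs, f, ...) {
  serial_res   <- lapply(xs, f, ...)
  parallel_res <- parLapply(cl, xs, f, ...)
  isTRUE(all.equal(serial_res, parallel_res))
}

set.seed(1)
M <- matrix(rnorm(1e4), ncol = 2)

# Same computation as in the example above, with the matrix passed as an argument.
f <- function(i, m) m %*% matrix(c(1, 1), ncol = 1)

cl <- makeCluster(2)  # assumption: socket cluster instead of a fork cluster
ok <- results_match(cl, 1:3, f, m = M)
stopCluster(cl)

print(ok)
```

If `ok` is `TRUE`, the messages in the log did not affect the computed values for this run.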
I typically parallelize my code as follows:
The process usually succeeds. However, when I check the log file, I usually see the following at the end of the file:
This is reproducible on my PC. Does this hint that something is wrong? How can I find out where things went wrong?
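For locating failures, one possible approach is to wrap the task function so that an R-level error in a task is returned as a value instead of being lost in the worker log. This is a minimal sketch, not a diagnosis of the messages above: `safely` and `risky` are hypothetical names introduced for illustration, and the wrapper only catches errors raised inside R code. It cannot catch a worker process that dies outright.

```r
library(parallel)

# Hypothetical wrapper: return the error object instead of aborting the task.
safely <- function(f) {
  force(f)
  function(...) tryCatch(f(...), error = function(e) e)
}

# Hypothetical task that fails for one input.
risky <- function(i) {
  if (i == 2) stop("task ", i, " failed")
  i^2
}

cl <- makeCluster(2)
res <- parLapply(cl, 1:3, safely(risky))
stopCluster(cl)

# Identify which tasks returned an error object and read their messages.
failed <- vapply(res, inherits, logical(1), what = "error")
failed_idx <- which(failed)
failed_msgs <- vapply(res[failed], conditionMessage, character(1))
print(failed_idx)
print(failed_msgs)
```

The indices in `failed_idx` point at the inputs whose tasks raised an error, which narrows down where to look.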