mschubert / clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH
https://mschubert.github.io/clustermq/
Apache License 2.0

Can we improve error message for missing packages on the worker? #239

Closed rrichmond closed 1 year ago

rrichmond commented 3 years ago

Hello,

I'm having issues using clustermq with data.table. Here's a simple example:

library(data.table)
library(clustermq)
library(foreach)
library(doParallel)

nvals <- 10
tmpdat <- data.table(a=1:nvals,b=nvals:1)

clustermq::register_dopar_cmq(n_jobs=4)

out <- foreach(idx=1:nrow(tmpdat)) %dopar% {
    myrow <- tmpdat[idx]

    myrow$a*myrow$b
}
out

Which returns

Running sequentially ('LOCAL') ...
Error in summarize_result(re$result, length(re$errors), length(re$warnings),  : 
  8/10 jobs failed (0 warnings). Stopping.
(Error #10) undefined columns selected
(Error #3) undefined columns selected
(Error #4) undefined columns selected
(Error #5) undefined columns selected
(Error #6) undefined columns selected
(Error #7) undefined columns selected
(Error #8) undefined columns selected
(Error #9) undefined columns selected
In addition: Warning message:
In if (class(data$export[[i]]) == "function") environment(data$export[[i]]) = .GlobalEnv :
  the condition has length > 1 and only the first element will be used

If instead I use doParallel everything works fine:

registerDoParallel()

out2 <- foreach(idx=1:nrow(tmpdat)) %dopar% {
    myrow <- tmpdat[idx]

    myrow$a*myrow$b
}
out2

Any suggestions on how to debug or what is happening here? Thanks!

mschubert commented 3 years ago

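The issue you are running into here is that the worker operates in its own environment, which does not have data.table loaded. Without it, tmpdat[idx] dispatches to the data.frame method, which treats a single index as a column selection, so every idx above 2 fails with "undefined columns selected".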

If you tell clustermq to load the data.table package, it works as expected:

library(data.table)
library(clustermq)
library(foreach)
library(doParallel)

nvals <- 10
tmpdat <- data.table(a=1:nvals,b=nvals:1)

clustermq::register_dopar_cmq(n_jobs=4, pkgs=c("data.table"))

out <- foreach(idx=1:nrow(tmpdat)) %dopar% {
    myrow <- tmpdat[idx]

    myrow$a*myrow$b
}
out
#> [[1]]
#> [1] 10
#> 
#> [[2]]
#> [1] 18
#> 
#> [[3]]
#> [1] 24
#> 
#> ...

Ideally, the error would tell you that the data.table package is missing instead of a column subset error.

I will leave this open until I can find out if there is an easy solution to this.
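
Until then, a user-side guard can surface the missing package by name. The following is only a minimal sketch and not part of clustermq: the helper name require_loaded is hypothetical, and it relies solely on base R's isNamespaceLoaded().

```r
# Hypothetical helper (not a clustermq function): stop with a clear
# message if any of the named package namespaces is not loaded.
require_loaded <- function(pkgs) {
  loaded <- vapply(pkgs, isNamespaceLoaded, logical(1))
  missing_pkgs <- pkgs[!loaded]
  if (length(missing_pkgs) > 0) {
    stop("packages not loaded on this worker: ",
         paste(missing_pkgs, collapse = ", "))
  }
  invisible(TRUE)
}

# 'base' is always loaded, so this passes silently
ok <- require_loaded("base")

# A package name that does not exist triggers the descriptive error
msg <- tryCatch(require_loaded("notARealPackageXYZ"),
                error = function(e) conditionMessage(e))
print(msg)
```

Calling require_loaded("data.table") as the first line of the loop body would then fail with the package name rather than a column subset error, assuming the helper itself is available on the worker.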

rrichmond commented 3 years ago

Whoops, I should have caught that. Thanks! This is definitely a tough one, since [ will fall back to the data.frame method when data.table is not loaded.
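
The fallback can be reproduced in base R alone. In this small sketch a plain data.frame stands in for what the worker effectively sees when the data.table method for [ is not registered. That substitution is an assumption made for illustration, and the variable names are hypothetical.

```r
# A plain data.frame stands in for the object as seen by a worker
# without data.table loaded (illustrative assumption)
tmpdat <- data.frame(a = 1:10, b = 10:1)

# On a data.frame, a single index selects columns, not rows
cols_1 <- names(tmpdat[1])

# Column "b" is absent from that subset, so the product is empty
# rather than an error, which is why jobs 1 and 2 did not fail
myrow <- tmpdat[1]
prod_1 <- myrow$a * myrow$b

# There are only two columns, so any index above 2 raises an error
msg <- tryCatch(tmpdat[3], error = function(e) conditionMessage(e))
print(msg)
```

This matches the 8/10 failures reported above: indices 1 and 2 are valid column positions, while 3 through 10 are not.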