mschubert / clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH
https://mschubert.github.io/clustermq/
Apache License 2.0

S3 method handlers are not exported automatically on R>4.0 #256

Closed · Zhuk66 closed this issue 8 months ago

Zhuk66 commented 3 years ago

The following code does not work with clustermq. It worked with R 3.5.3, and it still works in sequential mode. It also works with doParallel, but not with clustermq: I tested the multisession and Slurm backends, and both gave me the same error:

Error in summarize_result(job_result, n_errors, n_warnings, cond_msgs, : 5/5 jobs failed (0 warnings). Stopping. (Error #1) no applicable method for 'f' applied to an object of class "c('integer', 'numeric')"

library(future)
library(foreach)
library(clustermq)

f <- function(i) {
  UseMethod('f')
}

f.numeric <- function(i) {
  Sys.sleep(10)
  sqrt(i)
}
# Explicitly registering the S3 method does not help either
# (.S3method() was introduced in R 4.0.0)
if (getRversion() >= "4.0.0") {
  .S3method('f', 'numeric', 'f.numeric')
}

ncores <- 12
memory <- 1024

options(clustermq.scheduler="multiprocess")

register_dopar_cmq(n_jobs=ncores, memory=memory) 
# registerDoSEQ()
res <- foreach(i=1:ncores) %dopar% { f(i) }

mschubert commented 3 years ago

The issue here seems to be that while f is exported, the associated S3 handlers are not.

Minimal way to reproduce this:

f <- function(i) {
  UseMethod('f')
}

f.numeric <- function(i) {
  sqrt(i)
}

clustermq::Q(f, i=1, n_jobs=1)
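
This fails on the worker with an error along the lines of (reconstructed from the report above, not re-run here):

(Error #1) no applicable method for 'f' applied to an object of class "c('double', 'numeric')"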

A workaround for this is to export the S3 handler explicitly:

clustermq::Q(f, i=1, n_jobs=1, export=list(f.numeric=f.numeric))

This, however, should work without user intervention.
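
For the foreach-based example from the original report, an analogous workaround is to name the method explicitly in foreach's .export argument (a sketch, assuming clustermq's foreach adapter forwards .export to the job exports):

res <- foreach(i=1:ncores, .export="f.numeric") %dopar% { f(i) }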

mschubert commented 8 months ago

We'll likely move from foreach::getexports to globals::globalsOf to better identify which variables need to be exported for foreach runs.

That said, neither option currently identifies the S3 handler as required, but that is an issue that should be fixed in those packages.
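
The underlying difficulty is that S3 methods are resolved by name at run time, so a static code walker only ever sees the generic. A minimal illustration using codetools (representative of this kind of analysis, not the exact code path either package uses):

library(codetools)

f <- function(i) UseMethod('f')
f.numeric <- function(i) sqrt(i)

g <- function(x) f(x)

findGlobals(g)
#> [1] "f"
# Only the generic 'f' is found; 'f.numeric' never appears literally
# in g's code, so no static analysis of g can know it is needed.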

I'm closing this as upstream responsibility (see linked issue).