tudo-r / BatchJobs

BatchJobs: Batch computing with R

methods not loaded by Rscript #27

Closed debugpoint136 closed 10 years ago

debugpoint136 commented 10 years ago

BatchJobs execution gets halted before it can queue the jobs.

options(BatchJobs.on.slave=TRUE, BatchJobs.resources.path='/import/scratch/user/dpuru/BatchJobs-scratch/bmq_7fbd23130dcd/resources/resources_1392927734.RData')
library(BatchJobs)
Loading required package: BBmisc
res = BatchJobs:::doJob(
  reg=loadRegistry('/import/scratch/user/dpuru/BatchJobs-scratch/bmq_7fbd23130dcd'),
  ids=c(1L),
  multiple.result.files=FALSE,
  disable.mail=FALSE,
  first=1L,
  last=2L,
  array.id=NA)
2014-02-20 14:22:16: Starting job on node stluhpcprd837.
Loading registry: /import/scratch/user/dpuru/BatchJobs-scratch/bmq_7fbd23130dcd/registry.RData
Loading conf: /import/scratch/user/dpuru/BatchJobs-scratch/bmq_7fbd23130dcd/conf.RData
Auto-mailer settings: start=none, done=none, error=none.
Setting work dir: /import/scratch/user/dpuru/BatchJobs-scratch
Error in sendMail(reg, job, result.str, "", disable.mail, condition = "start", :
  could not find function "is"
Calls: -> doSingleJob -> sendMail
Setting work back to: /import/scratch/user/dpuru/BatchJobs-scratch
Memory usage according to gc:
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 306867 16.4     467875   25   350000 18.7
Vcells 448981  3.5     905753    7   905753  7.0
Execution halted

Rscript does not load the "methods" package. Putting library(methods) at the beginning of the script gets us past this error and queues the jobs, but eventually all the jobs expire because the exact same issue occurs when the slave jobs begin execution on the nodes.

Sys.sleep(0.000000)
options(BatchJobs.on.slave=TRUE, BatchJobs.resources.path='/import/scratch/user/dpuru/BatchJobs-scratch/bmq_1e8658cc07c6/resources/resources_1392937763.RData')
library(BatchJobs)
res = BatchJobs:::doJob(
  reg=loadRegistry('/import/scratch/user/dpuru/BatchJobs-scratch/bmq_1e8658cc07c6'),
  ids=c(1L),
  multiple.result.files=FALSE,
  disable.mail=FALSE,
  first=1L,
  last=2L,
  array.id=NA)
BatchJobs:::setOnSlave(FALSE)

Is there a way to include library(methods) in these files too?

mllg commented 10 years ago

I have trouble reproducing this. I created a registry, cd'd to the directory of the first job file, and then ran

R_DEFAULT_PACKAGES="stats" Rscript --vanilla 1.R

BatchJobs is not depending on methods, but some of its imported packages are. As far as I can tell, methods will get loaded:

Loading required package: BBmisc
Loading required package: methods
2014-02-24 11:58:10: Starting job on node computed.
[...]

Which version of R are you using? Have you called update.packages() recently? Is setting packages="methods" in makeRegistry/batchMapQuick a possible temporary workaround?
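A minimal sketch of the suggested workaround with makeRegistry, assuming BatchJobs is installed and the default cluster functions are in use. The registry id, the temporary directory, and the toy job function are hypothetical and chosen only for illustration.

```r
library(BatchJobs)

# Hypothetical registry id and directory, for illustration only.
reg <- makeRegistry(
  id = "methodsdemo",
  file.dir = tempfile("methodsdemo_"),
  packages = "methods"  # listed as a required package for the jobs
)

# Toy job that calls is(), which lives in the 'methods' package.
batchMap(reg, function(x) is(x, "numeric"), c(1.5, 2.5, 3.5))
submitJobs(reg)
```

Listing the package in the registry keeps the workaround in user code, so the generated job files do not have to be edited by hand.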

berndbischl commented 10 years ago

BatchJobs is not depending on methods, but some of its imported packages are.

This is correct.

But I thought there was a general R issue that Rscript does not load methods automatically?

But I wonder why we never see this problem in our tests? Does anybody else get this?

mllg commented 10 years ago

BatchJobs is not depending on methods, but some of its imported packages are.

This is correct.

Are you sure? I don't think we use anything from methods. Try setting the environment variable R_DEFAULT_PACKAGES to NULL (or set the option defaultPackages in your Rprofile) so that methods is not loaded, but things will get ugly because those checks are rarely performed.
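A rough way to try this from within R, as a sketch rather than anything BatchJobs does itself: start a child Rscript with R_DEFAULT_PACKAGES=NULL and ask whether is() is visible on its search path. The variable names are illustrative, and the env argument of system2 is assumed to behave as on Unix-like systems.

```r
# Path to the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Child process with no default packages: only 'base' is attached,
# so a function from 'methods' such as is() should not be found.
out <- system2(
  rscript,
  c("--vanilla", "-e", shQuote('cat(exists("is"))')),
  stdout = TRUE,
  env = "R_DEFAULT_PACKAGES=NULL"
)
print(out)
```

Running the same call without the env argument shows what the default Rscript configuration of the installed R version attaches.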

Fun fact: I just discovered we should import stats for setNames and maybe others (why is this in stats?).

berndbischl commented 10 years ago

All of the damn checks from R CMD CHECK and we still run into this crap...

Let's look at it when we meet in person.

mllg commented 10 years ago

I tried to set the environment variable in makeR ... breaks roxygen! What a mess...

HenrikBengtsson commented 10 years ago

The 'methods' package introduces substantial overhead when loaded, which is ok when running interactively (e.g. R), but not when running quick and short batch scripts (e.g. Rscript). I'm pretty sure that's why they differ.

Not sure who OP is, but it could be that OP's job script/function relies (explicitly or implicitly) on S4 functionality. There may be a stored S4 object or similar.

/Henrik

debugpoint136 commented 10 years ago

Version of R : 3.0.2

Adding packages=c("methods") to the registry call, so that the package is loaded as a requirement, solved this error. For example:
REG = batchMapQuick(RUN_SYSTEM,SAMPLES[,"COMMANDS"],resources=RESOURCES,packages=c("methods"))

berndbischl commented 10 years ago

I have now added library(methods) in the beginning of the slave script to avoid this.
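The same guard can be sketched for any script that is run through Rscript. This is an illustration only, not the code from the BatchJobs slave script, and the helper name methods_attached is hypothetical.

```r
# Hypothetical helper: is 'methods' currently on the search path?
methods_attached <- function() "package:methods" %in% search()

# Attach 'methods' explicitly when the R front end did not do so.
if (!methods_attached()) {
  library(methods)
}

# is() is provided by 'methods' and can now be called safely.
stopifnot(methods_attached(), is(1L, "integer"))
```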

IMHO we can all live with the slight overhead this produces (considering the kinds of jobs we are running), when this avoids problems like this where stuff just breaks.

Can we close?

mllg commented 10 years ago

We could just depend on methods instead? What about #28?

berndbischl commented 10 years ago

Yes, depending might be better.

The problem with #28, which I did look at, is that it is a pretty complicated change, at least potentially. The problem is not so much changing the code as testing everything it affects.

mllg commented 10 years ago

Fixed in 8f8736a489c19562d972aaac08b1738219b198f6.