mllg / batchtools

Tools for computation on batch systems
https://mllg.github.io/batchtools/
GNU Lesser General Public License v3.0

Descriptive missing job.rds file error #278

Closed stuvet closed 3 years ago

stuvet commented 3 years ago

I've been troubleshooting the stability of batchtools on Slurm with the default makeClusterFunctionsSlurm (PRs #276 and #277).

The last (rare) error I can reproduce is:

Expected Behaviour

Problem

Reprex

Error in gzfile(file, "rb") : cannot open the connection
Calls: <Anonymous> -> doJobCollection.character -> readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
  cannot open compressed file '.../jobs/job929872958e6074e5662a4c9hd3f312f4.rds', probable reason 'No such file or directory'

Cause

batchtools:::doJobCollection.character deletes the jobCollection .rds file on the first run, so when the failed job is requeued the file is no longer there and readRDS fails with the error above.
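
The failure mode can be illustrated outside of batchtools with a minimal sketch. The function names `run_collection` and `run_collection_checked` are hypothetical stand-ins for the read-then-delete pattern described above, not the batchtools source, and the descriptive message is only an illustration of what a clearer error could look like.

```r
# Hypothetical stand-in: read the collection file, then delete it,
# mimicking the behaviour described for the first run of a job.
run_collection <- function(path) {
  jc <- readRDS(path)  # errors if the file has already been removed
  file.remove(path)    # the first run removes the file
  jc
}

# Hypothetical variant that checks for the file first and fails with
# a message that names the likely cause.
run_collection_checked <- function(path) {
  if (!file.exists(path)) {
    stop(sprintf(
      "Job collection file '%s' not found; the job may have been requeued after the file was deleted.",
      path
    ))
  }
  run_collection(path)
}

path <- tempfile(fileext = ".rds")
saveRDS(list(job.id = 1L), path)

# First run succeeds and removes the file.
first <- run_collection(path)

# A second run (e.g. after a requeue) fails because the file is gone.
second <- tryCatch(
  suppressWarnings(run_collection(path)),
  error = function(e) conditionMessage(e)
)

# The checked variant fails too, but with the descriptive message.
checked <- tryCatch(
  run_collection_checked(path),
  error = function(e) conditionMessage(e)
)
```

Here `second` holds the terse connection error from `readRDS`, while `checked` holds the message that points at the requeue as a probable cause.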

Workaround

Questions

stuvet commented 3 years ago

Changed to 'Issue'