mllg / batchtools

Tools for computation on batch systems
https://mllg.github.io/batchtools/
GNU Lesser General Public License v3.0

segfault in batchExport #266

Open rimorob opened 3 years ago

rimorob commented 3 years ago

I'm getting the following segfault when running batchtools via the doFuture/future.batchtools/batchtools stack. The crash technically occurs in dir_map, but I suspect the underlying cause is a race condition. My only basis for this is that some of the job files have out-of-order timestamps. I can help isolate and reproduce the error, but I could really use advice on what it might be so I can narrow it down: my current code has to run for many hours on 300 cores before the error appears, and a run with only 40 cores doesn't produce any issues. I don't really know the batchtools package at all, so I'm asking for a "gut feel" about which direction to look in. Thanks in advance!

Boris

caught segfault address 0x7ffe38150ff8, cause 'memory not mapped'

Traceback:
 1: dir_map(old, identity, all, recurse, type, fail)
 2: dir_ls(old, type = "directory", recurse = TRUE, all = TRUE)
 3: dir_delete(old[dirs])
 4: fs::file_delete(x[fs::file_exists(x)])
 5: file_remove(file)
 6: (function (object, file, compress = "gzip") { file_remove(file) saveRDS(object, file = file, version = 2L, compress = compress) waitForFile(file, 300) invisible(TRUE)})(object = dots[[1L]][[23L]], file = dots[[2L]][[23L]], compress = dots[[3L]][[1L]])
 7: mapply(FUN = f, ..., SIMPLIFY = FALSE)
 8: Map(writeRDS, object = export, file = fn, compress = reg$compress)
 9: batchExport(export = future$globals, reg = reg)
10: run.BatchtoolsFuture(future)