Closed LTLA closed 5 years ago
Nudging this issue; is this going to get resolved before the next release? Currently both scran and BiocSingular are failing due to MulticoreParam
relying on an unexported mcparallel
on Windows.
By default, BiocParallel registers the MulticoreParam backend on linux-like, and SnowParam on non-Linux (aka Windows), and the default bpparam()
just does the right thing, so I'm not seeing why 'everyone' has to do anything other than nothing?
That's the thing - I tried using bpparam()
in my examples/tests and I timed out on the build machines. Even the one-off examples were taking 10 times as long; it wasn't (only) a question of doing bpstart
at the start of every function or before each function call, though that does help.
Some people would also like to be able to specify the number of workers rather than the default, and bpparam
doesn't give the option to do that. So if we were to not use MulticoreParam
, I would ask for:
bpparamWithoutSnow(nworkers)
function that uses MulticoreParam(nworkers)
on Linux-likes and would just drop to a SerialParam
on Windows. MulticoreParam
on Windows, rather than the error being delayed until someone actually tries to use that object.I guess the long evaluation times are because you're serializing large objects to the workers, or because you're creating workers each of which takes up a very large amount of space? The 'easy' simulation is to use SnowParam on your own system. Both SnowParam and MulticoreParam respect options(mc.cores=)
. Yes probably MulticoreParam() should just say 'no can do' on Windows.
SnowParam
is fast enough on my local machine, so I don't know why it causes time-outs on the BioC machines. If I had to guess, I would say that every Delayed::%*%
was dispatching out to SnowParam
on the windows machines; this added a start-up cost that accumulated over many short calls to a function that was not expecting any parallelization (and thus had no reason for me to bpstart
, etc.). Presumably the start-up costs are much higher on the build machines, though I don't know how they're configured.
It seems like a bpparam
equivalent that switches between Multicore/Serial would be useful - DelayedArray does something similar to avoid Snow. In the meantime, I have added code like this to all my unit tests:
safeBPParam <- function(nworkers) {
if (.Platform$OS.type=="windows") {
BiocParallel::SerialParam()
} else {
BiocParallel::MulticoreParam(nworkers)
}
}
... to evade Snow on windows and use Multi otherwise on supported linux systems.
This is a continuation of discussions I've had with @mtmorgan on the BioC slack channel, put here to allow us to keep better track of things (and hopefully reach some conclusion).
The crux of the issue is that scran and BiocSingular were failing on the Windows build machines because their example/test/vignette code used
MulticoreParam()
. This was not a problem in the last release where callingMulticoreParam()
on Windows yielded aSerialParam
object. However, this fail-safe was removed in the BioC-devel version of BiocParallel, hence the errors about, e.g., unexportedmcparallel
.This led to a discussion of whether
MulticoreParam()
should return aSerialParam
on Windows. As forking isn't supported on Windows, it is philosophically sensible forMulticoreParam()
to simply fail there. However, this would be quite inconvenient from a practical perspective. It is very common for users to useMulticoreParam()
in their analysis scripts, as this is the easiest and fastest parallelization scheme on non-Unix systems where most heavy-duty processing is done; these scripts would fail hard on Windows.Users would then either have to manually change the
BPPARAM=
to get the scripts running again; or everyone would write defensive boilerplate code in their scripts to the tune of:But if that's the case, we might as well have
MulticoreParam()
spit out aSerialParam
in the first place, saving everyone some time and effort. I'll point out that usingbpparam()
is unacceptable here asSnowParam()
is too slow to pass the BioC Windows builds.Edit: The scran errors are actually slightly different now - some of these are expected, so the most relevant failure to this discussion is that there seems to be a change to how the usage (not construction) of the return value of
MulticoreParam()
affects the seed on Windows... bit of a head-scratcher.