Bioconductor / BiocParallel

Bioconductor facilities for parallel evaluation
https://bioconductor.org/packages/BiocParallel
67 stars 29 forks source link

Supporting MulticoreParam on Windows #98

Closed LTLA closed 5 years ago

LTLA commented 5 years ago

This is a continuation of discussions I've had with @mtmorgan on the BioC slack channel, put here to allow us to keep better track of things (and hopefully reach some conclusion).

The crux of the issue is that scran and BiocSingular were failing on the Windows build machines because their example/test/vignette code used MulticoreParam(). This was not a problem in the last release where calling MulticoreParam() on Windows yielded a SerialParam object. However, this fail-safe was removed in the BioC-devel version of BiocParallel, hence the errors about, e.g., unexported mcparallel.

This led to a discussion of whether MulticoreParam() should return a SerialParam on Windows. As forking isn't supported on Windows, it is philosophically sensible for MulticoreParam() to simply fail there. However, this would be quite inconvenient from a practical perspective. It is very common for users to use MulticoreParam() in their analysis scripts, as this is the easiest and fastest parallelization scheme on non-Unix systems where most heavy-duty processing is done; these scripts would fail hard on Windows.

Users would then either have to manually change the BPPARAM= to get the scripts running again; or everyone would write defensive boilerplate code in their scripts to the tune of:

param.opt <- if (.Platform$OS.type=="windows") {
    SerialParam()
} else {
     MulticoreParam(2)
}

But if that's the case, we might as well have MulticoreParam() spit out a SerialParam in the first place, saving everyone some time and effort. I'll point out that using bpparam() is unacceptable here as SnowParam() is too slow to pass the BioC Windows builds.

Edit: The scran errors are actually slightly different now - some of these are expected, so the most relevant failure to this discussion is that there seems to be a change to how the usage (not construction) of the return value of MulticoreParam() affects the seed on Windows... bit of a head-scratcher.

LTLA commented 5 years ago

Nudging this issue; is this going to get resolved before the next release? Currently both scran and BiocSingular are failing due to MulticoreParam relying on an unexported mcparallel on Windows.

mtmorgan commented 5 years ago

By default, BiocParallel registers the MulticoreParam backend on linux-like, and SnowParam on non-Linux (aka Windows), and the default bpparam() just does the right thing, so I'm not seeing why 'everyone' has to do anything other than nothing?

LTLA commented 5 years ago

That's the thing - I tried using bpparam() in my examples/tests and I timed out on the build machines. Even the one-off examples were taking 10 times as long; it wasn't (only) a question of doing bpstart at the start of every function or before each function call, though that does help.

Some people would also like to be able to specify the number of workers rather than the default, and bpparam doesn't give the option to do that. So if we were to not use MulticoreParam, I would ask for:

mtmorgan commented 5 years ago

I guess the long evaluation times are because you're serializing large objects to the workers, or because you're creating workers each of which takes up a very large amount of space? The 'easy' simulation is to use SnowParam on your own system. Both SnowParam and MulticoreParam respect options(mc.cores=). Yes probably MulticoreParam() should just say 'no can do' on Windows.

LTLA commented 5 years ago

SnowParam is fast enough on my local machine, so I don't know why it causes time-outs on the BioC machines. If I had to guess, I would say that every Delayed::%*% was dispatching out to SnowParam on the windows machines; this added a start-up cost that accumulated over many short calls to a function that was not expecting any parallelization (and thus had no reason for me to bpstart, etc.). Presumably the start-up costs are much higher on the build machines, though I don't know how they're configured.

It seems like a bpparam equivalent that switches between Multicore/Serial would be useful - DelayedArray does something similar to avoid Snow. In the meantime, I have added code like this to all my unit tests:

safeBPParam <- function(nworkers) {
    if (.Platform$OS.type=="windows") {
        BiocParallel::SerialParam()
    } else {
        BiocParallel::MulticoreParam(nworkers)
    }
}

... to evade Snow on windows and use Multi otherwise on supported linux systems.