YosefLab / scone

53 stars 12 forks source link

POTENTIAL BUG: bplapply() and globals #57

Closed HenrikBengtsson closed 8 years ago

HenrikBengtsson commented 8 years ago

Just a comment after having talked about bplapply() with @drisso the other day;

Make sure to test the functions that call bplapply() with other backends than sequential and multicore too. For instance, test them with backends that run in separate background R processes, e.g. PSOCK clusters (parallel::makeCluster(2L)). Because, these require that all variables are properly exported to the worker (something you don't have to worry about when you use sequential and multicore workers).

Example:

> library("BiocParallel")
> a <- 42

> register(MulticoreParam(workers = 2))
> y <- bplapply(1:3, FUN = function(x) x + a)
> unlist(y)
[1] 43 44 45

> register(SnowParam(workers = 2, type = "SOCK"))
> y <- bplapply(1:3, FUN = function(x) x + a)
starting worker localhost:11794
starting worker localhost:11794
Error: BiocParallel errors
  element index: 1, 2, 3
  first error: object 'a' not found

Here's an example where I suspect you'll get an error because globals (zhat, muhat, coefs_pi) are not exported:

    fit_pi <- bplapply(seq_len(n), function(i) {
      fit <- suppressWarnings(glm(zhat[,i]~log(muhat), family = binomial(link = logit), start=coefs_pi[,i]))
      return(list(coefs=coefficients(fit), fitted=fitted.values(fit)))
    })
drisso commented 8 years ago

I have added (for now in develop) a set of tests running scone with many different back-ends, including BiocParallel::SnowParam(workers=2, type="SOCK").

All tests pass and I also ran the vignette with this back-end with no issue. The particular example that @HenrikBengtsson references is now gone (we decided to not pursue zinb() imputation), but there are still calls to bplapply that look suspicious, like:

  scaled <- bplapply(seq_len(NROW(sc_params)), function(i) scaling[[sc_params[i,2]]](imputed[[sc_params[i,1]]]), BPPARAM=bpparam)

However, this seems to work fine. Perhaps there is some automatic copy going on here? @HenrikBengtsson any thoughts?

HenrikBengtsson commented 8 years ago

Although I'm not a super savvy BiocParallel users, I don't think BiocParallel exports globals automatically, so I am a bit surprised that your most recent bplapply() snippet

scaled <- bplapply(seq_len(NROW(sc_params)), function(i) scaling[[sc_params[i,2]]](imputed[[sc_params[i,1]]]), BPPARAM=bpparam)

doesn't give an error reporting on object 'scaling' not found (or sc_params or imputed) when using an external R process as the backend such as SnowParam(workers=2, type="SOCK"). Are you 100% sure your package tests covers that piece of code?

@mtmorgan, do you have any comments?

mtmorgan commented 8 years ago

BiocParallel is behaving like snow for these backends, so the function and the environment in which it is defined are copied. So

register(SnowParam(2))
fun0 = function(i) a[i]
p0 = function() {
    a = 1:2
    bplapply(1:2, fun0)    # fails
}

p1a = function() {
    fun1 = function(i) a[i]
    a = 1:2
    bplapply(1:2, fun1)    # works
}

p1b = function() {
    a=1:2
    bplapply(1:2, function(i) a[i])
}

p2 = function() {
    fun1 = function(i) a[i]
    a = 1:2
    x = integer(1e8)
    bplapply(1:2, fun1)    # works, expensive!
}

leads to

> bpstart(bpparam())
> p0()
Error: BiocParallel errors
  element index: 1, 2
  first error: object 'a' not found
> system.time(p1a())
   user  system elapsed 
  0.004   0.000   0.074 
> system.time(p2())
   user  system elapsed 
  1.164   0.096   1.583 
> bpstop(bpparam())

I'm pretty sure too that a function defined at the (top level) of package actually has the package namespace serialized to the workers, so all variables in the package are available.

HenrikBengtsson commented 8 years ago

Ah, that could be why it works, i.e. they probably have the p1b() case. Thxs Martin.

drisso commented 8 years ago

Thank you Martin!

Now I have a much better idea of what's going on. Which makes me worried about the large objects that are defined and hence exported. But that's a different issue, so I should close this one.