Bioconductor / BiocParallel

Bioconductor facilities for parallel evaluation
https://bioconductor.org/packages/BiocParallel

Error using conflicted with BiocParallel in R4.2.0 #223

Closed hxin closed 1 year ago

hxin commented 1 year ago

Hi,

We have used the conflicted package with BiocParallel in our pipeline without issue under R 4.0.3. After recently updating R to 4.2.0, we get the following error: Error: worker evaluation failed: there is no package called ‘.conflicts’

I have reported the issue in the conflicted repository with details to reproduce it, but I also want to mention it here, as I am not sure whether the issue is rooted in the conflicted package or in BiocParallel.

Thanks

mtmorgan commented 1 year ago

Thanks, I responded to the conflicted issue with comment https://github.com/r-lib/conflicted/issues/71#issuecomment-1269729941


A recent version of BiocParallel tries to be 'smarter' about automatically exporting variables used in the parallel function. Turn this behavior off by adding exportvariables = FALSE to the SnowParam() call:

> bplapply(c(1:2), prepare_worker, BPPARAM = SnowParam())
Error: BiocParallel errors
  2 remote errors, element index: 1, 2
  0 unevaluated and other errors
  first remote error:
Error: worker evaluation failed:
  there is no package called ‘.conflicts’
> bplapply(c(1:2), prepare_worker, BPPARAM = SnowParam(exportvariables=FALSE))
[[1]]
[1] "conflicted" "stats"      "graphics"   "grDevices"  "utils"
[6] "datasets"   "methods"    "base"

[[2]]
[1] "conflicted" "stats"      "graphics"   "grDevices"  "utils"
[6] "datasets"   "methods"    "base"

Thanks for the simple reproducible example!


@Jiefei-Wang or I will be able to understand whether this particular problem can be avoided more directly, so I will leave the current issue open.

hxin commented 1 year ago

Thanks a lot for the quick reply.

Adding the parameter exportvariables = FALSE does solve the problem in the newer version. However, it seems that SnowParam() checks its arguments and raises an error if a parameter is not defined as an input of the function, which is what happens with the older version. Is there a backward-compatible approach to solve this issue, so that the same script works with both the older and the newer version of BiocParallel?

library(BiocParallel)
para <- SnowParam(workers = 2, exportvariables = FALSE)
Error in .prototype_update(.SnowParam_prototype, .clusterargs = clusterargs,  :
  all(names(args) %in% names(prototype)) is not TRUE

mtmorgan commented 1 year ago

Create the param, then update it only if running the newer version?

p = SnowParam()
if (packageVersion("BiocParallel") >= "1.30.0")
    bpexportvariables(p) <- FALSE

Actually, from news(package = "BiocParallel") the change was introduced in 'devel' version 1.29.19, so the test could be `packageVersion("BiocParallel") >= "1.29.19"`.
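
As a minimal sketch of why the version test compares against a package_version object rather than a plain string: the helper name has_exportvariables() and the sample version strings below are hypothetical, used only for illustration; the 1.29.19 threshold is the one mentioned above.

```r
# Hypothetical helper (not part of BiocParallel): TRUE when a version
# string is at or above the 1.29.19 threshold mentioned above.
has_exportvariables <- function(version) {
    package_version(version) >= "1.29.19"
}

has_exportvariables("1.28.3")    # FALSE
has_exportvariables("1.29.19")   # TRUE
has_exportvariables("1.30.0")    # TRUE

# Version objects compare component by component as numbers ...
package_version("1.29.9") >= "1.29.19"   # FALSE, because 9 < 19
# ... whereas plain strings compare character by character.
"1.29.9" >= "1.29.19"                    # TRUE, because "9" sorts after "1"
```

packageVersion() already returns such a version object, so the comparison in the snippet above behaves like the first form, not the second.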

hxin commented 1 year ago

Thank you for the response. What I meant was to have the new version work without changing the old scripts. For example, if exportvariables were FALSE by default, it would not cause this issue in the old scripts. Or is there a way to globally control the default value of this parameter, so that I don't need to change anything in the old scripts? Something like the following:

echo 'exportvariables = FALSE ' >> .Rprofile
Rscript old_script.R

mtmorgan commented 1 year ago

There is currently no such mechanism, but it is possible that this can be fixed in an updated version of BiocParallel; that might take several days / a week.

hxin commented 1 year ago

Thank you very much for your help. Yes, if exportvariables = TRUE stops causing this issue, that would be another solution.

Jiefei-Wang commented 1 year ago

A recent version of BiocParallel tries to be 'smarter' about automatically exporting variables used in the parallel function.

This is a case where BiocParallel is not 'smart' enough...

The root problem is that the function utils::find, which BiocParallel uses to search for undefined symbols, can return internal objects (names starting with a dot). I do not know exactly what .conflicts is for, but it seems necessary to filter these internal names out. I can create a pull request to fix this issue.

> find("library")
[1] ".conflicts"   "package:base"
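
The filtering idea can be sketched in base R. This is only an illustration under stated assumptions, not the actual pull request: it attaches a stand-in environment named .conflicts (the name comes from the output above, the contents are made up) so the behaviour can be reproduced without loading conflicted.

```r
# Stand-in for the environment seen in the output above. The name
# ".conflicts" is taken from that output; the contents are invented.
attach(list(library = function(...) NULL), name = ".conflicts",
       warn.conflicts = FALSE)

# utils::find() reports every search-path entry that defines the name.
hits <- utils::find("library")
hits
# [1] ".conflicts"   "package:base"

# Keep only the entries whose name does not start with a dot.
kept <- hits[!startsWith(hits, ".")]
kept
# [1] "package:base"

# Remove the stand-in environment again.
detach(".conflicts", character.only = TRUE)
```

With the dot-prefixed entry dropped, only real packages remain as candidates, which is the behaviour the proposed fix is aiming for.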