gdrplatform / gDRcore

R package to process dose-response curve data with the GR methods
https://gdrplatform.github.io/gDRcore
1 stars 1 forks source link

Setting MulticoreParam global option breaks BiocParallel::register #25

Closed ChristopherEeles closed 1 year ago

ChristopherEeles commented 1 year ago

In zzz.R you have set a global option for the default BiocParallel back-end. This breaks the the normal mechanism for mutating the BiocParallel back-end via BiocParallel::register.

If you need to register a default back-end, use:

BiocParallel::register(BPPARAM)

But IMO BiocParallel does a pretty good job of picking appropriate defaults so this is probably unnecessary.

It is also idomatic to allow users to pass BPPARAM into any function which is being parallelized via BiocParallel. This (1) documents that the function is parallelized and (2) allows setting a runtime back-end without mutating global state. This can be done explictly with a BPPARAM argument or via passing ... to the BiocParallel calls (which is probably preferable since there are other options you can set).

Also a general FYI that if you are submitting to Bioconductor (or CRAN) they do not like it when you mutate a users global state. You can work around this by setting your options inside the appropriate package namespace (i.e. via assign) instead of the global environment or by setting the variable temporarily inside your functions and resetting via on.exit when they finish executing.

If you absolutely need global options, at least prepend the variables with your package name to ensure they do not override any user or other package defaults.

REPREX

library(BiocParallel)
options("MulticoreParam")
## $MulticoreParam
## NULL
nthread <- 10L
BiocParallel::register(MulticoreParam(workers=nthread))
BiocParallel::bpparam()
## class: MulticoreParam
##   bpisup: FALSE; bpnworkers: 10; bptasks: 0; bpjobname: BPJOB
##   bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
##   bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
##   bpexportglobals: TRUE
##   bplogdir: NA
##   bpresultdir: NA
##   cluster type: FORK
library(gDR)
## NOTE: It sets a global option
options("MutlicoreParam")
## $MulticoreParam
## class: MulticoreParam
##   bpisup: FALSE; bpnworkers: 22; bptasks: 0; bpjobname: BPJOB
##   bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
##   bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
##   bpexportglobals: TRUE
##   bplogdir: NA
##   bpresultdir: NA
##   cluster type: FORK
## NOTE: The option overrides my registered back-end
BiocParallel::bpparam()
## class: MulticoreParam
##   bpisup: FALSE; bpnworkers: 22; bptasks: 0; bpjobname: BPJOB
##   bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
##   bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
##   bpexportglobals: TRUE
##   bplogdir: NA
##   bpresultdir: NA
##   cluster type: FORK
## NOTE: And I can no longer change my back-end!
BiocParallel::register(MulticoreParam(workers=nthread))
BiocParallel::bpparam()
## class: MulticoreParam
##   bpisup: FALSE; bpnworkers: 22; bptasks: 0; bpjobname: BPJOB
##   bplog: FALSE; bpthreshold: INFO; bpstopOnError: TRUE
##   bpRNGseed: ; bptimeout: 2592000; bpprogressbar: FALSE
##   bpexportglobals: TRUE
##   bplogdir: NA
##   bpresultdir: NA
##   cluster type: FORK
ChristopherEeles commented 1 year ago

It also seems that deleting the option in the global environment doesn't work. Maybe because the spawned processes are reloading gDR or gDRcore?

ChristopherEeles commented 1 year ago

Nevermind deleting the option does work but there are nested calls to parallelize inside of map_df (which is also a NAMESPACE clash with purrr:::map_df making it kind of confusing).

bczech commented 1 year ago

Hi @ChristopherEeles,

setting global options was removed in #27.