Closed guidohooiveld closed 7 years ago
I can't reproduce your issue.
> require(DOSE)
Loading required package: DOSE
> packageVersion("DOSE")
[1] ‘3.0.0’
> data(geneList)
> x = gseDO(geneList)
[1] "preparing geneSet collections..."
[1] "GSEA analysis..."
[1] "leading edge analysis..."
[1] "done..."
> head(x, 2)
ID Description setSize enrichmentScore NES
DOID:1492 DOID:1492 eye and adnexa disease 459 -0.3105160 -1.382493
DOID:5614 DOID:5614 eye disease 450 -0.3125247 -1.390034
pvalue p.adjust qvalues rank leading_edge
DOID:1492 0.001230012 0.03797232 0.02535098 1793 tags=22%, list=14%, signal=19%
DOID:5614 0.001239157 0.03797232 0.02535098 1768 tags=22%, list=14%, signal=19%
core_enrichment
DOID:1492 3371/3082/5914/2878/4153/3791/23247/1543/80184/6750/1958/2098/7450/596/9187/2034/482/948/1490/1280/3931/5737/4314/4881/2261/3426/187/629/6403/7042/6785/7507/2934/5176/4060/1277/7078/5950/2057/727/10516/4311/2247/1295/358/10203/2192/582/10218/57125/3485/585/1675/6310/2202/4313/2944/4254/3075/1501/2099/3480/4653/1195/6387/3305/1471/857/4016/1909/4053/6678/1296/7033/4915/55812/1191/5654/10631/2152/2697/7043/2952/6935/2200/3572/7177/7031/3479/2006/10451/9370/771/3117/125/652/4693/5346/1524
DOID:5614 3082/5914/2878/4153/3791/23247/1543/80184/6750/1958/2098/7450/596/9187/2034/482/948/1490/1280/3931/5737/4314/4881/2261/3426/187/629/6403/7042/6785/7507/2934/5176/4060/1277/7078/5950/2057/727/10516/4311/2247/1295/358/10203/2192/582/10218/57125/3485/585/1675/6310/2202/4313/2944/4254/3075/1501/2099/3480/4653/6387/3305/1471/857/4016/1909/4053/6678/1296/7033/4915/55812/1191/5654/10631/2152/2697/7043/2952/6935/2200/3572/7177/7031/3479/2006/10451/9370/771/3117/125/652/4693/5346/1524
From your error msg: Error: 'mcparallel' is not an exported object from 'namespace:parallel'
, it seems that DOSE
import mcparallel
from parallel
, but actually I never import it, https://github.com/GuangchuangYu/DOSE/blob/master/NAMESPACE.
For parallel, DOSE
use BiocParallel
package as backend, which also not import mcparallel
, see https://github.com/Bioconductor/BiocParallel/blob/master/NAMESPACE.
Thanks for having a look at this. I got your point. Unfortunately, exact same error still persists, even when running a freshly installed R v3.3.1 (thus not patched v3.3.1). I only ran the R-installer, followed by biocLite("DOSE"). That was all. Could this be a Win-specific problem?
It could be. You can file an issue of BiocParallel
in bioconductor support site.
OK, will do so.
The DOSE code at https://github.com/GuangchuangYu/DOSE/blob/8b57eec7965d58fc7c84a63d6383707484803aeb/R/gsea.R#L211 does
bpstart(MulticoreParam())
which generates first the warning and then error (use options(warn=1)
to seem them in correct temporal sequence) -- it is trying to start a multicore back-end on unsupported windows.
Error: 'mcparallel' is not an exported object from 'namespace:parallel'
In addition: Warning message:
In MulticoreParam(workers, progressbar = verbose) :
MulticoreParam() not supported on Windows, use SnowParam()
The intention is to let the user specify the back-end either implicitly or by register()
ing the desired back-end. So code use might be
if (!bpisup()) {
bpstart()
on.exit(bpstop())
}
bplapply(1:nperm, function(i) {
if (seed)
set.seed(seeds[i])
perm.gseaEScore(geneList, selected.gs, exponent)
})
(or just bplapply()
, since the cluster is used only for the evaluation).
@mtmorgan thanks for pointing it out.
@guidohooiveld pls test with the github version, it should be fixed.
@GuangchuangYu: yes, working now (although you may want to check the message that are returned).
> packageVersion("DOSE")
[1] ‘3.1.1’
>
> library(DOSE)
> data(geneList)
> x <- gseDO(geneList,by="DOSE")
preparing geneSet collections...
calculating observed enrichment scores...
calculating permutation scores...
<environment: namespace:base>
cpu
elapsed
transient
<environment: namespace:base>
expr
...
<environment: namespace:DOSE>
geneList
geneSets
exponent
[1]
i
[2]
<environment: namespace:DOSE>
geneList
<environment: namespace:DOSE>
geneList
geneSet
exponent
fortify
<environment: namespace:base>
x
row.names
optional
...
nm
width.cutoff
collapse
<environment: namespace:base>
x
name
value
<environment: namespace:base>
cpu
elapsed
transient
<environment: namespace:base>
expr
...
<environment: namespace:DOSE>
geneList
geneSets
exponent
[1]
i
[2]
<environment: namespace:DOSE>
geneList
<environment: namespace:DOSE>
geneList
geneSet
exponent
fortify
<environment: namespace:base>
x
row.names
optional
...
nm
width.cutoff
collapse
<environment: namespace:base>
x
name
value
calculating p values...
leading edge analysis...
done...
> head(x,2)
ID Description setSize enrichmentScore
DOID:5679 DOID:5679 retinal disease 299 -0.3676313
DOID:0060040 DOID:0060040 pervasive developmental disorder 175 -0.3909104
NES pvalue p.adjust qvalues rank
DOID:5679 -1.575727 0.001288660 0.03595402 0.02395644 1768
DOID:0060040 -1.570647 0.001404494 0.03595402 0.02395644 2308
leading_edge
DOID:5679 tags=24%, list=14%, signal=21%
DOID:0060040 tags=30%, list=18%, signal=25%
core_enrichment
DOID:5679 2878/3791/23247/80184/6750/7450/596/9187/2034/482/948/1490/1280/5737/4314/4881/3426/187/629/6403/6785/2934/5176/7078/5950/727/10516/4311/2247/1295/358/10203/582/10218/57125/585/1675/6310/2202/4313/2944/4254/3075/2099/3480/4653/6387/1471/857/4016/1909/4053/6678/1296/4915/55812/1191/5654/10631/2697/2952/6935/2200/3479/2006/10451/9370/771/652/4693/5346/1524
DOID:0060040 1760/9732/7337/5175/6532/54806/6326/1499/7157/221037/627/3399/2571/3082/3791/27347/596/22829/23426/324/5021/4885/7248/8604/3397/4208/3400/26470/64221/9037/3952/93664/2944/7102/2550/4915/4922/26960/1746/2697/6863/2891/367/4128/7166/6505/5348/18/9370/57502/79083
>
Now I only use:
https://github.com/GuangchuangYu/DOSE/blob/master/R/gsea.R#L217-L221
permScores <- bplapply(1:nPerm, function(i) {
if (seed)
set.seed(seeds[i])
perm.gseaEScore(geneList, selected.gs, exponent)
})
and let bpparam()
to select proper BPPARAM
(registered()[[1]]
).
I can't reproduce the unexpected message
on Mac OSX
.
@mtmorgan Is this an issue of BiocParallel
on Windows
?
To my knowledge, it may due to importing function improperly. This is quite likely, as BiocParallel
use ::
and :::
quite often.
~/g/o/B/R ❯❯❯ cat * |grep '::'
prev.config <- BatchJobs::getConfig()
on.exit(do.call(BatchJobs::setConfig, prev.config))
new.conf <- unclass(do.call(BatchJobs::setConfig,
setMethod("bpbackend", "BatchJobsParam", function(x) BatchJobs::getConfig())
prev.config <- BatchJobs::getConfig()
on.exit(BatchJobs::setConfig(conf=prev.config), add=TRUE)
BatchJobs::setConfig(conf=BPPARAM$conf.pars)
reg <- do.call(BatchJobs::makeRegistry, reg.pars)
ids <- BatchJobs::batchMap(
} else BBmisc::chunk(ids, n.chunks=bpnworkers(BPPARAM), shuffle=TRUE)
do.call(BatchJobs::submitJobs, submit.pars)
BatchJobs::waitForJobs(reg, ids, timeout=30L * 24L * 60L * 60L,
BatchJobs::loadResults(reg, ids, use.names="none")
foreach::getDoParWorkers()
isNamespaceLoaded("foreach") && foreach::getDoParRegistered() &&
(foreach::getDoParName() != "doSEQ") &&
(foreach::getDoParWorkers() > 1L)
`%dopar%` <- foreach::`%dopar%`
foreach::foreach(X=X, .errorhandling=handle) %dopar% FUN(X, ...)
max(1L, parallel::detectCores() - 2L)
cores <- max(1L, parallel::detectCores() - 2L)
bpbackend(x) <- do.call(parallel::makeCluster, cargs)
parallel::clusterSetRNGStream(bpbackend(x), bpRNGseed(x))
parallel::clusterExport(bpbackend(x), "timeout", env=environment())
parallel::stopCluster(bpbackend(x))
BatchJobs:::checkDir(bpresultdir(BPPARAM))
BatchJobs:::checkDir(bplogdir(BPPARAM))
BatchJobs:::checkDir(bpresultdir(BPPARAM))
BatchJobs:::checkDir(bplogdir(BPPARAM))
### parallel::SOCKcluster types
parallel:::stopCluster.default
### snow::MPI
Rmpi::mpi.comm.get.parent(intercomm)
Rmpi::mpi.intercomm.merge(intercomm,1,comm)
Rmpi::mpi.comm.set.errhandler(comm)
Rmpi::mpi.comm.disconnect(intercomm)
bploop(snow::makeMPImaster(comm))
Rmpi::mpi.comm.disconnect(comm)
Rmpi::mpi.quit()
### parallel::FORK
parallel::mcparallel({
parallel::clusterApply(cl, seq_along(cl), .bufferload,
parallel:::sendData(node, data)
parallel:::sendData(node, data)
parallel:::recvData(node)
parallel:::recvOneData(cluster)
parallel:::closeNode(node)
## bploop.lapply(): derived from snow::dynamicClusterApply.
## Derived from snow::dynamicClusterApply and parallel::mclapply.
unknown <- codetools::findGlobals(fun)
futile.logger::flog.threshold(threshold)
futile.logger::flog.info("loading futile.logger package")
futile.logger::flog.appender(.log_buffer_append, 'ROOT')
futile.logger::flog.warn(fmt, ...)
futile.logger::flog.error(fmt, ...)
## derived from plyr::progress_text()
multicore <- (parallel::detectCores() - 2L) > 1L
~/g/o/B/R ❯❯❯ cat * |grep ':::'
BatchJobs:::checkDir(bpresultdir(BPPARAM))
BatchJobs:::checkDir(bplogdir(BPPARAM))
BatchJobs:::checkDir(bpresultdir(BPPARAM))
BatchJobs:::checkDir(bplogdir(BPPARAM))
parallel:::stopCluster.default
parallel:::sendData(node, data)
parallel:::sendData(node, data)
parallel:::recvData(node)
parallel:::recvOneData(cluster)
parallel:::closeNode(node)
If you register(SnowParam())
you'll be getting the same backend as on Windows, this might help to emulate the problem Guido is reporting. The output looks something like ls() or ??? It could be some left-over de-bugging code in BiocParallel, but I'm not able to reproduce it outside of DOSE.
For the archive: I noticed this issue has been resolved in DOSE v3.0.4, which also was confirmed by Guangchuang here: https://github.com/GuangchuangYu/ReactomePA/issues/11#issuecomment-260289422
I found a bug when running gseDO(). Reproducible example (from https://guangchuangyu.github.io/2016/07/leading-edge-analysis/ , except that I did not use the default "by='fgsea'").
Thanks for having a look at this. G