Open xiuru opened 4 years ago
I can't tell from this. But there seems to be a warning msg in parallel computing part. Can you try to use single core? Do:
single = MulticoreParam(workers=1, progressbar=TRUE)
dmlTest.sm = DMLtest(BSobj, group1=c("BS1", "BS2"), group2=c("MC1", "MC2"), smoothing=TRUE, BPPARAM=single)
@haowulab Thanks for your suggestion. A single core works well for my data, so maybe there is something wrong with my BiocParallel installation. I will reinstall BiocParallel and try multiple cores for DMLtest.
Thanks!
I have an issue running DMLtest with more than a single core. The progress bar stays at 0% for over an hour whenever I use more than one core. I'm working with the human genome, and a single core takes several hours just to compare 2 samples, when in reality I want to compare several more. I've tried reinstalling BiocParallel to no avail. I am running R v4.1.0 on Ubuntu 20.04.2. Is this a problem specific to parallelization on Ubuntu? I saw this issue: https://github.com/Bioconductor/BiocParallel/issues/106. Unfortunately, I cannot figure out how to troubleshoot this for DSS. Any thoughts?
I don't know. Can you run other BiocParallel code on Ubuntu?
It seems that I have the same problem. The progress bar stays at 0% for hours, even with 50 or 80 threads running. By the way, I am using CentOS 7.
I really can't tell. Are you using a desktop computer running Ubuntu? There might be problems running BiocParallel on an HPC cluster with a scheduler such as SGE. Can you run other code that uses BiocParallel?
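One way to answer that question is a small job that does not involve DSS at all. The sketch below is only an illustration: slow_square is a made-up toy function, and the worker count of 2 is an arbitrary choice. It runs the same four tasks through bplapply once serially and once on two forked workers. If the multicore run hangs or is much slower here too, the problem probably lies in BiocParallel or the machine setup rather than in DSS.

```r
library(BiocParallel)

# Toy task (hypothetical): sleep briefly, then square the input
slow_square <- function(i) {
  Sys.sleep(0.5)
  i^2
}

# Run the same job on one process and on two forked workers
t_serial <- system.time(
  res_serial <- bplapply(1:4, slow_square, BPPARAM = SerialParam())
)
t_multi <- system.time(
  res_multi <- bplapply(1:4, slow_square, BPPARAM = MulticoreParam(workers = 2))
)

# Both runs must return the same values
stopifnot(identical(unlist(res_serial), unlist(res_multi)))

# Compare wall-clock time of the two runs
print(c(serial = t_serial[["elapsed"]], multicore = t_multi[["elapsed"]]))
```

On a setup where forking works as expected, the multicore elapsed time should come out lower than the serial one for this sleep-bound task.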
I don't know whether this comes from upstream (BiocParallel) or not, but I'm experiencing an issue that seems related. Using the example code from the DMLtest help page on RStudio Server, multiple cores are much slower than a single core: about 32 minutes with 128 workers versus about 9 seconds with one.
> mParam = MulticoreParam(workers=128, progressbar=TRUE)
> timestamp(); dmlTest1 <- DMLtest(BSobj, group1=c("C1", "C2"), group2=c("N1", "N2"), BPPARAM=mParam); timestamp()
##------ Wed Mar 30 10:13:40 2022 ------##
Estimating dispersion for each CpG site, this will take a while ...
|=======================================================| 100%
|===========================================| 100%
##------ Wed Mar 30 10:46:05 2022 ------##
> timestamp(); dmlTest1 <- DMLtest(BSobj, group1=c("C1", "C2"), group2=c("N1", "N2"), BPPARAM=single); timestamp()
##------ Wed Mar 30 10:51:37 2022 ------##
Estimating dispersion for each CpG site, this will take a while ...
|===========================================| 100%
|===========================================| 100%
##------ Wed Mar 30 10:51:46 2022 ------##
To all users experiencing problems with parallel computing:
DSS used to use BiocParallel for parallel computing. However, some recent changes in BiocParallel make it very slow. I asked on the Bioconductor support site, but nobody replied. You can see my post at https://support.bioconductor.org/p/9140528/ and try the code there.
I modified DSS to use another package. You can see some description at http://www.bioconductor.org/packages/devel/bioc/vignettes/DSS/inst/doc/DSS.html#331_Parallel_computing_for_DMLDMR_detection_from_two-group_comparison.
The new package is available as “development” version at http://www.bioconductor.org/packages/devel/bioc/html/DSS.html. Bioc has only two releases every year, so the changes won’t appear in the “official” package maybe until summer. Anyway, you can install the devel version and try.
Hao
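For anyone switching between the release and devel versions, a wrapper can pick whichever parallel interface the installed DSS offers. This is only a sketch: run_dmltest is a hypothetical helper, not part of DSS, and the ncores argument name is an assumption about the devel interface that this thread does not confirm (the linked vignette is the authority). The BPPARAM branch mirrors the single-worker call suggested earlier in the thread.

```r
# Hypothetical wrapper (not part of DSS): chooses the parallel interface that
# the installed DSS version appears to expose on DMLtest().
run_dmltest <- function(BSobj, group1, group2, cores = 4, smoothing = TRUE) {
  dml_args <- names(formals(DSS::DMLtest))
  if ("ncores" %in% dml_args) {
    # Assumed newer interface: DSS manages its own worker processes
    DSS::DMLtest(BSobj, group1 = group1, group2 = group2,
                 smoothing = smoothing, ncores = cores)
  } else if ("BPPARAM" %in% dml_args) {
    # Older interface from this thread: fall back to one BiocParallel worker
    single <- BiocParallel::MulticoreParam(workers = 1, progressbar = TRUE)
    DSS::DMLtest(BSobj, group1 = group1, group2 = group2,
                 smoothing = smoothing, BPPARAM = single)
  } else {
    # Neither argument found: run with the package defaults
    DSS::DMLtest(BSobj, group1 = group1, group2 = group2,
                 smoothing = smoothing)
  }
}
```

With the objects from the original post, the call would look like run_dmltest(BSobj, c("BS1", "BS2"), c("MC1", "MC2")). Checking formals() first avoids an "unused argument" error when the installed version does not match the one the script was written for.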
I commented in the support thread, which led to opening an issue in BiocParallel: https://github.com/Bioconductor/BiocParallel/issues/238
The behaviour might change, but the solution using BiocParallel seems to be using force.GC=FALSE inside bplapply. Hopefully this will get fixed before the next release, as the current parallel solution doesn't work on Windows.
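The comment above does not say where exactly the option is set. One way a user might try it, assuming the installed BiocParallel exposes force.GC as an argument of MulticoreParam, is sketched below. The check on formals() guards against versions without that argument, and the worker count of 2 and the toy squaring job are arbitrary choices for illustration.

```r
library(BiocParallel)

# Assumption: force.GC may or may not be a MulticoreParam argument in the
# installed BiocParallel, so check before using it
has_force_gc <- "force.GC" %in% names(formals(MulticoreParam))

if (has_force_gc) {
  # Skip the forced garbage collection between tasks
  mParam <- MulticoreParam(workers = 2, progressbar = TRUE, force.GC = FALSE)
} else {
  mParam <- MulticoreParam(workers = 2, progressbar = TRUE)
}

# Toy job to confirm the parameter object works; with a DSS version that still
# accepts BPPARAM, the same object would be passed as BPPARAM = mParam
res <- bplapply(1:4, function(i) i^2, BPPARAM = mParam)
print(unlist(res))
```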
Hello, I am using DSS to detect DML for WGBS data and got an error when performing statistical test for DML with smoothing. My code:
dat1.1 = read.table("chr1-ZmBS-BS1-1-CpG.bismark.cov.tsv", header=TRUE)
dat1.2 = read.table("chr1-ZmBS-BS2-1-CpG.bismark.cov.tsv", header=TRUE)
dat2.1 = read.table("chr1-ZmMC-BS1-1-CpG.bismark.cov.tsv", header=TRUE)
dat2.2 = read.table("chr1-ZmMC-BS2-1-CpG.bismark.cov.tsv", header=TRUE)
BSobj = makeBSseqData( list(dat1.1, dat1.2, dat2.1, dat2.2),c("BS1","BS2", "MC1", "MC2") )
dmlTest.sm = DMLtest(BSobj, group1=c("BS1", "BS2"), group2=c("MC1", "MC2"),smoothing=TRUE)
But I got an error like this:
Smoothing ...
Estimating dispersion for each CpG site, this will take a while ...
|======================================================================| 100%
|                                                                      |   0%
Error in result[[njob]] <- value :
  attempt to select less than one element in OneIndex
In addition: Warning message:
In parallel::mccollect(wait = FALSE, timeout = 1) :
  1 parallel job did not deliver a result
When I test only the first 20000 lines of those 4 files, DMLtest works fine with no error, so it seems something is wrong in my original files. Do you have any ideas on how to avoid this?
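A quick way to test the suspicion about the input files is to check each data frame before calling makeBSseqData. The helper below is a hypothetical sketch in base R, not a DSS function. It assumes the documented DSS input layout of four columns named chr, pos, N (coverage) and X (methylated reads), and flags missing values or counts that cannot be right. Since the single-core run later succeeded on the same files, a clean result here would point away from the data and toward the parallel backend.

```r
# Hypothetical helper (not part of DSS): basic sanity checks on one input table
check_dss_input <- function(dat, label = "input") {
  expected <- c("chr", "pos", "N", "X")
  missing_cols <- setdiff(expected, colnames(dat))
  if (length(missing_cols) > 0) {
    return(paste0(label, ": missing column(s) ",
                  paste(missing_cols, collapse = ", ")))
  }
  problems <- character(0)
  if (any(is.na(dat[, expected]))) {
    problems <- c(problems, "NA values present")
  }
  if (!is.numeric(dat$pos) || !is.numeric(dat$N) || !is.numeric(dat$X)) {
    problems <- c(problems, "pos, N and X must be numeric")
  } else {
    # Methylated reads can never exceed total coverage
    if (any(dat$X > dat$N, na.rm = TRUE)) {
      problems <- c(problems, "X exceeds N on some rows")
    }
    if (any(dat$N < 0 | dat$X < 0, na.rm = TRUE)) {
      problems <- c(problems, "negative counts")
    }
  }
  if (length(problems) == 0) {
    "ok"
  } else {
    paste0(label, ": ", paste(problems, collapse = "; "))
  }
}

# Made-up example tables: one clean, one with an NA and an impossible count
good <- data.frame(chr = "chr1", pos = c(10, 20), N = c(5, 8), X = c(2, 8))
bad  <- data.frame(chr = "chr1", pos = c(10, 20), N = c(5, NA), X = c(7, 1))
print(check_dss_input(good, "good"))
print(check_dss_input(bad, "bad"))
```

For the files in this post, the same check could be applied to dat1.1, dat1.2, dat2.1 and dat2.2 in turn.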
Thank you in advance!