lcolladotor / derfinder

Annotation-agnostic differential expression analysis of RNA-seq data via expressed regions-level or single base-level approaches
http://lcolladotor.github.io/derfinder

BiocParallel errors when running regionMat() #45

Open rosshandler opened 10 months ago

rosshandler commented 10 months ago

Hello,

Thank you for making this tool. I would like to use derfinder starting from a set of BAM files. I loaded the files with loadCoverage(), which ran without problems:

lC <- loadCoverage( files=files, chr="chr21", cutoff = NULL, filter = "one", chrlen = NULL, output = NULL, bai = NULL )

I then tried to run regionMatrix() without any parallel specification, but I still get an error referring to BiocParallel:

regionMat <- regionMatrix( fullCov=lC, cutoff = 5, L = rep(500,length(files)), totalMapped = 8e+07, targetSize = 8e+07, runFilter = TRUE, returnBP = TRUE )

By using totalMapped equal to targetSize, regionMatrix() assumes that you have normalized the data already in fullCoverage(), loadCoverage() or filterData().
2023-10-27 02:32:30 regionMatrix: processing coverage
2023-10-27 02:32:30 filterData: normalizing coverage
2023-10-27 02:32:31 filterData: done normalizing coverage
2023-10-27 02:32:40 filterData: originally there were 48129895 rows, now there are 3229037 rows. Meaning that 93.29 percent was filtered.
2023-10-27 02:32:40 findRegions: identifying potential segments
2023-10-27 02:32:40 findRegions: segmenting information
2023-10-27 02:32:40 findRegions: identifying candidate regions
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
2023-10-27 02:32:40 findRegions: identifying region clusters
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
2023-10-27 02:32:41 getRegionCoverage: processing coverage
2023-10-27 02:32:45 getRegionCoverage: done processing coverage
2023-10-27 02:32:45 regionMatrix: calculating coverageMatrix
2023-10-27 02:32:48 regionMatrix: adjusting coverageMatrix for 'L'
2023-10-27 02:32:48 regionMatrix: processing position
2023-10-27 02:32:48 filterData: normalizing coverage
Error: BiocParallel errors
1 remote errors, element index: 2
0 unevaluated and other errors
first remote error: zero-length inputs cannot be mixed with those of non-zero length

Are you familiar with this error? If there is other info I should provide, please let me know.

All the best, Ivan

rosshandler commented 10 months ago

Hi again, I have a brief update. If I run loadCoverage() with filtering enabled (cutoff > 0), I get a different error, but at the same place. The error stays the same no matter which of these parameters I change:

regionMat <- regionMatrix( fullCov=lC, cutoff = 5, L = 500, totalMapped = 8e+07, targetSize = 8e+07, runFilter = TRUE, returnBP = TRUE )

By using totalMapped equal to targetSize, regionMatrix() assumes that you have normalized the data already in fullCoverage(), loadCoverage() or filterData().
2023-10-27 22:13:42 regionMatrix: processing coverage
2023-10-27 22:13:42 filterData: normalizing coverage
2023-10-27 22:13:43 filterData: done normalizing coverage
2023-10-27 22:13:53 filterData: originally there were 14282639 rows, now there are 3229037 rows. Meaning that 77.39 percent was filtered.
2023-10-27 22:13:53 findRegions: identifying potential segments
2023-10-27 22:13:53 findRegions: segmenting information
2023-10-27 22:13:54 findRegions: identifying candidate regions
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
2023-10-27 22:13:54 findRegions: identifying region clusters
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
2023-10-27 22:13:55 getRegionCoverage: processing coverage
2023-10-27 22:14:02 getRegionCoverage: done processing coverage
2023-10-27 22:14:02 regionMatrix: calculating coverageMatrix
2023-10-27 22:14:05 regionMatrix: adjusting coverageMatrix for 'L'
2023-10-27 22:14:05 regionMatrix: processing position
2023-10-27 22:14:05 filterData: normalizing coverage
Error: BiocParallel errors
1 remote errors, element index: 2
0 unevaluated and other errors
first remote error: this S4 class is not subsettable