fmicompbio / QuasR

This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest.
https://fmicompbio.github.io/QuasR/
4 stars 3 forks source link

Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") #18

Closed GuidoBarzaghi closed 4 years ago

GuidoBarzaghi commented 4 years ago

Hello!

Sorry to disturb, I have been encountering this error lately while using the qQCreport function and I cannot seem to figure out what is the problem there. I hope you guys can give a hand.

Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") : In range 24: at least two out of 'start', 'end', and 'width', must be supplied.

Here is my few lines of code prj <- qAlign(sampleFile=INPUT_FILE_NAME, genome="BSgenome.Mmusculus.UCSC.mm10", paired="fr", bisulfite="undir", projectName="prj", alignmentsDir=MyDir, checkOnly=F) cl = makeCluster(2) qQCReport(input = prj, pdfFilename = "whatever.pdf", clObj = cl)

where INPUT_FILE_NAME looks as the following FileName SampleName /path/to/file/Sorted_Indexed_file.bam Sorted_Indexed_file

The error occurs after the function qQCreport outputs

collecting quality control data

I am running QuasR_1.26.0 on R 3.6.2.

Thanks is advance for your help!

mbstadler commented 4 years ago

Hello @GuidoBarzaghi

One context in which we saw such an error was when the bam files were created outside of QuasR, aligning to a reference genome that is not identical to BSgenome.Mmusculus.UCSC.mm10. The new release of QuasR (1.28.0) will check for that and give a more informative error message, but it will only be released with the new Bioconductor (3.11) end of April.

Could that be the case here, too? If so, things should work as expected if you use the same genome for qAlign that you also used for alignment, e.g. by qAlign(..., genome = "/path/to/your/genome.fa").

I hope that helps, Michael

GuidoBarzaghi commented 4 years ago

Thanks for your very quick reply Micheal!

Your assessment is absolutely correct, I am indeed aligning with an external tool. I will change the genome parameter.

GuidoBarzaghi commented 4 years ago

Hello Michael!

Sorry for reopening the issue. I've tried to use the genome I used to align but I am getting the same error.

To give you a little context, the data I'm dealing with is Bisulfite sequencing data so the things that I have tried now is to feed either the reference genome I originally used to create the Bisulfite converted genome or any of the converted ones. I all cases I am getting the same error as above. Do you have any further suggestions?

Thanks, Guido

mbstadler commented 4 years ago

Hi Guido

Thank you for reporting. Would it be possible for you to create a minimal example that reproduces the error, for instance using a subset of reads and genome that you can share with me, so I can dig deeper?

I would also be curious to hear if the issue also arises if you create alignments from within QuasR which does support bisulfite converted reads (and does not require you to create converted genomes manually), and also what happens with QuasR 1.27.1 or newer (currently in BioC-devel, planned to be release on April 28 with BioC 3.11), which specifically tests for issues that may trigger this error.

Michael

mbstadler commented 4 years ago

Hi Guido

We have been in contact offline and now we know where the error came from. I am reporting it back here in case somebody else has a simmilar issue.

The error is indeed caused by inconsistent genomes used in external alignment and QuasR::qAlign.

This can be easily tested from within R. The code below must give a TRUE:

library(Rsamtools)
sl.ref <- seqlengths(FaFile("/path/to/genome.fa"))
sl.bam <- seqlengths(BamFile("/path/to/alignments.bam"))
identical(sl.ref, sl.bam)

Recent versions of QuasR::qQCReport (1.27.1 or newer) explicitely check this and will throw an error in such cases:

The chromosome names/lengths in the specified genome do not match the ones in the provided bam files

I'll close the issue again.