vplagnol / ExomeDepth

ExomeDepth R package for the detection of copy number variants in exomes and gene panels using high throughput DNA sequencing data.
Counts from CRAM files #17

Open ghost opened 5 years ago

ghost commented 5 years ago

Hi,

Just a suggestion for a future update of the package. It would be nice to have a function that generates read counts directly from CRAM files. It would be a great help for us, as we currently have to convert our CRAM files back to BAM files to run ExomeDepth.

Thank you for considering it.

All the best, Toon

fgvieira commented 5 years ago

Yes, I completely agree. Is there a plan to support CRAM anytime soon?

vplagnol commented 4 years ago

Long-standing issue, but I am looking into this now as I refresh the package. Do you know if Bioconductor has written tools to parse CRAM files? I see no mention in Rsamtools, but that surprises me. If someone has done the work, it should be easy to use the parser in the ExomeDepth package.

ghost commented 4 years ago

I don't know about the existence of such a CRAM parser in Bioconductor at the moment. Hopefully this will be available soon.

katherinef commented 4 years ago

Hi, does ExomeDepth (or DECoN) support CRAM files yet? If not, is there a way to start from counts generated using samtools (i.e., skip the first step in the ExomeDepth pipeline)? Thanks!

tomiles commented 3 years ago

Sorry to bump this issue again, but is there any progress on CRAM support? It would be nice not to have to do a CRAM-to-BAM conversion just to run ExomeDepth. Thanks

tomiles commented 3 years ago

> Do you know if Bioconductor has written tools to parse CRAM files? I see no mention in Rsamtools, but that surprises me. If someone has done the work it should be easy to use the parser in the ExomeDepth package.

I think you could use Rhtslib -> https://www.bioconductor.org/packages/release/bioc/html/Rhtslib.html

pd3 commented 1 year ago

Uh, just ran into the same issue. Samtools supports CRAM files natively, so it should be trivial to make Rsamtools accept CRAMs as well. Although I understand we are barking up the wrong tree here...

vplagnol commented 1 year ago

Thanks all. FWIW, I'd be happy to add CRAM support if the underlying Rsamtools library were upgraded; it's clearly somewhere between a very nice-to-have and a must-have. Anything else would be painful (and ExomeDepth is probably not the right package to do it in). But if that happens in Rsamtools, do ping me and I'll add whatever extra feature is required.

pd3 commented 1 year ago

You could call samtools bedcov directly from your package. That's what I'll have to end up doing in my pipeline, I am afraid.

vplagnol commented 1 year ago

I am thinking as I am writing here, so apologies for any silliness. Packaging samtools seems painful because it's an external binary, which creates all sorts of issues if it is not set up as an R library. What I think I could do is enable receiving a tabular file with depth and positions (didn't I code that already?). If I had that option, you could run samtools bedcov on your samples and feed the resulting table into ExomeDepth, no?

pd3 commented 1 year ago

Yes, that's what I am doing

samtools bedcov -c file1.cram file2.cram file3.cram | cut -f1-3,7- | gzip -c > bedcov.txt.gz

I don't know yet what the proper way to feed the result to ExomeDepth is; I hope it will be as simple as

counts <- read.table('bedcov.txt.gz')
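
For anyone following along, here is a minimal sketch of what the `cut -f1-3,7-` step keeps, run on a mock table rather than real samtools output. It assumes a 3-column BED file was passed to `samtools bedcov` as its first positional argument and that three alignment files were given, so that columns 4-6 hold the summed per-base depth and columns 7-9 hold the read counts added by `-c`. The file names, sample names, header names, and numbers are all made up for illustration.

```shell
#!/bin/sh
# Mock of `samtools bedcov -c` output for a 3-column BED and three input files.
# Assumed layout: columns 1-3 = region, 4-6 = summed per-base depth per file,
# 7-9 = read counts per file (the extra columns added by -c).
# All values are invented for illustration.
printf 'chr1\t100\t200\t5000\t4800\t5100\t50\t48\t51\n'  > bedcov_raw.tsv
printf 'chr1\t300\t450\t9000\t8700\t9300\t60\t58\t62\n' >> bedcov_raw.tsv

# Hypothetical sample names, in the same order as the files given to bedcov.
samples="file1 file2 file3"

{
  # Header line, so the columns are named once the table is loaded.
  printf 'chromosome\tstart\tend'
  for s in $samples; do printf '\t%s' "$s"; done
  printf '\n'
  # Same selection as in the pipeline above: region columns plus read counts.
  cut -f1-3,7- bedcov_raw.tsv
} > bedcov_counts.tsv

cat bedcov_counts.tsv
```

With a header line in place, `read.table('bedcov_counts.tsv', header = TRUE)` should give a data frame with one named count column per sample. With real CRAM input, samtools will generally also need access to the reference sequence to decode the reads.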

vplagnol commented 1 year ago

Yes, from the vignette I wrote ages ago, I see getBAMCounts creates an object of the GRanges class, which can easily be converted into a matrix or a data frame (the input format for ExomeDepth), so that should be just fine. Let me know if that's not the case.