Closed mattloose closed 1 year ago
Hi Matt,
I don't expect that a dorado bam would cause a problem, but I haven't tried. And indeed, cramino works (by default) with the primary and supplementary alignments, not the secondary ones or unaligned reads. This can be changed with --ubam, added relatively recently. When you say incorrect, do you mean by a lot, and is this compared with the metrics based on fastq? I can imagine that differences can happen due to softclipping adapters, for example.
Wouter
Thanks for the prompt response.
It's out by a lot here...
This is the output of trusty NanoStat on the fastq extracted from the BAM file using samtools:
General summary:
Mean read length: 36,689.1
Mean read quality: 12.3
Median read length: 7,677.0
Median read quality: 18.5
Number of reads: 708,681.0
Read length N50: 129,276.0
STDEV read length: 64,679.2
Total bases: 26,000,838,396.0
This is the relevant output from cramino (and also NanoPlot) from the aligned bam file:
Number of reads 878325
Yield [Gb] 57.81
Mean coverage 18.72
Yield [Gb] (>25kb) 54.63
N50 182736
N75 112611
Median length 16377.00
Mean length 65823
Median identity 98.63
Mean identity 94.89
The NanoStat from the fastq is in line with what I expect to see.
Cheers
Matt
An update: having done some digging, I believe that dorado is outputting soft-clipped supplementary alignments, which leads tools to miscalculate lengths if they assume only hard clipping is in use. Using hard clipping will cause downstream issues with methylation tags.
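To illustrate the effect, here is a minimal sketch in plain Python (no pysam) of how soft-clipped supplementary records can inflate yield. The read, its flags, and its CIGAR strings are made up for illustration, and the helper names are hypothetical; this is not how cramino or NanoPlot compute their metrics, only a toy model of the double counting.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def stored_seq_length(cigar):
    """Length of the SEQ stored in the record.

    Per the SAM spec, M, I, S, = and X consume query bases that are
    present in SEQ; hard clips (H) are not stored.
    """
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MIS=X")

def aligned_query_length(cigar):
    """Read bases that are actually aligned (soft and hard clips excluded)."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MI=X")

# Hypothetical 10 kb chimeric read, split into a primary and a
# supplementary alignment, both soft clipped: (name, flag, cigar).
records = [
    ("read1", 0, "6000M4000S"),      # primary
    ("read1", 2048, "6000S4000M"),   # supplementary (flag 0x800)
]

# Summing the stored sequence of every record counts the read twice.
naive_yield = sum(stored_seq_length(c) for _, _, c in records)

# Skipping secondary (0x100) and supplementary (0x800) records counts it once.
primary_only_yield = sum(
    stored_seq_length(c) for _, flag, c in records if not flag & 0x900
)

# Ignoring soft clips also counts each base once.
aligned_yield = sum(aligned_query_length(c) for _, _, c in records)

print(naive_yield, primary_only_yield, aligned_yield)
```

In this toy case the naive sum is twice the true read length, while either restricting to primary records or ignoring soft clips gives the 10 kb that a fastq-based tool would report.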
This will be fixed in the next release, tentatively this weekend.
Hey Wouter - any update on this?
Thanks
Matt
Pushing a release today, thanks for your patience. Softclips will be ignored now.
Hi Wouter,
Have you run this across an aligned BAM file as output by dorado? It seems to me that the calculated data reflects the alignments and not the underlying read data. For example, the Yield and N50 calculations are incorrect, as is Mean coverage, etc.
This might be an issue with the dorado BAM file but I think it may also be that you are really looking at alignments?
Cheers
Matt