wdecoster / cramino

A *fast* tool for BAM/CRAM quality evaluation, intended for long reads
MIT License

Mean identity calculated as more than 100% #7

Closed by danrdanny 2 years ago

danrdanny commented 2 years ago

I'm seeing several cases where mean identity is calculated as over 100%. For example:

Yield [Gb]        39.86
Mean coverage     12.86
N50               21602
Median length     1600.00
Mean length       5308
Median identity   97.98
Mean identity     103.01

I wouldn't expect this to be possible, correct?

wdecoster commented 2 years ago

I agree that shouldn't be possible. Which aligner was used, and which version? Cramino uses the de tag from minimap2, if available.
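For readers unfamiliar with the de tag: minimap2 documents it as the gap-compressed per-base sequence divergence of an alignment. A minimal sketch of how such a divergence could be turned into a percent identity follows. The function name `identity_from_de` and the exact formula are assumptions for illustration, not code taken from cramino.

```rust
/// Convert a per-base divergence (as stored in a `de` tag) into a
/// percent identity. Hypothetical helper, not cramino's actual code.
fn identity_from_de(de: f64) -> f64 {
    // A divergence of 0.02 means 2% of aligned positions differ,
    // which corresponds to 98% identity.
    100.0 * (1.0 - de)
}
```

With this formula an identity above 100% can only come from a negative divergence, so `identity_from_de(-0.03)` gives a value above 100 while `identity_from_de(0.02)` gives roughly 98. Whether that is what happened with these alignments is not established in this thread.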

danrdanny commented 2 years ago

Minimap2, 2.17-r941.

wdecoster commented 2 years ago

Most likely a known bug then: https://github.com/lh3/minimap2/releases/tag/v2.18

danrdanny commented 2 years ago

Ah, good catch. I'm super excited to re-align my data with a new version. :)

wdecoster commented 2 years ago

Since I was also using v2.17 alignments, I dismissed similar results I saw in my tests. There was something more going on, though, which is now solved by using f64s rather than f32s for the read identity. Version 0.9.4 has this fix, and I'll make a new release soon.
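One plausible way single precision can push a mean above 100% is rounding in a running sum. This is an illustration under assumptions, not a description of cramino's internals. An f32 has a 24-bit significand, so once a sum grows past 2^24 each addition is rounded to the spacing between representable values. Between 2^29 and 2^30 that spacing is 64, so adding 98 to the sum actually adds 128. The function names below are hypothetical.

```rust
/// Mean of percent identities, accumulated in single precision.
/// Hypothetical helper; returns NaN for an empty slice.
fn mean_f32(values: &[f32]) -> f32 {
    let mut sum = 0.0_f32;
    for &v in values {
        // Each addition is rounded to the nearest representable f32,
        // which becomes coarse once `sum` is large.
        sum += v;
    }
    sum / values.len() as f32
}

/// Same computation, but accumulated in double precision.
/// Hypothetical helper; returns NaN for an empty slice.
fn mean_f64(values: &[f32]) -> f64 {
    let mut sum = 0.0_f64;
    for &v in values {
        // f64 has a 53-bit significand, so sums of this size stay exact.
        sum += f64::from(v);
    }
    sum / values.len() as f64
}
```

For example, with 8 million reads that all have an identity of exactly 98.0 (`vec![98.0_f32; 8_000_000]`), `mean_f64` returns 98 while `mean_f32` returns a value above 100, even though no single read exceeds 98.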