BenLangmead / bowtie2

A fast and sensitive gapped read aligner
GNU General Public License v3.0

2.5.0 does not produce identical output to 2.4.5; alignment rate is substantially lower #420

Closed: eboyden closed this issue 8 months ago

eboyden commented 1 year ago

When running an identical command on the same input, the alignment rate is substantially lower with 2.5.0 than with 2.4.5:

/bowtie2/bowtie2-2.4.5-linux-x86_64/bowtie2 -p 16 --very-sensitive --no-discordant --no-mixed --sam-append-comment -x /bowtie2/index/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set_masked_t14 --interleaved fastq.gz

10374067 reads; of these:
  10374067 (100.00%) were paired; of these:
    175112 (1.69%) aligned concordantly 0 times
    9702009 (93.52%) aligned concordantly exactly 1 time
    496946 (4.79%) aligned concordantly >1 times
98.31% overall alignment rate

/bowtie2/bowtie2-2.5.0-linux-x86_64/bowtie2 -p 16 --very-sensitive --no-discordant --no-mixed --sam-append-comment -x /bowtie2/index/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set_masked_t14 --interleaved fastq.gz

10374067 reads; of these:
  10374067 (100.00%) were paired; of these:
    571245 (5.51%) aligned concordantly 0 times
    9160934 (88.31%) aligned concordantly exactly 1 time
    641888 (6.19%) aligned concordantly >1 times
94.49% overall alignment rate
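
For anyone who wants to check the two summaries above mechanically, here is a minimal Python sketch. It assumes, as the numbers in this thread suggest, that with `--no-mixed` and `--no-discordant` the overall alignment rate is simply the concordantly aligned pairs divided by the total pairs. The names `parse_summary` and `overall_rate` are hypothetical helpers, not part of bowtie2.

```python
import re

# Summaries copied from the two runs above.
SUMMARY_245 = """\
10374067 reads; of these:
  10374067 (100.00%) were paired; of these:
    175112 (1.69%) aligned concordantly 0 times
    9702009 (93.52%) aligned concordantly exactly 1 time
    496946 (4.79%) aligned concordantly >1 times
98.31% overall alignment rate
"""

SUMMARY_250 = """\
10374067 reads; of these:
  10374067 (100.00%) were paired; of these:
    571245 (5.51%) aligned concordantly 0 times
    9160934 (88.31%) aligned concordantly exactly 1 time
    641888 (6.19%) aligned concordantly >1 times
94.49% overall alignment rate
"""

# One regex per summary line of interest (hypothetical helper, not a bowtie2 API).
PATTERNS = {
    "total": r"^\s*(\d+) reads; of these:",
    "zero": r"^\s*(\d+) \([\d.]+%\) aligned concordantly 0 times",
    "unique": r"^\s*(\d+) \([\d.]+%\) aligned concordantly exactly 1 time",
    "multi": r"^\s*(\d+) \([\d.]+%\) aligned concordantly >1 times",
}


def parse_summary(text):
    """Extract pair counts from a paired-end summary like the ones above."""
    return {
        key: int(re.search(pattern, text, re.MULTILINE).group(1))
        for key, pattern in PATTERNS.items()
    }


def overall_rate(counts):
    """Assumption: only concordant pairs count as aligned in these runs."""
    return 100.0 * (counts["unique"] + counts["multi"]) / counts["total"]


old = parse_summary(SUMMARY_245)
new = parse_summary(SUMMARY_250)
print(f"2.4.5: {overall_rate(old):.2f}%")  # 98.31%
print(f"2.5.0: {overall_rate(new):.2f}%")  # 94.49%
print("extra unaligned pairs in 2.5.0:", new["zero"] - old["zero"])  # 396133
```

Under that assumption the recomputed rates match the reported ones, and the regression amounts to 396133 additional pairs that fail to align concordantly.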

ch4rr0 commented 1 year ago

Hello,

I pushed a change to the bug_fixes branch that should resolve this issue. Please confirm whether the change does in fact resolve it, so that I can get a release out ASAP.

eboyden commented 1 year ago

/bowtie2/bowtie2-bug_fixes/bowtie2 -p 16 --very-sensitive --no-discordant --no-mixed --sam-append-comment -x /bowtie2/index/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set_masked_t14 --interleaved fastq.gz

10374067 reads; of these:
  10374067 (100.00%) were paired; of these:
    175112 (1.69%) aligned concordantly 0 times
    9700465 (93.51%) aligned concordantly exactly 1 time
    498490 (4.81%) aligned concordantly >1 times
98.31% overall alignment rate

Looks fixed to me. The total aligned and unaligned read counts are the same, and the mean target coverage is identical to that obtained with 2.4.5, but I did notice that the reported numbers of "unique" and "non-unique" alignments are slightly different. I assume this is intended, as described in the 2.5.0 release notes: Changed the way that unique alignments are counted in summary message to better match up with filters on SAM output?
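
The observation above can be checked with a little arithmetic. This is only a sketch using the pair counts copied from the 2.4.5 and bug_fixes summaries in this thread; the variable names are made up for illustration.

```python
# Pair counts copied from the 2.4.5 and bug_fixes summaries in this thread.
v245 = {"zero": 175112, "unique": 9702009, "multi": 496946}
fixed = {"zero": 175112, "unique": 9700465, "multi": 498490}

# Total concordantly aligned pairs for each build.
aligned_245 = v245["unique"] + v245["multi"]
aligned_fixed = fixed["unique"] + fixed["multi"]

# Pairs that moved from "exactly 1 time" to ">1 times" between the builds.
moved = v245["unique"] - fixed["unique"]

print("aligned pairs, 2.4.5:    ", aligned_245)    # 10198955
print("aligned pairs, bug_fixes:", aligned_fixed)  # 10198955
print("pairs reclassified as >1:", moved)          # 1544
```

So the aligned total is unchanged, and 1544 pairs are counted as ">1 times" rather than "exactly 1 time", which is the kind of shift a change in how unique alignments are counted would produce.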

Thanks @ch4rr0, as always I appreciate how responsive you are and how well supported this software is!

ch4rr0 commented 1 year ago

Thank you for the kind words.

> I assume this is intended, as described in the 2.5.0 release notes: Changed the way that unique alignments are counted in summary message to better match up with filters on SAM output?

That is correct.