BenLangmead / bowtie

An ultrafast memory-efficient short read aligner

bowtie reports incorrect result when using bowtie2 index #103

Closed qifei9 closed 2 years ago

qifei9 commented 4 years ago

A test reference and a test read are in test.zip. Versions: bowtie 1.2.3, bowtie2 2.3.5.1, Linux x86_64.

The command is

bowtie -p 7 -n 0 -l 10 -k 1 --best -S --no-unal --norc test ./t.fq t.sam

When I build the index with bowtie-build, the result is

ST-E00205:927:H7M7KCCX2:1:1223:13535:59182      0       b       58      255     100M    *       0       0       TTTGAGTTTTCAACAATGATGGACTAAGTGTCAGGACACACTCCTTGGATCTCAAACTCGTCATCTCCAAACTAACGATGTGAGACTCCATTTCTCCATT   <FAJ7FFAJAAJ<F-F<AAJF<JF7<A<AJAFFFJ<<JJFFJJAAFJJFJFAJFFFAA-FA<<JJAJAA7FF7F<FJJFJFFJFFF<7A7-<JFF<FJJA    XA:i:0  MD:Z:100        NM:i:0  XM:i:2

It's a 100% identical alignment.

However, if I build the index with bowtie2-build (version 2.3.5.1), the result is

ST-E00205:927:H7M7KCCX2:1:1223:13535:59182      0       n       58      255     100M    *       0       0       TTTGAGTTTTCAACAATGATGGACTAAGTGTCAGGACACACTCCTTGGATCTCAAACTCGTCATCTCCAAACTAACGATGTGAGACTCCATTTCTCCATT   <FAJ7FFAJAAJ<F-F<AAJF<JF7<A<AJAFFFJ<<JJFFJJAAFJJFJFAJFFFAA-FA<<JJAJAA7FF7F<FJJFJFFJFFF<7A7-<JFF<FJJA    XA:i:0  MD:Z:100        NM:i:0  XM:i:2

The reported position on n corresponds to this alignment (reference on top, read below):

CATGCCTGCTGAACAATGATGGACTAAGTGTCAGGACACACTCCTTGGATCTCAAACTCGTCATCTCCAAACTAACGATGTGAGACTCCATTTCTCCATT
  ||  |  | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TTTGAGTTTTCAACAATGATGGACTAAGTGTCAGGACACACTCCTTGGATCTCAAACTCGTCATCTCCAAACTAACGATGTGAGACTCCATTTCTCCATT

which violates the -n 0 -l 10 settings (no mismatches allowed in the first 10 bases of the read).
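The mismatch count can be checked with a short script. This is only a sketch: the two sequences are copied from the alignment above (split into their differing first 11 bases and the tail they share), and the helper name `mismatch_positions` is made up for illustration. It does not call bowtie itself.

```python
# Sequences copied from the alignment shown above. The first 11 bases differ;
# the remaining bases are the same in both rows, so they are stored once.
SHARED_TAIL = (
    "AACAATGATGGACTAAGTGTCAGGACACACTCCTTGGATCTCAAACTCGTCATCTCC"
    "AAACTAACGATGTGAGACTCCATTTCTCCATT"
)
ref_window = "CATGCCTGCTG" + SHARED_TAIL  # top row (reference n at the reported position)
read = "TTTGAGTTTTC" + SHARED_TAIL        # bottom row (the read)

SEED_LEN = 10            # bowtie -l 10
MAX_SEED_MISMATCHES = 0  # bowtie -n 0


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


positions = mismatch_positions(ref_window, read)
seed_mismatches = [p for p in positions if p <= SEED_LEN]

print("mismatch positions:", positions)
print("total mismatches:", len(positions))
print("mismatches inside the seed:", len(seed_mismatches))
print("satisfies -n 0 -l 10:", len(seed_mismatches) <= MAX_SEED_MISMATCHES)
```

If the read had really been placed at this position on n, the seed alone would already exceed the allowed number of mismatches.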

ch4rr0 commented 4 years ago

Bowtie seems to be finding the correct alignment but is somehow reporting the wrong reference name (it should be b, not n). The NM:i:0 field in the SAM record also supports this, since bowtie would report a value of 7 if it had aligned the read against reference n.

I have not found the exact cause of this issue, but it may be tied to the --best parameter, since we obtain the correct output when this flag is omitted.
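To make the comparison concrete, here is a rough sketch that parses the two SAM records from the report and lists the columns in which they differ. The parser is a hypothetical helper, not part of bowtie, and the SEQ and QUAL columns are replaced with the placeholders `SEQ` and `QUAL` for brevity, since they are identical in both records.

```python
# The 11 mandatory SAM columns, in order.
SAM_COLUMNS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]


def parse_sam_record(line):
    """Split one SAM line into its mandatory columns plus a dict of optional tags."""
    fields = line.split()
    record = dict(zip(SAM_COLUMNS, fields[:11]))
    tags = {}
    for tag in fields[11:]:
        name, value_type, value = tag.split(":", 2)
        tags[name] = int(value) if value_type == "i" else value
    record["tags"] = tags
    return record


QNAME = "ST-E00205:927:H7M7KCCX2:1:1223:13535:59182"
# SEQ and QUAL are placeholders here; every other value is taken from the report.
TEMPLATE = (
    QNAME + "\t0\t{rname}\t58\t255\t100M\t*\t0\t0\tSEQ\tQUAL"
    "\tXA:i:0\tMD:Z:100\tNM:i:0\tXM:i:2"
)

with_bowtie_build = parse_sam_record(TEMPLATE.format(rname="b"))
with_bowtie2_build = parse_sam_record(TEMPLATE.format(rname="n"))

differing = [c for c in SAM_COLUMNS if with_bowtie_build[c] != with_bowtie2_build[c]]
print("columns that differ:", differing)
print("NM with bowtie2-build index:", with_bowtie2_build["tags"]["NM"])
print("MD with bowtie2-build index:", with_bowtie2_build["tags"]["MD"])
```

Only the reference name changes between the two runs, while NM and MD still describe a perfect 100-base match, which is what points to a naming problem rather than a wrong alignment.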

ch4rr0 commented 4 years ago

I took another shot at this issue and may have found a potential solution. I am not certain whether it scales well to many such reads, but I would appreciate it if you could test it and let me know.

ch4rr0 commented 2 years ago

This issue was resolved in v1.3.0.