BenLangmead / bowtie2

A fast and sensitive gapped read aligner
GNU General Public License v3.0

Different alignments in concatenated input vs comma-separated input #443

Open jayrbolton opened 10 months ago

jayrbolton commented 10 months ago

Say I have three FASTA reference files. I first concatenate them with something like cat 1.fna 2.fna 3.fna > reference.fna, and then run bowtie2-build reference.fna output.

Alternatively, I could pass them directly as a comma-separated list with bowtie2-build 1.fna,2.fna,3.fna output.

I'm seeing different alignment results between these two indexes: for one example I'm looking at, the comma-separated version gives slightly more alignments. Is this expected behavior?

jayrbolton commented 10 months ago

I see this with both 2.4.5 and 2.5.1. I'm going to try more parameter combinations (gzipped vs. not, ordering of references) to make sure I'm not attributing the difference to the wrong cause.
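
One thing that may be worth ruling out alongside those combinations is whether the concatenated file really contains the same records as the three separate files. cat does not insert a newline between files, so if one input lacks a trailing newline, the next header line gets glued onto the previous sequence line. The following is a minimal sketch, not part of bowtie2, that compares record names and lengths; it assumes plain uncompressed FASTA, and the helper names fasta_records and same_records are made up for illustration.

```python
def fasta_records(path):
    """Return [(name, length)] for each record in an uncompressed FASTA file."""
    records = []
    name, length = None, 0
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    records.append((name, length))
                fields = line[1:].split()
                name, length = (fields[0] if fields else ""), 0
            elif name is not None:
                length += len(line.strip())
    if name is not None:
        records.append((name, length))
    return records


def same_records(concatenated, parts):
    """True if the concatenated file has the same records, in order, as the parts."""
    expected = [rec for part in parts for rec in fasta_records(part)]
    return fasta_records(concatenated) == expected
```

With the file names from the first comment, the call would be same_records("reference.fna", ["1.fna", "2.fna", "3.fna"]). A False result would point at the input files rather than at bowtie2-build; a True result says nothing about the cause and only removes this one possibility.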

ch4rr0 commented 10 months ago

Hello,

Are the checksums (MD5 or SHA) of the indexes the same?
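
For reference, here is one way the comparison could be done. This is a rough sketch under some assumptions: the index files follow the usual prefix.1.bt2, prefix.rev.1.bt2, ... naming (or .bt2l for large indexes), and the function names index_checksums and differing_files are hypothetical helpers, not anything shipped with bowtie2.

```python
import glob
import hashlib


def index_checksums(prefix):
    """Map each index file suffix (e.g. '.1.bt2') to its MD5 hex digest."""
    sums = {}
    # Assumed naming: <prefix>.1.bt2, <prefix>.rev.1.bt2, and .bt2l variants.
    for path in sorted(glob.glob(glob.escape(prefix) + ".*bt2*")):
        md5 = hashlib.md5()
        with open(path, "rb") as handle:
            # Read in 1 MiB chunks so large index files are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                md5.update(chunk)
        sums[path[len(prefix):]] = md5.hexdigest()
    return sums


def differing_files(prefix_a, prefix_b):
    """Return the suffixes whose checksums differ or that exist in only one index."""
    a, b = index_checksums(prefix_a), index_checksums(prefix_b)
    return sorted(s for s in set(a) | set(b) if a.get(s) != b.get(s))
```

If the two indexes were built with the prefixes concat_index and comma_index (placeholder names), differing_files("concat_index", "comma_index") returns an empty list when every file matches, and otherwise lists the suffixes that differ.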