Given the following reference:
and the following unaligned bam file:
running bwa aln and samse:
produces the following alignment:
The read is reported as aligned to seq2, even though the query is identical to seq3 apart from an extra 'T' at its start. Changing the reference to this order:
results in the alignment being reported to seq1 instead. Removing the leading 'T' from the query sequence gives the expected alignment to seq3.
It seems the extra base at the start of the query causes the alignment to be reported against whichever sequence immediately precedes the correct one in the reference FASTA.
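
One possible explanation, offered as a guess rather than a confirmed diagnosis: if the aligner stores the reference sequences as one concatenated string and picks the reported sequence name from the start coordinate of the hit, then a hit that begins one base before seq3 would be attributed to the preceding sequence. The sketch below only illustrates that coordinate lookup. The function name `contig_for_position` and the sequence lengths are hypothetical, and this is not taken from the bwa source.

```python
from bisect import bisect_right


def contig_for_position(names, lengths, pos):
    """Map a 0-based position in a concatenated reference to (name, local offset).

    Illustrative only: assumes the sequences are laid end to end in FASTA order
    and that the reported name depends solely on where the hit starts.
    """
    starts = []
    total = 0
    for length in lengths:
        starts.append(total)  # global start coordinate of each sequence
        total += length
    if not 0 <= pos < total:
        raise ValueError("position outside the concatenated reference")
    index = bisect_right(starts, pos) - 1  # last sequence starting at or before pos
    return names[index], pos - starts[index]


# Hypothetical lengths; the real reference from the report is not reproduced here.
names = ["seq1", "seq2", "seq3"]
lengths = [50, 50, 50]
seq3_start = 100

# A hit starting exactly at seq3 is attributed to seq3.
exact_hit = contig_for_position(names, lengths, seq3_start)

# A hit starting one base earlier (the extra leading 'T') lands in seq2.
shifted_hit = contig_for_position(names, lengths, seq3_start - 1)
```

Under this assumption, reordering the reference changes which sequence sits immediately before seq3, which would match the observed switch from seq2 to seq1.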