lh3 / bwa

Burrow-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
GNU General Public License v3.0

Deletion at the end of the alignment #280

Open tprodanov opened 4 years ago

tprodanov commented 4 years ago

A secondary alignment has a deletion at the end of its CIGAR. Steps to reproduce (using the human genome, hg19):

samtools faidx hg19.fa 7:157673851-157674750 > 1.fa
bwa mem -a -k 20 hg19.fa 1.fa

# Output sam file:
7:157673851-157674750   0       7       157673851       60      900M    ...
7:157673851-157674750   256     7       157673861       0       10H890M70D      ...

I am not sure whether a deletion at the end is an invalid operation, but it is confusing. It may also trigger bugs in other programs if they expect the last operation to be either a match or a clip.

I used BWA version 0.7.17-r1188.
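
For downstream tools that want to guard against this, a check along the following lines might help. It is only a sketch: `edge_deletions` is a hypothetical helper name, not part of bwa or samtools, and it assumes that clipping (S/H) should be ignored when deciding which operation is "first" or "last".

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def edge_deletions(cigar):
    """Return (leading, trailing): whether the CIGAR starts or ends with a
    deletion once soft/hard clipping is ignored (assumption of this sketch)."""
    ops = [op for _, op in CIGAR_OP.findall(cigar)]
    core = [op for op in ops if op not in "SH"]
    if not core:
        return (False, False)
    return (core[0] == "D", core[-1] == "D")

# The secondary alignment from the output above ends with a deletion.
print(edge_deletions("10H890M70D"))  # (False, True)
print(edge_deletions("900M"))        # (False, False)
```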

d-cameron commented 4 years ago

The deletion can also be at the start of the CIGAR. Be aware that bwa reports the alignment start as the position where the deletion starts, whereas the SAM specification defines the alignment start as the position of the first mapped base (i.e., the position of the first M base next to the D).
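
One possible workaround, following the interpretation above, is to post-process records by dropping deletions at either edge of the CIGAR and moving POS past a leading deletion. The sketch below is an assumption about how that could be done, not something bwa provides: `trim_edge_deletions` is a hypothetical name, clipping is treated as transparent, and other edge cases (for example an insertion next to the trimmed deletion) are not handled.

```python
import re

def trim_edge_deletions(pos, cigar):
    """Drop D operations at the start/end of a CIGAR (ignoring S/H clips)
    and shift the 1-based POS past any leading deletion."""
    if cigar == "*":  # no alignment information available
        return pos, cigar
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    # Indices of operations that are not clipping.
    core = [i for i, (_, op) in enumerate(ops) if op not in "SH"]
    drop = set()
    # Leading deletions: remove them and advance POS by their length.
    for i in core:
        if ops[i][1] != "D":
            break
        pos += ops[i][0]
        drop.add(i)
    # Trailing deletions: remove them; POS is unaffected.
    for i in reversed(core):
        if ops[i][1] != "D" or i in drop:
            break
        drop.add(i)
    trimmed = "".join(f"{n}{op}" for i, (n, op) in enumerate(ops) if i not in drop)
    return pos, trimmed

# Trailing deletion from the report: POS stays the same.
print(trim_edge_deletions(157673861, "10H890M70D"))  # (157673861, '10H890M')
# Made-up leading deletion: POS moves to the first M base.
print(trim_edge_deletions(100, "5S3D20M"))           # (103, '5S20M')
```

Tags that depend on the CIGAR or POS (such as NM or MD) would also need to be updated, which this sketch does not attempt.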