mengyao / Complete-Striped-Smith-Waterman-Library


Wrong protein alignment but correct score #61

Closed martin-steinegger closed 2 years ago

martin-steinegger commented 5 years ago

I tried to align two sequences, query WSAPSVLLNAS and target WHSSPSILLNS, using the following command:

 ssw_test -c -p target.fas query.fas 

The resulting score is 56 according to ssw.

optimal_alignment_score: 56 strand: +   target_begin: 1 target_end: 11  query_begin: 1  query_end: 11

Target:        1    WHSSPSILL-N    10
                    |******|* *
Query:         1    WSAPSVLLNAS    11

However, if I rescore the alignment by hand, I end up with a score of 12:

WW   HS   SA   SP   PS   SV   IL   LL   LN   -    NS
15 + -1 +  1 + -1 + -1 + -2 +  2 +  5 + -4 + -3 +  1 = 12

The correct alignment should look like this:

optimal_alignment_score: 56 strand: +   target_begin: 1 target_end: 11  query_begin: 1  query_end: 11

Target:          1 WHSSPSILLN-S      11
                   | |*||*||| |
Query:           1 W-SAPSVLLNAS      11

The error occurs if there is a single match followed by a deletion at the beginning of the alignment. The banded_sw function produces the error.

martin-steinegger commented 5 years ago

The fix is to change line https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library/blob/cabdab1ba32ca66fa6ed730999cce857e02bebcc/src/ssw.c#L677 to while (LIKELY(i > 0) || LIKELY(j > 0))