mengyao / Complete-Striped-Smith-Waterman-Library

294 stars 112 forks source link

SSW fails to align if read and reference are switched #63

Open vishnubob opened 5 years ago

vishnubob commented 5 years ago

I have encountered an issue whereby SSW gives different results depending on the association of sequences as a read or as a reference. Ie. when one of the sequences is a read and the other is a reference, I get a proper alignment. When their association is reversed, I get no alignment. I've written a small C program that can replicate this behavior, and I've tested this with multiple versions of SSW only to get the same result. Example program, compile command, and subsequent output are below. You can download bug.c from here.

int main (int argc, char * const argv[]) { int32_t gap_open = 3, gap_extension = 1;

const int8_t matrix[] = { 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, 2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, 2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, 2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, 2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, 2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, -2, 2, -2, 2, -2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, -2, -2, 2, 2, 2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, -2, -2, 2, 2, -2, 2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, -2, 2, -2, 2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, -2, 2, 2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2, -2, 2, 2, 2, 2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, 2 };

const int8_t seq_a[] = { 0, 1, 2, 1, 0, 3, 2, 0, 2, 1, 3 };

const int8_t seq_b[] = { 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };

s_profile* profile;
s_align* result;

printf("read = seq_a; ref = seq_b\n\n");
profile = ssw_init(seq_a, 11, matrix, 16, 2);
result = ssw_align(profile, seq_b, 11, gap_open, gap_extension, 1, 0, 0, 15);
printf("(score1, score2) = (%d, %d)\n", result->score1, result->score2);
printf("(ref_begin1, ref_end1) = (%d, %d)\n", result->ref_begin1, result->ref_end1);
printf("(read_begin1, read_end1) = (%d, %d)\n", result->read_begin1, result->read_end1);

align_destroy(result);
init_destroy(profile);

printf("\nread = seq_b; ref = seq_a\n\n");

profile = ssw_init(seq_b, 11, matrix, 16, 2);
result = ssw_align(profile, seq_a, 11, gap_open, gap_extension, 1, 0, 0, 15);
printf("(score1, score2) = (%d, %d)\n", result->score1, result->score2);
printf("(ref_begin1, ref_end1) = (%d, %d)\n", result->ref_begin1, result->ref_end1);
printf("(read_begin1, read_end1) = (%d, %d)\n", result->read_begin1, result->read_end1);

align_destroy(result);
init_destroy(profile);

return(0);

}


* compile command:
```gcc -o bug ssw.o bug.c -Wall -pipe -O2 -lm -lz```

* output:
```read = seq_a; ref = seq_b

(score1, score2) = (22, 0)
(ref_begin1, ref_end1) = (0, 10)
(read_begin1, read_end1) = (0, 10)

read = seq_b; ref = seq_a

(score1, score2) = (0, 0)
(ref_begin1, ref_end1) = (-1, -1)
(read_begin1, read_end1) = (0, 0)

Thanks!

mengyao commented 2 years ago

This is very interesting. Please let me know, if you have any idea of fixing. Thank you.