Open threadless-screw opened 5 years ago
I have unfortunately not gotten to the bottom of the issue. As far as I can tell, the re-use of the sw
and result
variables [see sw_cmdline.c] when iterating over the multiple sequences in the files is at the heart of it; it seems a previous iteration leaves its traces into the next. Practically, the problem can be overcome by wrapping each align()
function call in its own initialization/destruction routines, like so:
sw = smith_waterman_new();
result = alignment_create(256);
align( ....);
smith_waterman_free(sw);
alignment_free(result);
Since this somewhat wasteful solution is no more than a patch that leaves the underlying issue unresolved, I have not made a pull request.
I ran into some issues with the pairwise alignment of sequences read from different files. Although the exact issue varies from case to case, it seems that the first alignment produces the correct result, while further alignments generally do not. Here's an example:
If a first file "f1" contains the following (identical) sequences:
and a second file "f2" contains the following (identical) sequences:
then running the command "smith_waterman --files f1 f2" returns the output:
The results for Alignment 0 are correct, but the fact that results are absent for Alignments 1 and 2 seems off. Interestingly, the dynamic programming matrices (rendered as output with the --printmatrices option) for the three alignments are the same, as they should be. The problem is therefore likely to be found at a superficial level.