Closed Anjan-Purkayastha closed 1 week ago
Here is another example: edlib.align('ATGC', 'ATGTATGC', mode = 'HW', task = 'locations', k = 1)
Expected result: two locations are reported: (0, 3) with 1 mismatch and (4, 7) with 0 mismatches. Instead, here is the output: {'editDistance': 0, 'alphabetLength': 4, 'locations': [(4, 7)], 'cigar': None}. Only the perfect match is reported; the location with 1 mismatch is not. Another error: if a match starts at the first position and ends at, say, position 8, the location is reported as (None, 8). This should be corrected to (0, 8).
Hey @Anjan-Purkayastha -> this is not a bug, it is how edlib works, I am sorry if this was not obvious from the docs.
Check the comments here https://github.com/Martinsos/edlib/blob/master/edlib/include/edlib.h -> so what k means is that you don't care about the solution if edit distance is larger than k. But that doesn't mean it will return all solutions below k -> it will always return only one solution. You can see that in the result object -> it returns a single edit distance and single end location. It can return multiple possible start locations though, but that is it.
If you think this is unclear in the docs, I would appreciate a PR that would clear it up!
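For anyone who needs every site within k mismatches rather than only the best one, a plain sliding-window scan is one possible workaround. This is a minimal sketch and does not use edlib: find_hits is a hypothetical helper, it counts substitutions only (no insertions or deletions, so it is not a true edit distance), and it takes time proportional to len(query) * len(target).

```python
def find_hits(query, target, k):
    """Return (start, end, mismatches) for every window of target
    that differs from query by at most k substitutions.

    Positions are 0-based and end is inclusive, matching the
    (start, end) convention used in the examples in this thread.
    """
    hits = []
    m = len(query)
    # Slide a window of len(query) across the target.
    for start in range(len(target) - m + 1):
        window = target[start:start + m]
        # Count positions where the query and the window disagree.
        mismatches = sum(a != b for a, b in zip(query, window))
        if mismatches <= k:
            hits.append((start, start + m - 1, mismatches))
    return hits


print(find_hits("ATGC", "ATGTATGC", k=1))
# [(0, 3, 1), (4, 7, 0)]
```

On the small example from this thread it reports both (0, 3) with 1 mismatch and (4, 7) with 0 mismatches. If indels matter, this scan is not sufficient and a different approach would be needed.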
Describe the bug
I am running the edlib.align function to identify a test primer sequence in a longer template sequence. The test_primer sequence occurs at two locations in test_template: positions 9-29 with 0 mismatches, and positions 140-160 with 1 mismatch at position 145. When I run edlib.align allowing at most 3 mismatches, edlib identifies only the location with 0 mismatches. The location with 1 mismatch is not reported.
To Reproduce
Code to run: left_alignment = edlib.align(test_primer, test_template, mode='HW', task='locations', k=3) Please use the attached files.
Expected behavior
Since the maximum number of mismatches is set to 3, I expect to see both alignments reported.