collaborativebioinformatics / STRdust

MIT License
13 stars 3 forks source link

potential bug in merge_overlapping_insertions #23

Closed PavelAvdeyev closed 3 years ago

PavelAvdeyev commented 3 years ago

Hi @wdecoster. I think that merge_overlapping_insertions is bugged in both branches (lines 137 - 144).

        to_merge = [insertion]
        if insertion.merged:
            continue
        for candidate_merge in insertions[index+1:]:
            if insertion.is_overlapping(candidate_merge, distance=merge_distance):
                to_merge.append(candidate_merge)
            else:
                break

The field merged in class Insertion does not have True value ever. So, algorithm is not skipping merged insertions and working with O(n^2) time. I pushed to my branch modified version of merge_overlapping_insertions. Please, review it (commit, lines 176 - 183) and double check that everything is correct.