ekg / gimbricate

recompute GFA link overlaps
MIT License

compute overlaps using edlib #6

Closed ekg closed 4 years ago

ekg commented 4 years ago

This resolves #3 and #4 (by an unrelated patch) and will obviate the need for the "perfectOvelaps" mode (#5).

This should greatly improve the scalability of the alignment, and allow us to directly process extremely long overlaps without the memory issues we ran into previously.

Fully exploiting this may require setting a bandwidth for the alignment, but for long exact overlaps (~50 kb) I have observed very good performance. My basic understanding is that edlib implements an adaptive banded approach, which should handle this.
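
For readers unfamiliar with the idea, here is a minimal sketch of adaptive banding for edit distance: compute the distance inside a diagonal band of half-width k, and double k whenever the band turns out to be too narrow. This is an illustration only and not edlib's code; edlib combines banding with Myers's bit-vector algorithm, while this sketch uses a plain dynamic-programming table. The names `banded_edit_distance` and `adaptive_edit_distance` are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Edit distance restricted to a diagonal band of half-width k.
// Returns the distance if it is <= k, otherwise -1.
int banded_edit_distance(const std::string& a, const std::string& b, int k) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    if (std::abs(n - m) > k) return -1;  // the end cell lies outside the band
    const int INF = k + 1;               // any value > k means "too costly"
    std::vector<int> prev(m + 1, INF), curr(m + 1, INF);
    for (int j = 0; j <= std::min(m, k); ++j) prev[j] = j;  // row 0
    for (int i = 1; i <= n; ++i) {
        const int lo = std::max(0, i - k);  // first column inside the band
        const int hi = std::min(m, i + k);  // last column inside the band
        if (lo > 0) curr[lo - 1] = INF;     // guard cell left of the band
        for (int j = lo; j <= hi; ++j) {
            if (j == 0) { curr[0] = std::min(i, INF); continue; }
            const int sub = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            const int del = prev[j] + 1;
            const int ins = curr[j - 1] + 1;
            curr[j] = std::min({sub, del, ins, INF});  // cap at INF
        }
        if (hi < m) curr[hi + 1] = INF;     // guard cell right of the band
        std::swap(prev, curr);
    }
    return prev[m] <= k ? prev[m] : -1;
}

// Adaptive banding: start with a narrow band and double it until the
// distance fits. Terminates because the distance is at most max(n, m).
int adaptive_edit_distance(const std::string& a, const std::string& b) {
    for (int k = 1;; k *= 2) {
        const int d = banded_edit_distance(a, b, k);
        if (d >= 0) return d;
    }
}
```

For near-exact overlaps the first few band widths already succeed, so each attempt touches only about (2k + 1) cells per row rather than the full table, which is consistent with the good performance seen on long exact overlaps.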

ekg commented 4 years ago

This now switches from ssw to edlib when the overlap is longer than 256 bp (threshold configurable via -m). My testing suggests it works fine.
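
A minimal sketch of the length-based switch described above, assuming ssw handles overlaps up to the threshold and edlib handles anything longer. The enum, the function name `choose_aligner`, and the parameter `max_ssw_len` are hypothetical and are not taken from the gimbricate source; only the 256 bp default comes from this thread.

```cpp
#include <cstddef>

// Hypothetical labels for the two alignment backends.
enum class Aligner { SSW, Edlib };

// Pick a backend from the overlap length. The default threshold mirrors
// the 256 bp value mentioned in this thread (set via -m in the tool).
Aligner choose_aligner(std::size_t overlap_len, std::size_t max_ssw_len = 256) {
    return overlap_len > max_ssw_len ? Aligner::Edlib : Aligner::SSW;
}
```

With the default threshold, `choose_aligner(256)` selects ssw and `choose_aligner(257)` selects edlib; raising the threshold keeps more overlaps on the ssw path.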