ekg / seqwish

alignment to variation graph inducer
MIT License

output repeat-max elided bases to improve runtime #41

Closed ekg closed 4 years ago

ekg commented 4 years ago

As of the last commit, repeat-max'ed bases are not written until the next loop through the transitive closure processing. This is really inefficient, because we have to keep exploring the links in the implicit interval tree to re-find them. This PR fixes that by putting them in a list of bases to emit later.
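A minimal sketch of the idea, as a toy Python model rather than seqwish's actual C++ code. It assumes a closure is just a list of `(seq_name, offset)` bases, that `repeat_max` caps how many bases from one sequence may join a single closure, and that each elided base is emitted as its own singleton node. The name `emit_closures` and the data layout are hypothetical.

```python
from collections import Counter

def emit_closures(closures, repeat_max):
    """Toy model: emit one node per closure, deferring over-limit bases.

    `closures` is a list of lists of (seq_name, offset) bases that the
    transitive closure grouped together (hypothetical layout).
    Returns the emitted nodes in order.
    """
    emitted = []
    for bases in closures:
        per_seq = Counter()  # bases accepted so far, per sequence
        kept = []            # bases that stay in this closure's node
        elided = []          # over-limit bases, remembered instead of re-found
        for base in bases:
            seq_name = base[0]
            if per_seq[seq_name] < repeat_max:
                per_seq[seq_name] += 1
                kept.append(base)
            else:
                elided.append(base)
        emitted.append(kept)
        # Emit the elided bases straight from the list; no further search
        # through the alignment links is needed to locate them again.
        # Assumption: each elided base becomes a singleton node.
        for base in elided:
            emitted.append([base])
    return emitted

closures = [
    [("a", 0), ("a", 5), ("b", 2), ("a", 9)],
    [("a", 1), ("b", 3)],
]
nodes = emit_closures(closures, repeat_max=2)
```

In this toy model the third base from sequence `a` exceeds the limit, so it is recorded once when it is detected and written out right after the closure that elided it.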

ekg commented 4 years ago

This changes the behavior on my tests, but the result seems more reasonable and less jumbled than with the previous version. I think it's better that the duplicated bases are emitted immediately after they're detected rather than via another iteration of the transitive closure step.