ratschlab / metagraph

Scalable annotated de Bruijn graphs for DNA indexing, alignment, and assembly
http://metagraph.ethz.ch
GNU General Public License v3.0
113 stars 17 forks source link

Use BOSS indexing in DBGSuccinct + make RowDiff independent #484

Open adamant-pwn opened 6 months ago

adamant-pwn commented 6 months ago

WIP pull request for CI.

adamant-pwn commented 4 months ago

Hi @hmusta, can you took a look here? I think it's still far from complete, but I'd still apreciate some input on the general direction to make sure that what I'm doing is aligned with what we want.

adamant-pwn commented 1 month ago

First of all, I'll wait for workflows to run through to make sure it still works as intended on DBGSuccinct. After that, I wonder if there is some simple enough way to test this with some other graph types. Some of my older notes from a few months ago say the following:

Would be nice to check whether it is still relevant.

hmusta commented 1 month ago

We can test this by modifying the following block in build_anno_graph to work with DeBruijnGraph: https://github.com/ratschlab/metagraph/blob/3da66a968c9ce757a3855ff67de8ea0abaec32a0/metagraph/tests/annotation/test_annotated_dbg_helpers.cpp#L92

Then, we can add instantiations for other graph types (let's start with DBGHashFast for now) with RowDiff annotations in these places: https://github.com/ratschlab/metagraph/blob/3da66a968c9ce757a3855ff67de8ea0abaec32a0/metagraph/tests/annotation/test_annotated_dbg_helpers.cpp#L245 https://github.com/ratschlab/metagraph/blob/3da66a968c9ce757a3855ff67de8ea0abaec32a0/metagraph/tests/annotation/test_annotated_dbg.cpp#L1002 https://github.com/ratschlab/metagraph/blob/3da66a968c9ce757a3855ff67de8ea0abaec32a0/metagraph/tests/annotation/test_annotated_dbg.cpp#L1015 https://github.com/ratschlab/metagraph/blob/3da66a968c9ce757a3855ff67de8ea0abaec32a0/metagraph/tests/annotation/test_aligner_labeled.cpp#L53

adamant-pwn commented 1 month ago

I hopefully fixed all unit tests. It seems that some integration tests still fail, so I'll look into it further over the next days. In the meantime I think it'd be nice to try to get it reviewed and merged, as the change becomes large and difficult to manage within one PR. Then, the tests for other graphs can be added in a separate PR.