yangao07 / abPOA

abPOA: an SIMD-based C library for fast partial order alignment using adaptive band
MIT License

abPOA as a library -- adding nodes to the graph #28

Closed HopedWall closed 1 year ago

HopedWall commented 3 years ago

Hello, I am trying to use abpoa as a dependency for one of my projects (https://github.com/HopedWall/rs-vgaligner), where I need to align a (sub)graph against some query sequences.

In my project each node is labelled with a sequence, so I'm trying to replicate the same graph structure in abpoa. In the abpoa.h file, I noticed the abpoa_add_graph_node function, however it seems to only accept a single base. Is there any way to create nodes with longer sequences?

Thank you!

yangao07 commented 3 years ago

The nodes in an abPOA graph are all 1-base nodes. In your case, you can add the bases one by one; this is also how abPOA creates the initial POA graph from the very first query sequence.

The code here (L1108) shows how to add a query sequence to an empty graph.
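
As a rough illustration of the "one base per node" idea, here is a minimal Rust sketch that turns a sequence into a chain of single-base nodes. It does not call abPOA at all: `encode_base` and `seq_to_chain` are hypothetical helper names, and the 0..=4 base encoding is an assumption that should be checked against the abPOA headers before it is used with the real library. The graph's own bookkeeping (for example any source or sink nodes) is left out.

```rust
/// Hypothetical base encoding (an assumption, not copied from abPOA):
/// A/C/G/T map to 0..=3 and every other character maps to 4.
fn encode_base(b: u8) -> u8 {
    match b.to_ascii_uppercase() {
        b'A' => 0,
        b'C' => 1,
        b'G' => 2,
        b'T' => 3,
        _ => 4,
    }
}

/// Split a sequence into one encoded base per node and link consecutive nodes.
/// Returns (node_bases, edges): node i holds base i of the sequence, and each
/// edge is a (from, to) pair of node indices.
fn seq_to_chain(seq: &str) -> (Vec<u8>, Vec<(usize, usize)>) {
    let bases: Vec<u8> = seq.bytes().map(encode_base).collect();
    // For an empty or 1-base sequence the range is empty, so no edges are made.
    let edges: Vec<(usize, usize)> = (1..bases.len()).map(|i| (i - 1, i)).collect();
    (bases, edges)
}
```

With the real library, each entry of `node_bases` would correspond to one call that adds a node and each pair in `edges` to one call that adds an edge.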

Yan

HopedWall commented 3 years ago

Thanks Yan, I'll try using the function you suggested! Will keep you updated.

By the way, I also created FFI bindings for abPOA in Rust, so that even more people can use this awesome tool. Check them out at this link: https://github.com/HopedWall/rs-abPOA. Let me know what you think!

yangao07 commented 3 years ago

Thanks so much for writing the Rust repo for abPOA! Great work!

I am still a new learner of Rust, but I think this is really nice and will make abPOA more widely used. I will try it out sometime and maybe provide some suggestions then.

Yan

HopedWall commented 3 years ago

I've managed to build a simple graph via the abpoa_add_graph_node and abpoa_add_graph_edge functions; however, when I try to align a new sequence with abpoa_align_sequence_to_graph I get a segfault. You can check the code here.

Am I doing something wrong?

[edit] I just noticed that I was passing new_seq instead of new_bseq when aligning the sequence, however changing that did not fix the issue.

yangao07 commented 3 years ago

If all your sequences need to be aligned with abPOA, you can simply feed them to abpoa_msa, so that it will handle all the sequences. The very first sequence can be added manually because there is no graph to be aligned yet. However, for the second sequence, it has to be aligned to the graph (which only consists of the first sequence) and then added to the graph based on the alignment result. I hope this is clear to you.

yangao07 commented 3 years ago

If you have a graph that needs to be transformed to an abPOA graph, you have to add each node one by one, following a specific order, say DFS or BFS. You should not simply add the linear sequence to the abPOA graph.
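
One way to get such an order is a BFS-style topological sort (Kahn's algorithm), sketched below in Rust. This assumes that what matters is that every node is visited after all of its predecessors; that reading is an interpretation of the advice above, not something confirmed by the abPOA source. The function name `topological_order` is hypothetical and the code does not touch abPOA itself.

```rust
use std::collections::VecDeque;

/// Kahn's algorithm: returns the node indices so that every node appears
/// after all of its predecessors, or None if the edges contain a cycle.
fn topological_order(n_nodes: usize, edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); n_nodes];
    let mut in_degree: Vec<usize> = vec![0; n_nodes];
    for &(from, to) in edges {
        successors[from].push(to);
        in_degree[to] += 1;
    }
    // Start from the nodes that have no incoming edges.
    let mut queue: VecDeque<usize> = (0..n_nodes).filter(|&v| in_degree[v] == 0).collect();
    let mut order = Vec::with_capacity(n_nodes);
    while let Some(v) = queue.pop_front() {
        order.push(v);
        for &w in &successors[v] {
            in_degree[w] -= 1;
            // A node becomes ready once all of its predecessors are placed.
            if in_degree[w] == 0 {
                queue.push_back(w);
            }
        }
    }
    if order.len() == n_nodes {
        Some(order)
    } else {
        None
    }
}
```

Walking the returned order and adding nodes (and then their incoming edges) in that sequence means no edge ever points at a node that has not been created yet.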

HopedWall commented 3 years ago

Thanks again for your answers @yangao07.

Unfortunately my example wasn't really clear, but the sequences that I was trying to add were the labels of the nodes. I'm working with variation graphs, you can find an example here.

More specifically, I'm trying to align query sequences to such graphs. So my idea was:

  1. convert each node of the variation graph into multiple abpoa nodes (in my code the "nodes" were seq and seq2)
  2. perform an alignment between "new" query sequences and the graph made in step 1 (in my code the query was new_seq)

Do you think this could work? If so, should I use different functions?

yangao07 commented 3 years ago
  • convert each node of the variation graph into multiple abpoa nodes (in my code the "nodes" were seq and seq2)

Then you can try it this way:

If you have a graph that needs to be transformed to an abPOA graph, you have to add each node one by one, following a specific order, say DFS or BFS. You should not simply add the linear sequence to the abPOA graph.

Make sure you transformed every node and every edge so that you won't get an error when performing the alignment in the 2nd step.

Overall, your whole idea should work.
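
For readers following along, the conversion in step 1 can be sketched in plain Rust as below. The types `SeqNode` and `BaseGraph` and the function `expand` are hypothetical names made up for this illustration; nothing here calls abPOA, bases are kept as raw ASCII bytes, and every node is assumed to carry a non-empty sequence. Each sequence-labelled node becomes a chain of 1-base nodes, and each original edge (u, v) becomes an edge from the last base of u to the first base of v.

```rust
/// A variation-graph node labelled with a whole sequence (hypothetical type).
struct SeqNode {
    seq: String,
}

/// Result of the expansion: one base per node plus edges between base nodes.
struct BaseGraph {
    bases: Vec<u8>,
    edges: Vec<(usize, usize)>,
}

/// Expand sequence-labelled nodes into 1-base nodes.
/// Panics if a node has an empty sequence or an edge refers to a missing node.
fn expand(nodes: &[SeqNode], edges: &[(usize, usize)]) -> BaseGraph {
    let mut bases: Vec<u8> = Vec::new();
    let mut out_edges: Vec<(usize, usize)> = Vec::new();
    let mut first: Vec<usize> = Vec::with_capacity(nodes.len());
    let mut last: Vec<usize> = Vec::with_capacity(nodes.len());
    for node in nodes {
        assert!(!node.seq.is_empty(), "empty node labels are not handled");
        let start = bases.len();
        bases.extend(node.seq.bytes());
        let end = bases.len() - 1;
        // Chain the bases that belong to the same original node.
        for i in start..end {
            out_edges.push((i, i + 1));
        }
        first.push(start);
        last.push(end);
    }
    // Reconnect the original edges between the expanded chains.
    for &(u, v) in edges {
        out_edges.push((last[u], first[v]));
    }
    BaseGraph { bases, edges: out_edges }
}
```

Checking that the expanded graph has one node per base and one edge per original edge plus the intra-node links is a cheap way to confirm that nothing was dropped before attempting an alignment.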

hangsuUNC commented 1 month ago

Hi,

Thanks for this wonderful tool and the Rust bindings! I was trying to use the Rust bindings of abPOA in my own project. The code compiles fine, but when it calls consensus_from_seqs, it hits a segmentation fault.

Here is the Rust code that runs this step:

    let sequences: Vec<&str> = input_seq.iter().map(AsRef::as_ref).collect::<Vec<_>>();
    let mut aligner = unsafe { ab_poa::abpoa_wrapper::AbpoaAligner::new_with_example_params() };
    let consensus = unsafe {
        aligner.consensus_from_seqs(&sequences)
    };

Could you please suggest how to resolve this?

Thanks so much,

Hang Su