yangao07 / abPOA

abPOA: an SIMD-based C library for fast partial order alignment using adaptive band
MIT License

abPOA prune graph #71

Open cgroza opened 4 months ago

cgroza commented 4 months ago

Hi,

Is it possible to delete nodes from a graph, then remap reads to it?

If so, could I have a small snippet showing how? It's not clear to me which fields I need to update when I delete a node.

yangao07 commented 4 months ago

Hi, do you have an example case where specific nodes need to be deleted?

cgroza commented 4 months ago

Error correction of long reads with variation graphs, as in VeChat: https://www.biorxiv.org/content/10.1101/2022.01.30.478352v1

yangao07 commented 4 months ago

Thanks! The overall idea of this paper looks interesting. abPOA does not support APIs for deleting nodes right now.

Diploid input reads are indeed a promising scenario for abPOA. We may implement something specific for it, including deleting nodes.

cgroza commented 4 months ago

That's great. In the meantime, any pointers on how I could implement this myself?

yangao07 commented 4 months ago

Sure, you are very welcome to implement it! Here is the code to add a node/edge. To remove a node/edge, you would need something similar.

cgroza commented 4 months ago

Hi, thank you for the pointers.

I did manage to remove nodes and edges and the GFA output shows that the graph looks good. I tried to align reads to the pruned graph and this also works fine in most cases.

However, sometimes when I prune the graph and then run the topological sort, I get the error: "Failed to set node index".

I tried to read the source code, but no obvious cause presents itself. Am I forgetting to update some fields?

yangao07 commented 4 months ago

In abPOA, graph updating includes adding edges, adding nodes, and adding aligned nodes (for mismatch bases). So, to remove a node, you may also need to remove everything associated with it, including its edges and aligned nodes. I am not 100% sure, but did you remove the aligned_node_id for those nodes?
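
As a rough illustration of what removing the aligned-node references could involve, here is a minimal sketch. It uses a simplified stand-in struct (`toy_aln_node_t`) and hypothetical helpers (`toy_align`, `toy_unlink_aligned`), none of which are abPOA's actual types or API. It also assumes that aligned nodes reference each other symmetrically, which should be checked against the abPOA source.

```c
#include <assert.h>

#define MAX_ALN 8

/* Simplified stand-in for a graph node; NOT the real abpoa_node_t. */
typedef struct {
    int aligned_node_n;            /* number of nodes aligned to this one */
    int aligned_node_id[MAX_ALN];  /* their node ids */
} toy_aln_node_t;

/* Hypothetical helper: record that nodes a and b are aligned (both directions). */
static void toy_align(toy_aln_node_t *nodes, int a, int b) {
    assert(nodes[a].aligned_node_n < MAX_ALN && nodes[b].aligned_node_n < MAX_ALN);
    nodes[a].aligned_node_id[nodes[a].aligned_node_n++] = b;
    nodes[b].aligned_node_id[nodes[b].aligned_node_n++] = a;
}

/* Remove `target` from one node's aligned list, keeping the remaining order.
 * Returns 1 if an entry was removed, 0 otherwise. */
static int toy_drop_aligned_ref(toy_aln_node_t *node, int target) {
    for (int i = 0; i < node->aligned_node_n; ++i) {
        if (node->aligned_node_id[i] != target) continue;
        for (int j = i + 1; j < node->aligned_node_n; ++j)
            node->aligned_node_id[j - 1] = node->aligned_node_id[j];
        --node->aligned_node_n;
        return 1;
    }
    return 0;
}

/* Unlink node `del` from every node it is aligned to, then clear its own list. */
static void toy_unlink_aligned(toy_aln_node_t *nodes, int del) {
    for (int i = 0; i < nodes[del].aligned_node_n; ++i)
        toy_drop_aligned_ref(&nodes[nodes[del].aligned_node_id[i]], del);
    nodes[del].aligned_node_n = 0;
}
```

Under these assumptions, the aligned node IDs to clean up for a given node are simply the entries of that node's own `aligned_node_id` list: each listed node holds a back-reference that has to be dropped before the node itself is discarded.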

cgroza commented 4 months ago

I did not remove anything related to aligned_node_id. I don't understand that part of the data structure.

How would I identify the aligned node IDs to be removed, given a node ID?

yangao07 commented 4 months ago

Actually, I think the aligned nodes may not be the issue. Did you also update out_edge_n/out_id/in_edge_n/in_id for the nodes adjacent to the deleted nodes?
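
To make the neighbour cleanup concrete, here is a minimal sketch on a simplified stand-in struct. `toy_node_t` and all `toy_*` functions are hypothetical and are not abPOA's real types or API; only the field names are taken from this thread. The last function is a Kahn-style walk showing one plausible way stale counts could stall a topological sort, assuming the sort is driven by in-degree counts.

```c
#include <assert.h>

#define MAX_EDGE 8
#define MAX_NODE 64

/* Simplified stand-in for a graph node; NOT the real abpoa_node_t. */
typedef struct {
    int in_edge_n,  in_id[MAX_EDGE];   /* incoming edges: count and source ids */
    int out_edge_n, out_id[MAX_EDGE];  /* outgoing edges: count and target ids */
} toy_node_t;

/* Remove the first occurrence of `target` from ids[0..*n), keeping order. */
static int toy_drop_id(int *ids, int *n, int target) {
    for (int i = 0; i < *n; ++i) {
        if (ids[i] != target) continue;
        for (int j = i + 1; j < *n; ++j) ids[j - 1] = ids[j];
        --*n;
        return 1;
    }
    return 0;
}

/* Add edge from -> to, updating both endpoints. */
static void toy_add_edge(toy_node_t *g, int from, int to) {
    assert(g[from].out_edge_n < MAX_EDGE && g[to].in_edge_n < MAX_EDGE);
    g[from].out_id[g[from].out_edge_n++] = to;
    g[to].in_id[g[to].in_edge_n++] = from;
}

/* Detach node `del`: fix the neighbours' lists first, then clear its own. */
static void toy_remove_node(toy_node_t *g, int del) {
    for (int i = 0; i < g[del].in_edge_n; ++i) {
        toy_node_t *pre = &g[g[del].in_id[i]];
        toy_drop_id(pre->out_id, &pre->out_edge_n, del);
    }
    for (int i = 0; i < g[del].out_edge_n; ++i) {
        toy_node_t *nxt = &g[g[del].out_id[i]];
        toy_drop_id(nxt->in_id, &nxt->in_edge_n, del);
    }
    g[del].in_edge_n = g[del].out_edge_n = 0;
}

/* Kahn-style walk from `src`: a node is visited only once all of its
 * recorded incoming edges have been consumed. Returns 1 if `sink` is reached. */
static int toy_reaches_sink(const toy_node_t *g, int n, int src, int sink) {
    int pending[MAX_NODE], queue[MAX_NODE + 1], head = 0, tail = 0;
    assert(n <= MAX_NODE);
    for (int i = 0; i < n; ++i) pending[i] = g[i].in_edge_n;
    queue[tail++] = src;
    while (head < tail) {
        int cur = queue[head++];
        if (cur == sink) return 1;
        for (int i = 0; i < g[cur].out_edge_n; ++i) {
            int nxt = g[cur].out_id[i];
            if (--pending[nxt] == 0) queue[tail++] = nxt;
        }
    }
    return 0;
}
```

In this toy model, if only the deleted node's own lists are cleared while a successor still counts it in `in_edge_n`, that successor's pending count never reaches zero and the walk stops before the sink. Cleaning both sides of every edge, as `toy_remove_node` does, avoids that.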