marschall-lab / panacus

Panacus is a tool for computing statistics for GFA-formatted pangenome graphs
MIT License

check both start and end for `coords` #6

Closed AndreaGuarracino closed 1 year ago

AndreaGuarracino commented 1 year ago

This fix avoids

thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/graph.rs:362:49
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

and helped me in debugging https://github.com/pangenome/pggb/issues/299.

danydoerr commented 1 year ago

Thanks for 1) finding the bug and 2) suggesting a fix! I have a bad feeling that there's more to it than this, because if a path segment has an end, it must also have a start. Can you provide me with the graph?

danydoerr commented 1 year ago

@AndreaGuarracino Actually, I just need to know one path ID to see what went wrong in parsing it :)

AndreaGuarracino commented 1 year ago

Enjoy!

gunzip smoothxg_block_0.gfa.gz

panacus table -a -c edge smoothxg_block_0.gfa | grep -e '\s0$'
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/graph.rs:362:49
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

smoothxg_block_0.gfa.gz

danydoerr commented 1 year ago

Awesome, thanks!

danydoerr commented 1 year ago

@AndreaGuarracino Question: How do I have to interpret these coordinates: HG02572#1#JAHAOW010000052.1:1087990-1165222_77032?

AndreaGuarracino commented 1 year ago

HG02572#1#JAHAOW010000052.1:1087990-1165222 is the original name of the path in the input graph for smoothxg (here I am smoothing a subgraph for debugging, so there is a coordinate range in the name, namely 1087990-1165222). The second value, 77032, indicates the starting coordinate of a range within HG02572#1#JAHAOW010000052.1:1087990-1165222 (smoothxg applies POA to graph blocks, which correspond to pieces of paths). So, in general, the format is ORIGINALPATHNAME_START.
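For anyone reading along, here is a rough sketch of how a name in the ORIGINALPATHNAME_START form could be taken apart. This is not the parser in panacus' `src/graph.rs`; the struct `BlockPathName` and the function `parse_block_path_name` are hypothetical names made up for illustration. It assumes the block start is whatever follows the last `_`, and that a range, if present, follows the last `:` of the original name. It returns `None` instead of calling `unwrap()` when a name does not fit.

```rust
/// Hypothetical container for the parts of a smoothxg block path name
/// (illustrative only, not a panacus type).
#[derive(Debug, PartialEq)]
struct BlockPathName {
    /// Original path name, e.g. "HG02572#1#JAHAOW010000052.1:1087990-1165222".
    original: String,
    /// Trailing value after the last '_' (the START in ORIGINALPATHNAME_START).
    block_start: u64,
    /// Optional START-END range found after the last ':' of the original name.
    range: Option<(u64, u64)>,
}

/// Splits a name of the form ORIGINALPATHNAME_START.
/// Returns None if there is no '_' or the trailing value is not a number.
fn parse_block_path_name(name: &str) -> Option<BlockPathName> {
    // Assumption: the block start follows the LAST underscore.
    let (original, start) = name.rsplit_once('_')?;
    let block_start = start.parse::<u64>().ok()?;

    // The range is optional: only subgraph path names carry one.
    let range = original.rsplit_once(':').and_then(|(_, r)| {
        let (s, e) = r.split_once('-')?;
        Some((s.parse::<u64>().ok()?, e.parse::<u64>().ok()?))
    });

    Some(BlockPathName {
        original: original.to_string(),
        block_start,
        range,
    })
}
```

Calling `parse_block_path_name("HG02572#1#JAHAOW010000052.1:1087990-1165222_77032")` gives a `block_start` of 77032 and a `range` of `Some((1087990, 1165222))`. A name without a trailing numeric suffix yields `None`, so the caller can decide how to handle it.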