asl / BandageNG

a Bioinformatics Application for Navigating De novo Assembly Graphs Easily
GNU General Public License v3.0
112 stars 10 forks source link

segmentation fault on branch walks #150

Closed dirkjanvw closed 1 month ago

dirkjanvw commented 6 months ago

As mentioned in #149 I get a segmentation fault on branch walks, which I don't get on branch dev with my Macbook Pro. Example GFA file I obtained from the Minigraph-Cactus pipeline can be found here: mc-pangenome.gfa.gz. Let me know if there is anything else I can provide to figure out why this happens.

asl commented 6 months ago

Ok, so the problem is that W line appear before L. As we are doing stream parsing we build graph on-fly. Therefore at the time of W record processing there are no edges between segments / nodes. I will fix, so the proper message / exception will be thrown.

asl commented 6 months ago

We do not segfault after https://github.com/asl/BandageNG/commit/a6adcc353e50cb16b17bae477a06582c94818672

However, the producer of this GFA should be fixed. Per GFA spec – "A valid walk must exist in the graph". We treat this quite strict as otherwise we will need essentially to save the copy of entire GFA before processing.

dirkjanvw commented 6 months ago

Thank you for the quick fix! I now "sorted" the graph by putting all W-lines at the end of the file, I think that should solve the processing issue? However, it doesn't complete parsing. I let BandageNG info mc-pangenome.sorted.gfa run for an hour before I killed the process. Were you able to parse this GFA file in reasonable time?

asl commented 6 months ago

Can you attach the new file, so I can check?

dirkjanvw commented 6 months ago

Of course: mc-pangenome.sorted.gfa.gz I had to gzip the file, wheareas I ran BandageNG info with the uncompressed version, assuming this has no effect.

asl commented 5 months ago

Of course: mc-pangenome.sorted.gfa.gz I had to gzip the file, wheareas I ran BandageNG info with the uncompressed version, assuming this has no effect.

Sorry, somehow I forgot to respond. Looks like it was me reading GFA spec improperly. Need to push the fix

asl commented 5 months ago

Should be fixed in 24907eb

dirkjanvw commented 5 months ago

Thanks a lot, it works for me now!