marbl / MetagenomeScope

Visualization tool for (meta)genome assembly graphs
https://marbl.github.io/MetagenomeScope/
GNU General Public License v3.0

Support duplicate (aka parallel) edges in GFA / FASTG files? #239

Open fedarko opened 1 year ago

fedarko commented 1 year ago

I'm working on #75 right now: the parsing functions have been updated, but I still need to update the surrounding parts of the codebase. At least in this branch, LastGraph, GML, and DOT files containing duplicate edges are all supported. The parsing functions for these filetypes now return nx.MultiDiGraph objects, rather than nx.DiGraph objects as before.
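For readers less familiar with the difference between the two NetworkX classes, here is a minimal sketch. The edge list is made up for illustration and is not taken from MetagenomeScope's parsers; it only shows that nx.DiGraph collapses a repeated edge while nx.MultiDiGraph keeps each copy under its own key.

```python
import networkx as nx

# Hypothetical edge list with one duplicate (parallel) edge from "1" to "2".
edges = [("1", "2"), ("1", "2"), ("2", "3")]

# A DiGraph silently collapses the duplicate into a single edge.
simple = nx.DiGraph()
simple.add_edges_from(edges)

# A MultiDiGraph keeps both copies, distinguishing them by an integer key.
multi = nx.MultiDiGraph()
multi.add_edges_from(edges)

simple_edge_count = simple.number_of_edges()
multi_edge_count = multi.number_of_edges()
parallel_count = multi.number_of_edges("1", "2")
multi_edge_keys = sorted(multi.edges(keys=True))

print(simple_edge_count, multi_edge_count, parallel_count)
print(multi_edge_keys)
```

Because the keys are part of each edge's identity in a MultiDiGraph, downstream code that used to iterate over `(u, v)` pairs generally has to handle `(u, v, key)` triples instead.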

The GFA and FASTG parsing functions also return nx.MultiDiGraph objects, but they delegate the actual parsing to other libraries (GfaPy and pyfastg, respectively), neither of which allows duplicate edges. This means that, although we could in principle support duplicate edges for these filetypes, running MetagenomeScope on a GFA or FASTG file that contains them will lead to an error being raised by the library to which we delegate.
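To make "duplicate edges in a GFA file" concrete, here is a rough sketch that scans GFA1 text for repeated link (L) lines using only the standard library. It does not use GfaPy and does not reflect how GfaPy detects or reports duplicates. The function name `find_duplicate_links` is hypothetical, and treating two links as duplicates when their from/to segment names and orientations all match is an assumption (it ignores the overlap field and reverse-complement equivalents).

```python
from collections import Counter


def find_duplicate_links(gfa_text):
    """Return (from, from_orient, to, to_orient) tuples seen on more than one L line.

    Assumption: two links count as duplicates when these four fields match exactly.
    """
    counts = Counter()
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        # GFA1 link lines: L <from> <from_orient> <to> <to_orient> <overlap> ...
        if fields[0] == "L" and len(fields) >= 5:
            counts[tuple(fields[1:5])] += 1
    return sorted(link for link, n in counts.items() if n > 1)


# Made-up GFA1 content with one repeated link from segment 1 to segment 2.
example = "\n".join([
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tGGCC",
    "L\t1\t+\t2\t+\t0M",
    "L\t1\t+\t2\t+\t0M",
    "L\t2\t+\t1\t-\t0M",
])

duplicates = find_duplicate_links(example)
print(duplicates)
```

A check like this could be used to find test files that trigger the problem, without depending on either parsing library.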

I don't think addressing this is urgent or important, since I haven't seen many GFA or FASTG files containing duplicate edges (I'm not even sure how you'd express duplicate edges in SPAdes-dialect FASTG files). I'm just documenting this issue here, separately from #75, so that this thread stays open once #75 is addressed.