Closed by rlorigro 1 year ago
Ok, this is an outcome of how we're matching path names in CSV. Path names are matched by prefix, so in your particular case the node name t matched against the paths tip.0 and tip.1. I do not recall why we matched paths before nodes. The current workaround is to rename or remove the paths.
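To illustrate the behaviour described above, here is a minimal Python sketch of prefix-based lookup that checks paths before nodes. It is not the actual BandageNG code; the function name `match_csv_name` and its return shape are made up for illustration only.

```python
def match_csv_name(name, path_names, node_names):
    """Hypothetical model of the lookup: paths first (by prefix), then nodes."""
    # A CSV name matches every path whose name starts with it.
    matched_paths = [p for p in path_names if p.startswith(name)]
    if matched_paths:
        return ("paths", matched_paths)
    # Nodes are only consulted when no path matched.
    if name in node_names:
        return ("node", [name])
    return ("none", [])


# Node "t" exists, but the prefix check picks up both tip paths first.
result = match_csv_name("t", ["tip.0", "tip.1"], {"t", "a"})
print(result)
```

With this ordering the node t is never reached, which matches the symptom reported in this issue.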
I will revise the logic here.
Ah, yes, I remember why we're checking paths first. Unfortunately, when exporting to FASTA, Bandage exports node names in a format like NODE_6+_length_50434_cov_42.3615, which obviously clashes with the standard naming of paths in e.g. SPAdes assembly graphs. While querying for a node name, it transforms NODE_6+_length_50434_cov_42.3615 into just 6+. This logic could obviously transform a path name into some valid node name and do the wrong thing.
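A rough Python sketch of the name transformation mentioned above, to show how the clash can arise. The regular expression and the function name `simplify_fasta_node_name` are assumptions for illustration; the real Bandage parsing may differ in details.

```python
import re

# Assumed pattern for FASTA-exported names such as NODE_6+_length_50434_cov_42.3615.
_FASTA_NAME = re.compile(r"NODE_(\d+[+-]?)_length_\d+_cov_[\d.]+")


def simplify_fasta_node_name(name):
    """Hypothetical helper: reduce an exported name to the bare node name."""
    m = _FASTA_NAME.fullmatch(name)
    # Names that do not look like an export are returned unchanged.
    return m.group(1) if m else name


print(simplify_fasta_node_name("NODE_6+_length_50434_cov_42.3615"))
# A path that happens to follow a similar naming scheme is reduced too,
# and the result may coincide with an unrelated node name.
print(simplify_fasta_node_name("NODE_6_length_50434_cov_42.3615"))
```

If this transformation ran before the path lookup, a path name of that shape could silently turn into a node name, which is why paths were checked first.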
Probably, the only sane solution is to require explicit path name, not just prefix check.
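As a sketch of what requiring an explicit path name could look like (again hypothetical Python, not the BandageNG implementation; `match_csv_name_exact` is an illustrative name):

```python
def match_csv_name_exact(name, path_names, node_names):
    """Hypothetical lookup: a path matches only on its full, exact name."""
    # Paths are still checked first, but a prefix is no longer enough.
    if name in path_names:
        return ("path", name)
    if name in node_names:
        return ("node", name)
    return ("none", None)


paths = {"tip.0", "tip.1"}
nodes = {"t", "a"}
print(match_csv_name_exact("t", paths, nodes))      # resolves to the node
print(match_csv_name_exact("tip.0", paths, nodes))  # resolves to the path
```

Under this assumption the node t from the report would no longer be shadowed by tip.0 and tip.1.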
Hey @asl could this potentially be added to a new release? Perhaps you could switch from monthly to a yearly release schedule :)
@rlorigro We're having rolling releases these days: https://github.com/asl/BandageNG/releases/tag/continuous
So you can always grab the latest snapshot.
The same GFA and CSV produce different results in OG vs NG Bandage:
Bandage NG
Original Bandage
CSV: https://rlorigro-public-files.s3.us-west-1.amazonaws.com/gfase/test_gfa/chainable_nodes.csv
GFA: https://rlorigro-public-files.s3.us-west-1.amazonaws.com/gfase/test_gfa/chain_test.gfa