jts / sga

de novo sequence assembler using string graphs
http://genome.cshlp.org/content/22/3/549

Support Graphical Fragment Assembly (GFA) format #80

Closed (sjackman closed this issue 8 years ago)

sjackman commented 9 years ago

Hi, Jared. Any plans to implement support for GFA in SGA? The next release of ABySS will include support for GFA.

sjackman commented 9 years ago

Somewhat related: ABySS includes support for ASQG. It currently uses XC for the sum of the k-mer coverage of a contig. May I use KC instead? Or, more generally, can we declare that all segment/edge properties of GFA are also valid in ASQG?

See http://lh3.github.io/2014/07/23/first-update-on-gfa/#comment-1574499562
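
To make the KC/XC question concrete, here is a minimal sketch of a reader that accepts either tag. It assumes that both GFA `S` records and ASQG `VT` records are tab-separated, with the name and sequence in columns 2 and 3 followed by SAM-style `TAG:TYPE:VALUE` optional fields. The function names `parse_tags` and `kmer_coverage` are hypothetical and are not part of SGA or ABySS.

```python
def parse_tags(fields):
    """Parse SAM-style TAG:TYPE:VALUE optional fields into a dict."""
    casts = {"i": int, "f": float}  # other types are kept as strings
    tags = {}
    for field in fields:
        tag, typ, value = field.split(":", 2)
        tags[tag] = casts.get(typ, str)(value)
    return tags


def kmer_coverage(line):
    """Return the k-mer coverage of a GFA 'S' or ASQG 'VT' record.

    Prefers the GFA-style KC tag and falls back to XC.
    Returns None when neither tag is present.
    """
    cols = line.rstrip("\n").split("\t")
    if cols[0] not in ("S", "VT"):
        raise ValueError("not a segment/vertex record: " + cols[0])
    # Assumption: columns are record type, name, sequence, then tags.
    tags = parse_tags(cols[3:])
    return tags.get("KC", tags.get("XC"))


if __name__ == "__main__":
    print(kmer_coverage("S\tctg1\tACGT\tLN:i:4\tKC:i:120"))
    print(kmer_coverage("VT\tctg1\tACGT\tXC:i:77"))
```

Accepting both tags on input would let a tool switch its output to KC without breaking files that were written with XC.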

jts commented 9 years ago

Hi Shaun,

I want to support GFA but I've had very little coding time recently. I'll try to get around to it but it won't be for a few weeks at the earliest. Sorry.

jts commented 9 years ago

For your second question, I'm fine with KC and in general will support GFA properties in ASQG.

I don't expect many naming clashes between ASQG and GFA. There are few reserved tags in ASQG anyway. I have an internal build of SGA that outputs CG and PI for the CIGAR string and percent identity.

sjackman commented 9 years ago

No worries. It's not a big rush, but I am keen to see other implementations of GFA. GFA is the closest that we've come as a community to a standard for which we have some agreement. I really want to see it adopted. The more implementations of GFA, the more likely that is to happen.

sjackman commented 9 years ago

Why PI and not NM (number of mismatches)?

jts commented 9 years ago

I used PI for legacy reasons - it was the bit of information I needed to pass down the pipeline. I'm not tied to this tag and will change it if NM is the standard.

sjackman commented 9 years ago

Well there's no GFA standard yet—it's the wild west out here—but at least NM is a SAM standard.

Tag: NM, type: i. Edit distance to the reference, including ambiguous bases but excluding clipping
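
As a rough illustration of how the two tags relate, a percent identity can be derived from NM and the CIGAR string. This is only a sketch under one common definition (matching columns divided by all alignment columns, counting gaps); the thread does not say how SGA computes PI, and the function names here are hypothetical.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_columns(cigar):
    """Count alignment columns: matches, mismatches, insertions, deletions.

    Clipping (S, H), skipped regions (N) and padding (P) are excluded.
    """
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MID=X")


def percent_identity(cigar, nm):
    """Percent identity, assuming identity = (columns - NM) / columns."""
    cols = alignment_columns(cigar)
    if cols == 0:
        raise ValueError("CIGAR string has no alignment columns")
    return 100.0 * (cols - nm) / cols


if __name__ == "__main__":
    # 98 aligned bases plus a 2-base deletion, with edit distance 5.
    print(percent_identity("50M2D48M", 5))
```

Under this definition, storing NM together with the CIGAR string lets a downstream tool recompute an identity value, whereas the reverse conversion from PI alone loses the exact edit count.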

sjackman commented 8 years ago

The GFA spec has more-or-less stabilized: https://github.com/pmelsted/GFA-spec