ggonnella / gfapy

Gfapy: a flexible and extensible software library for handling sequence graphs in Python

gfa-convert: custom sl:i tag ought to be LN:i #5

Closed sjackman closed 7 years ago

sjackman commented 7 years ago

I'm guessing here that sl:i is sequence length. The standard name for this tag is LN:i. See https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md#optional-fields-2

Observed output

S   A   AAAAAAACGT  sl:i:10

Expected output

S   A   AAAAAAACGT  LN:i:10

ggonnella commented 7 years ago

It was LN originally. I changed it to sl because, according to the GFA2 specification, the content of the slen field does not necessarily have to be the length of the segment sequence. Maybe, if the sequence is available, I can output LN when slen agrees with the sequence length.
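The rule proposed in this comment can be sketched in plain Python, without using gfapy itself. The function name `segment_length_tag` is hypothetical and not part of the gfapy API, and the sketch assumes that `*` marks an absent sequence, as in the GFA specifications.

```python
def segment_length_tag(slen, sequence):
    """Pick the GFA1 tag that carries a GFA2 slen value.

    Hypothetical helper, not part of gfapy. LN:i is used only when the
    sequence is available and its length agrees with slen; in every other
    case the value stays in the custom sl:i tag.
    """
    if sequence != "*" and len(sequence) == slen:
        return "LN:i:{}".format(slen)
    return "sl:i:{}".format(slen)


if __name__ == "__main__":
    seq = "AAAAAAACGT"
    print(segment_length_tag(len(seq), seq))  # slen agrees with the sequence
    print(segment_length_tag(25, seq))        # slen disagrees with the sequence
    print(segment_length_tag(25, "*"))        # sequence not available
```

Under this rule a segment without a sequence would still get sl:i, which is the point the next comment argues against.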

sjackman commented 7 years ago

I never did like that bit of the GFA 2 spec. It's peculiar. I would go with LN:i all the same. If the sequence is not present, then I would use LN:i. If the sequence is present, then I wouldn't include any LN:i tag at all.

ggonnella commented 7 years ago

I do not like it either; I never understood the reason for it...

sjackman commented 7 years ago

> If the sequence is present, then I wouldn't include any LN:i tag at all.

On second thought, do you think it's helpful to always include the LN:i tag in GFA 1 output?

ggonnella commented 7 years ago

Yes, I think that if the information is there, one can output it.
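As a rough sketch of where the discussion ends up, the conversion of a single GFA2 segment line could look like the following. It works on the tab-separated text only and does not use gfapy; the function name `gfa2_segment_to_gfa1` is hypothetical. How a converter should handle an slen that disagrees with the sequence length is not settled in this thread, so keeping it in sl:i in that case is an assumption taken from the earlier comment.

```python
def gfa2_segment_to_gfa1(line):
    """Convert one GFA2 S line to a GFA1 S line (hypothetical helper).

    GFA2 layout: S <sid> <slen> <sequence> <tag>*
    GFA1 layout: S <name> <sequence> <tag>*
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4 or fields[0] != "S":
        raise ValueError("not a GFA2 segment line: {!r}".format(line))
    sid, slen, sequence = fields[1], int(fields[2]), fields[3]
    tags = fields[4:]
    # Assumption: output LN:i whenever the length information is usable,
    # i.e. the sequence is absent ("*") or its length agrees with slen.
    # A disagreeing slen is kept in the custom sl:i tag instead.
    if sequence == "*" or len(sequence) == slen:
        tags.append("LN:i:{}".format(slen))
    else:
        tags.append("sl:i:{}".format(slen))
    return "\t".join(["S", sid, sequence] + tags)


if __name__ == "__main__":
    print(gfa2_segment_to_gfa1("S\tB\t25\t*"))
    print(gfa2_segment_to_gfa1("S\tC\t12\tACGT"))
```

The sketch ignores the case where the GFA2 line already carries an LN tag of its own; a real converter would need to decide how to handle that.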