manulera closed this 2 months ago
Looks good to me, thanks @manulera!
Merged and published!
@tnrich where is this published? I don't see a new release for @teselagen/bio-parsers
https://www.npmjs.com/package/bio-parsers?activeTab=versions
@manulera hmm you're right. I'll look into actually getting it published hah
@manulera https://www.npmjs.com/package/@teselagen/bio-parsers seems like it is publishing fine. I think you're looking at the old deprecated version of bio-parsers
I'll try to update that one so it is clearer that it is no longer in use.
@manulera ok, deprecated that package on npm so it will hopefully be clearer in the future!
Hello @tnrich, I noticed there were a couple of errors in `genbankToJson`:

Incorrectly parsing composed features with single positions

Features like `join(1,3..4)` would be incorrectly parsed because the function used to read pairs of subsequent integers as the start and end of locations. This would incorrectly interpret this feature as if it were `join(1..3,4)`. A fix for this is proposed in the PR.

Origin-spanning features not being parsed correctly when reading gb files

This is a followup to #47.

According to gb rules, origin-spanning features are described as a join, e.g. `join(19..20,1)` in a circular sequence of length 20, which is equivalent to `{start: 18, end: 0}` in tesela's json. The previous fix from #47 was not enough, because when `parseFeatureLocation` is called inside `genbankToJson`, the sequence has not been parsed yet, so we cannot use the length of the sequence to know where the origin is.

What I have done is:
- Added a function `wrapOriginSpanningFeatures` that takes as an input the `.locations` array and merges joins like `join(19..20,1)`.
- In `genbankToJson`, this function is called in `endSeq > postProcessCurSeq > postProcessGenbankFeature`, which seemed to make sense.
- When using `parseFeatureLocation` as a standalone, if you pass the sequenceLength, `wrapOriginSpanningFeatures` is also called.

I have added tests for these cases as well.
Let me know if I should change something else.
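
For reference, here is a minimal sketch of the two behaviours described above. It is not the code in the PR: `parseLocationPartsSketch` and `wrapOriginSpanningSketch` are hypothetical names, and the sketch assumes locations are stored 0-based and inclusive (as `{start: 18, end: 0}` for `join(19..20,1)` suggests). It only handles plain `a` and `a..b` parts and ignores strand and `^` locations.

```typescript
// Hypothetical location shape for illustration (0-based, inclusive).
interface LocationSketch {
  start: number;
  end: number;
}

// Parse a location string such as "join(1,3..4)".
// Each comma-separated part is handled on its own, so a single position
// like "1" is never paired with the next integer as its end.
function parseLocationPartsSketch(locationString: string): LocationSketch[] {
  // Keep only digits, dots and commas: "join(1,3..4)" becomes "1,3..4".
  const inner = locationString.replace(/[^0-9.,]/g, "");
  return inner
    .split(",")
    .filter((part) => part.length > 0)
    .map((part) => {
      // A part without ".." is a single position: its end equals its start.
      const [startStr, endStr = startStr] = part.split("..");
      return { start: Number(startStr) - 1, end: Number(endStr) - 1 };
    });
}

// Merge consecutive locations where the first ends on the last base of the
// sequence and the second starts on the first base (an origin-spanning join).
function wrapOriginSpanningSketch(
  locations: LocationSketch[],
  sequenceLength: number
): LocationSketch[] {
  const merged: LocationSketch[] = [];
  for (const loc of locations) {
    const previous = merged[merged.length - 1];
    if (
      previous !== undefined &&
      previous.end === sequenceLength - 1 &&
      loc.start === 0
    ) {
      // Replace the previous entry with one location that wraps the origin.
      merged[merged.length - 1] = { start: previous.start, end: loc.end };
    } else {
      merged.push({ ...loc });
    }
  }
  return merged;
}

// join(1,3..4) stays as two locations: [{start: 0, end: 0}, {start: 2, end: 3}]
const composedExample = parseLocationPartsSketch("join(1,3..4)");

// join(19..20,1) in a circular sequence of length 20 becomes [{start: 18, end: 0}]
const wrappedExample = wrapOriginSpanningSketch(
  parseLocationPartsSketch("join(19..20,1)"),
  20
);
```

Keeping the wrap as a separate step that takes the sequence length mirrors the point above: the parse step can run before the sequence is known, and the merge can run later once the length is available.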