Closed manulera closed 10 months ago
Hi @manulera sorry for the delayed response!
Just to make sure I understand, you're asking to have the following GenBank feature, misc_feature join(14,1..2), come in as a single feature with {start: 14, end: 2} instead of as a single feature with 2 distinct locations of {start: 14, end: 14} and {start: 1, end: 2}. Is that right?
Another question I have - is that how all join()'s work if they aren't separated by at least one base pair?
Thanks!
Just to make sure I understand, you're asking to have the following GenBank feature, misc_feature join(14,1..2), come in as a single feature with {start: 14, end: 2} instead of as a single feature with 2 distinct locations of {start: 14, end: 14} and {start: 1, end: 2}. Is that right?
Yes, that's it.
is that how all join()'s work if they aren't separated by at least one base pair?
I am not sure; the origin wrap is the only use case I can think of that would create such a join location. I tried a feature join(10..11,12..14) in Benchling and SnapGene to see what they do:
In summary, I think it should only merge the fragments of the join when the join happens exactly at the origin (last base / first base).
As I said, I am happy to have a go at this one, if you give me some guidance.
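The rule proposed above (merge join fragments only when they meet exactly at the origin) could be sketched roughly as follows. This is a minimal illustration in Python, not OVE's actual parser code: the function name `merge_origin_join`, the tuple representation, and the assumption that the example sequence is 14 bases long are all hypothetical.

```python
from typing import List, Tuple

# (start, end), 1-based and inclusive, as written in a GenBank location.
Location = Tuple[int, int]


def merge_origin_join(locations: List[Location], seq_length: int) -> List[Location]:
    """Merge consecutive join parts only when they meet at the origin.

    Two parts are merged when the earlier one ends on the last base of the
    sequence and the later one starts on base 1. Any other adjacency, such
    as join(10..11,12..14), is left as separate locations.
    """
    merged: List[Location] = []
    for start, end in locations:
        if merged and merged[-1][1] == seq_length and start == 1:
            # Hypothetical rule: fold the two fragments into one wrapping location.
            prev_start, _ = merged.pop()
            merged.append((prev_start, end))
        else:
            merged.append((start, end))
    return merged


# join(14,1..2) on a sequence assumed to be 14 bases long
print(merge_origin_join([(14, 14), (1, 2)], seq_length=14))
# join(10..11,12..14) on the same sequence
print(merge_origin_join([(10, 11), (12, 14)], seq_length=14))
```

Under these assumptions the first call yields a single location from 14 to 2, while the second keeps its two parts, since they touch but do not cross the origin.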
Hi @tnrich just following up on this. Would you still be happy to accept a contribution on this?
Hi @manulera yep still happy to accept a PR on this one. I think this would be in the parse feature location code or nearabouts.
Hello @tnrich
Happy to have a go at this one myself if you agree, and if you give me some guidelines on where to start.
Apparently, for origin-spanning features in circular DNA, the syntax from NCBI is as follows (this is from an NCBI genome):
Basically, they use complement(join(490883..490885,1..879)) instead of complement(490883..879), which is what you would get if you created this feature in OVE, and what you often get from SnapGene files and files from Addgene. Biopython, the Python library, adheres to the NCBI requirements, see this issue.
Maybe the OVE library should interpret the file below as if it was 14..2 (check if join features are consecutive). That's what you get when you open the file in either SnapGene or Benchling. Let me know what you think.
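To make the two notations concrete, here is a small sketch of how the location strings mentioned above break down into parts. It is written in plain Python with a hypothetical helper, `parse_simple_location`, and only handles the simple forms seen in this thread (a single base, a range, `join(...)`, and an outer `complement(...)`); it is not how OVE or Biopython actually parse locations.

```python
import re
from typing import List, Tuple


def parse_simple_location(text: str) -> Tuple[List[Tuple[int, int]], bool]:
    """Return ([(start, end), ...], is_complement) for a simple location string.

    Coordinates are kept 1-based and inclusive, exactly as written in the file.
    """
    text = text.strip()
    is_complement = False

    # Strip an outer complement(...) wrapper if present.
    match = re.fullmatch(r"complement\((.*)\)", text)
    if match:
        is_complement = True
        text = match.group(1)

    # Strip an outer join(...) wrapper if present.
    match = re.fullmatch(r"join\((.*)\)", text)
    if match:
        text = match.group(1)

    parts: List[Tuple[int, int]] = []
    for piece in text.split(","):
        # Each piece is either "N" (single base) or "N..M" (range).
        match = re.fullmatch(r"\s*(\d+)(?:\.\.(\d+))?\s*", piece)
        if match is None:
            raise ValueError(f"Unsupported location piece: {piece!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        parts.append((start, end))
    return parts, is_complement


print(parse_simple_location("join(14,1..2)"))
print(parse_simple_location("complement(join(490883..490885,1..879))"))
print(parse_simple_location("complement(490883..879)"))
```

The NCBI-style string produces two parts and the SnapGene-style string produces one, so a check for consecutive parts would run on the first kind of result.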