SynBioDex / SBOL-utilities

Command-line utilities and scripts for manipulating SBOL data
MIT License

Ensure all GenBank features are in term translation list #205

Open jakebeal opened 1 year ago

jakebeal commented 1 year ago

The official list of supported GenBank features may be found at: https://www.insdc.org/submitting-standards/feature-table/#7.2

Most, but not all, are in the gb2so.csv and so2gb.csv translation maps with equivalent Sequence Ontology terms; make sure that all are there.
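One way to spot the gaps is a simple set difference between the official feature keys and the keys already in the map. The snippet below is only a sketch: the function name `missing_features`, the header names, and the assumption that the first CSV column holds the GenBank feature key are all hypothetical and should be checked against the real gb2so.csv layout. The sample rows are illustrative, not a copy of the actual file.

```python
import csv
import io


def missing_features(official_keys, csv_text):
    """Return the official GenBank feature keys that have no row in the map.

    Assumes (hypothetically) a header row and the feature key in column 0.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the assumed header row
    mapped = {row[0].strip() for row in reader if row}
    return sorted(set(official_keys) - mapped)


# Tiny illustrative inputs; the real list would come from the INSDC feature table.
official = ["CDS", "promoter", "old_sequence", "terminator"]
sample_map = (
    "genbank_feature,so_term\n"
    "CDS,SO:0000316\n"
    "promoter,SO:0000167\n"
    "terminator,SO:0000141\n"
)

print(missing_features(official, sample_map))
```

With the sample data above, only `old_sequence` is reported as missing. Running the same check against so2gb.csv would need the column index adjusted to wherever the GenBank key sits in that file.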

arnav-luhadiya commented 9 months ago

Hi @jakebeal. What if no equivalent Sequence Ontology term is defined for a feature? Should I leave the feature out entirely, or include it and leave the SO term blank?

jakebeal commented 9 months ago

We want to make sure that every GenBank feature has a corresponding SO term.

Given that SO has a much richer vocabulary, I suspect that every feature should map to a term. If there are features that are ambiguous or don't appear to have a corresponding term, however, they should be discussed on this issue. If necessary, we can also put in placeholders and ask for new SO terms.

arnav-luhadiya commented 9 months ago

I could not find a corresponding term for old_sequence, for example. I checked the following links: Sequence Ontology Search and Sequence Ontology Mapping.

Attaching a relevant screenshot from the Sequence Ontology Mapping link:

image

jakebeal commented 9 months ago

Oh, that one's interesting... I've never actually encountered it on a sequence myself, so I was not familiar with it. And the mapping link that you found is excellent: I was not familiar with that before.

So given this, here's my thought on how to proceed:

  1. For anything that has a mapping in the Sequence Ontology Mapping table, make sure we've got it right.
  2. Anything in grey on the mapping table that we don't already have a term for goes into the generic Sequence Feature (SO:0000110).
  3. Anything in the GenBank feature table that doesn't have a mapping and isn't on the Sequence Ontology Mapping table probably needs to be discussed here.
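
The three rules above can be sketched as a small triage function. This is a minimal illustration, not the project's code: `triage`, its arguments, and the action labels are hypothetical names, and `grey_example` and `unlisted_example` are made-up feature keys standing in for real entries.

```python
GENERIC_SEQUENCE_FEATURE = "SO:0000110"  # generic Sequence Feature, per rule 2


def triage(feature, official_mapping, grey_features, current_map):
    """Return (action, so_term) for one GenBank feature key.

    official_mapping: feature -> SO term from the Sequence Ontology Mapping table
    grey_features:    features shown in grey on that table (no mapped term)
    current_map:      feature -> SO term as currently stored in our CSV map
    """
    # Rule 1: a published mapping exists, so check that ours agrees with it.
    if feature in official_mapping:
        expected = official_mapping[feature]
        if current_map.get(feature) == expected:
            return ("ok", expected)
        return ("fix", expected)
    # Rule 2: grey on the table; keep any term we already have, else go generic.
    if feature in grey_features:
        if feature in current_map:
            return ("keep", current_map[feature])
        return ("generic", GENERIC_SEQUENCE_FEATURE)
    # Rule 3: not on the mapping table at all, so raise it on the issue.
    return ("discuss", None)


# Made-up sample data for illustration only.
official_mapping = {"CDS": "SO:0000316"}
grey_features = {"grey_example"}
current_map = {"CDS": "SO:0000316"}

for key in ["CDS", "grey_example", "unlisted_example"]:
    print(key, triage(key, official_mapping, grey_features, current_map))
```

Returning an action label rather than writing to the CSV directly keeps the "discuss" cases visible, so they can be listed on this issue before anything is committed.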