Open cjmyers opened 3 years ago
Note that these are not so much official as a community-sourced list of things that people often find in GenBank. Copying in from the spreadsheet:
GenBank | SO Term | SBOL Visual
---|---|---
allele | SO:0001023 |
attenuator | SO:0000140 |
C_region | SO:0001834 |
CAAT_signal | SO:0000172 |
CDS | SO:0000316 | cds
D-loop | SO:0000297 | origin-of-replication
D_segment | SO:0000458 |
enhancer | SO:0000165 |
exon | SO:0000147 | can be shown as CDS minus intron
gene | SO:0000704 |
GC_signal | SO:0000173 |
iDNA | SO:0000723 |
intron | SO:0000188 | intron
J_region | SO:0000470 |
LTR | SO:0000286 |
mat_peptide | SO:0000419 |
misc_binding | SO:0000409 | operator
misc_difference | SO:0000413 |
misc_feature | SO:0000001 |
misc_marker | SO:0001645 |
misc_recomb | SO:0000298 |
misc_RNA | SO:0000233 |
misc_signal | SO:0001411 |
misc_structure | SO:0000002 |
modified_base | SO:0000305 |
mRNA | SO:0000234 |
N_region | SO:0001835 |
polyA_signal | SO:0000551 |
polyA_site | SO:0000553 | poly-a
precursor_RNA | SO:0000185 |
prim_transcript | SO:0000185 |
primer | SO:0000112 |
primer_bind | SO:0005850 | primer-binding-site
promoter | SO:0000167 | promoter
protein_bind | SO:0000410 | operator
RBS | SO:0000139 | rbs
rep_origin | SO:0000296 | origin-of-replication
repeat_region | SO:0000657 |
repeat_unit | SO:0000726 |
rRNA | SO:0000252 |
S_region | SO:0001836 |
satellite | SO:0000005 |
scRNA | SO:0000013 |
sig_peptide | SO:0000418 |
snRNA | SO:0000274 |
source | SO:0000149 |
stem_loop | SO:0000313 |
STS | SO:0000331 |
TATA_signal | SO:0000174 |
terminator | SO:0000141 | terminator
transit_peptide | SO:0000725 |
transposon | SO:0001054 |
tRNA | SO:0000253 |
V_region | SO:0001833 |
variation | SO:0001060 |
-10_signal | SO:0000175 |
-35_signal | SO:0000176 |
3'clip | SO:0000557 |
3'UTR | SO:0000205 |
5'clip | SO:0000555 |
5'UTR | SO:0000204 |
regulatory | SO:0005836 |
snoRNA | SO:0000275 |
none of above | SO:0000110 | unspecified

GenBank | SO Term | Synonym | Synonym SO Term
---|---|---|---
assembly_gap | NA | gap | SO:0000730
centromere | SO:0000577 | |
gap | SO:0000730 | |
J_segment | | J_gene_segment | SO:0000470
mobile_element | NA | mobile_genetic_element | SO:0001037
ncRNA | SO:0000655 | |
old_sequence | NA | |
operon | SO:0000178 | |
oriT | SO:0000724 | |
propeptide | SO:0001062 | |
telomere | SO:0000624 | |
tmRNA | SO:0000584 | |
unsure | | sequence_uncertainty | SO:0001086
V_segment | | V_gene_segment | SO:0000466

(edited to add missing information)
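
For anyone wanting to use the table programmatically, here is a minimal sketch of a lookup with a fallback. It only copies a handful of rows from the table; the names `GENBANK_TO_SO`, `FALLBACK`, and `lookup` are hypothetical, and the choice to fall back to the "unspecified" glyph when a row has an SO term but no glyph is an assumption, not something the spreadsheet states.

```python
# Illustrative subset of the table; values are copied from its rows.
# None means the row has an SO term but no SBOL Visual glyph listed.
GENBANK_TO_SO = {
    "CDS": ("SO:0000316", "cds"),
    "promoter": ("SO:0000167", "promoter"),
    "RBS": ("SO:0000139", "rbs"),
    "terminator": ("SO:0000141", "terminator"),
    "rep_origin": ("SO:0000296", "origin-of-replication"),
    "protein_bind": ("SO:0000410", "operator"),
    "rRNA": ("SO:0000252", None),
}

# The "none of above" row of the table.
FALLBACK = ("SO:0000110", "unspecified")


def lookup(feature_key):
    """Return (so_term, glyph) for a GenBank feature key.

    Assumption: unknown keys map to the "none of above" row, and known
    keys without a glyph reuse the "unspecified" glyph.
    """
    so_term, glyph = GENBANK_TO_SO.get(feature_key, FALLBACK)
    return so_term, glyph or FALLBACK[1]


print(lookup("CDS"))
print(lookup("rRNA"))
print(lookup("not_a_real_key"))
```

Keeping the SO term even when no glyph exists means a diagram can still be annotated correctly if a glyph is assigned later.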
Looking at these, I'm not sure how many we really need to fill in until somebody actually wants them. How often does a synbio system work with rRNA, for example? A better solution might be to have an alternative to the "no glyph assigned" bracket that looks nicer on diagrams.
I think several of these are already represented at some level; for those, the question is whether to make a specialty glyph or to formalize repurposing an existing one for the SO terms.
- attenuator - terminator (conditionally repressed based on the translation rate of the upstream leader peptide; never used in synbio)
- oriT - origin of transfer glyph
- polyA_signal - no different from polyA site?
- transit peptide / signal peptide - protein location? This superimposition of protein stem glyphs with CDS/domain glyphs raises an old, unresolved question from issue 78.
- J_segment and V_segment - exons specific to the V(D)J recombination locus, and we have exon glyphs.

These are special kinds of protein-binding sites, the latter 4 inside promoters, which I think can just be labeled inside the protein-binding site glyph, especially the last three.
These RNA terms can use an ncRNA gene or squiggly RNA backbone, depending on whether the DNA gene or RNA itself is being referred to. We don't have different CDS glyphs for different types of proteins. No need to have different ncRNA glyphs for different ncRNAs, like rRNAs or tRNAs, I think.
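
A rough sketch of that rule, in case it helps the discussion: every non-coding RNA feature key collapses to one glyph, and the only thing that changes the glyph is whether the DNA gene or the RNA itself is meant. The set of keys is an assumption (only rRNA and tRNA are named above), and the glyph identifiers `ncrna-gene` and `rna-backbone` are hypothetical placeholders, not official SBOL Visual names.

```python
# Assumed set of non-coding RNA feature keys taken from the table;
# only rRNA and tRNA are explicitly mentioned in the comment.
NCRNA_KEYS = {"ncRNA", "rRNA", "tRNA", "snRNA", "snoRNA", "scRNA", "tmRNA"}


def rna_glyph(feature_key, molecule="DNA"):
    """Pick a glyph for a non-coding RNA feature.

    Hypothetical glyph names: "ncrna-gene" when the DNA gene is meant,
    "rna-backbone" when the RNA molecule itself is meant.
    Returns None for keys outside the assumed ncRNA set.
    """
    if feature_key not in NCRNA_KEYS:
        return None
    return "ncrna-gene" if molecule == "DNA" else "rna-backbone"


print(rna_glyph("rRNA"))
print(rna_glyph("tRNA", molecule="RNA"))
```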
And these terms bring up unresolved issue #113:
Good point. We likely should have more alternative SO terms for some of our glyphs based upon this.
https://docs.google.com/spreadsheets/d/1X870i3NhO7xEhqhLXK4eravNd72x-O-xbrpmlT835nY/edit?usp=sharing
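
To see which glyphs already carry more than one SO term, the table can be inverted so that each glyph lists every SO term mapped to it. This is only a sketch over a few rows copied from the table; `ROWS` and `so_terms_by_glyph` are hypothetical names.

```python
from collections import defaultdict

# (GenBank key, SO term, SBOL Visual glyph) rows copied from the table.
ROWS = [
    ("misc_binding", "SO:0000409", "operator"),
    ("protein_bind", "SO:0000410", "operator"),
    ("D-loop", "SO:0000297", "origin-of-replication"),
    ("rep_origin", "SO:0000296", "origin-of-replication"),
    ("promoter", "SO:0000167", "promoter"),
]


def so_terms_by_glyph(rows):
    """Group SO terms by glyph, preserving row order and skipping duplicates."""
    grouped = defaultdict(list)
    for _key, so_term, glyph in rows:
        if so_term not in grouped[glyph]:
            grouped[glyph].append(so_term)
    return dict(grouped)


for glyph, terms in so_terms_by_glyph(ROWS).items():
    print(glyph, terms)
```

Glyphs with more than one entry in the result are the ones where alternative SO terms would need to be recorded.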