SBRG / bigg_models

The BiGG Models website server
http://bigg.ucsd.edu

Unclear entries in data_source #50

Closed draeger closed 9 years ago

draeger commented 9 years ago

The following resources from the data_source table are difficult to map to MIRIAM resources:

| name | Problem |
| --- | --- |
| EnsemblGenomes | There are multiple databases: Bacteria, Fungi, Metazoa, Plants, Protists. I could look up which one to use by analysing the lineage of the organism based on its NCBI taxon id. |
| GI | Unclear which database has this id. |
| IMGT/GENE-DB | This could refer to IMGT HLA or IMGT LIGM. |
| MIM | This could refer to one, or none, of the databases ABS, MimoDB, OMIM, or Orphanet Rare Disease Ontology. |
| PSEUDO | I assume this refers to the Pseudomonas Genome Database, but it could also be Pathema (WARNING: deprecated!) or UniGene (WARNING: low up-time!). |
| UniProtKB/Swiss-Prot | This could be UniProt Isoform or UniProt Knowledgebase. There are also other possible databases, but I assume these are the main ones (see http://www.ebi.ac.uk/miriam/main/search?query=UniProtKB). |
| UniProtKB/TrEMBL | Same as for the UniProtKB/Swiss-Prot case. |
| old id | What is this? An older BiGG release? |

zakandrewking commented 9 years ago

@draeger says: "Only for Ensembl does there seem to be a solution: Nick says that if we do not know the particular group (such as fungi, bacteria, etc.), we can try the general Ensembl entry: http://www.ebi.ac.uk/miriam/cura/collections/MIR:00000003 (http://identifiers.org/ensembl/). However, it might be difficult to figure out which group a given model belongs to. We do have the taxon id, and I could query the lineage of the taxon to find out. Would there be an easier way? Or should we just use the generic Ensembl catalog (which might not cover everything)?
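
The lineage-based lookup described above could be sketched roughly as follows. This is only an illustration, not code from bigg_models: the function name, the rule table, and the namespace strings (`ensembl.bacteria`, `ensembl.fungi`, and so on) are assumptions that should be checked against the MIRIAM registry, and the lineage is assumed to be a list of NCBI taxonomy names already fetched for the model's taxon id.

```python
# Hypothetical rule table: (NCBI lineage name, assumed identifiers.org namespace).
# Order matters: more specific clades must come before the clades containing them.
DIVISION_RULES = [
    ("Vertebrata", "ensembl"),          # vertebrates live in the main Ensembl catalog
    ("Metazoa", "ensembl.metazoa"),
    ("Fungi", "ensembl.fungi"),
    ("Viridiplantae", "ensembl.plant"),
    ("Bacteria", "ensembl.bacteria"),
    ("Archaea", "ensembl.bacteria"),    # assumption: archaea are grouped with bacteria
    ("Eukaryota", "ensembl.protist"),   # any remaining eukaryote is treated as a protist
]


def ensembl_namespace(lineage, default="ensembl"):
    """Pick a namespace from a list of lineage names; fall back to the generic one."""
    names = set(lineage)
    for taxon, namespace in DIVISION_RULES:
        if taxon in names:
            return namespace
    return default


# Example lineages (abbreviated) for E. coli and S. cerevisiae.
print(ensembl_namespace(["cellular organisms", "Bacteria", "Enterobacterales"]))
print(ensembl_namespace(["cellular organisms", "Eukaryota", "Opisthokonta", "Fungi"]))
```

An unknown or empty lineage falls back to the generic `ensembl` entry, which matches the fallback suggested in the quote.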

IMGT/GENE-DB is not possible to link to because, as far as I can see, it uses multiple identifiers, which are organism-specific, e.g. http://www.imgt.org/genedb/GENElect?query=6.1+IGHV1-2&species=Homo+sapiens&IMGTlabel=FR1-IMGT (6.1, IGHV1-2, Homo sapiens and FR1-IMGT would all have to be parameters). There is more documentation here: http://www.imgt.org/genedb/doc#directlinksgene
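
To make the problem concrete, here is a small sketch of how such a link would have to be assembled. The helper name and its argument names are hypothetical; only the base URL and the three query keys (`query`, `species`, `IMGTlabel`) are taken from the example link above.

```python
from urllib.parse import urlencode

# Base URL taken from the example link in this thread.
IMGT_GENELECT = "http://www.imgt.org/genedb/GENElect"


def imgt_gene_url(query_type, gene, species, label):
    """Build a GENElect link; every argument is required, so a single id is not enough."""
    params = {
        "query": f"{query_type} {gene}",  # e.g. "6.1 IGHV1-2"
        "species": species,
        "IMGTlabel": label,
    }
    # urlencode encodes spaces as '+', as in the example link.
    return f"{IMGT_GENELECT}?{urlencode(params)}"


print(imgt_gene_url("6.1", "IGHV1-2", "Homo sapiens", "FR1-IMGT"))
```

Since four independent values are needed, a single `{id}` placeholder in a URL pattern cannot represent this resource.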

We need to provide more information about what the remaining resources are:

zakandrewking commented 9 years ago

SEED should be accessible here: http://seed-viewer.theseed.org/

And SEED compounds can be accessed with this url: http://seed-viewer.theseed.org/seedviewer.cgi?page=CompoundViewer&compound=[SEED id]
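
As a rough sketch, the compound link can be generated from a SEED id like this. The function name is hypothetical and `cpd00001` is only an illustrative id; the base URL and the `page`/`compound` parameters come from the URL above. The base is left as a parameter so that a different viewer host can be swapped in.

```python
from urllib.parse import urlencode

# Default viewer taken from the URL in this comment.
SEED_VIEWER = "http://seed-viewer.theseed.org/seedviewer.cgi"


def seed_compound_url(seed_id, base=SEED_VIEWER):
    """Return the CompoundViewer link for one SEED compound id."""
    return f"{base}?{urlencode({'page': 'CompoundViewer', 'compound': seed_id})}"


print(seed_compound_url("cpd00001"))  # illustrative id, not taken from the thread
```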

UPA maps to the OBIWarehouse: http://www.grenoble.prabi.fr/obiwarehouse

I'm not sure about the others (MIM, PSEUDO). I'll take a look when I get a chance.

zakandrewking commented 9 years ago

See #109 for discussion of GeneID and GI.

pillmill commented 9 years ago

SEED seems much faster here: http://rast.nmpdr.org/seedviewer.cgi