Database recommendations based on OMEX metadata specification

sys-bio / vscode-antimony

Extensions for Antimony development in Visual Studio Code.

MIT License

~~Currently, creating annotations relies on a third-party package for retrieving annotation information from several sources. There are two problems with this approach:~~

The source of the annotation is limited by the third-party package, and

The performance is not good.

One idea is to move all of the data into our own cloud-based database and query it from there. Since this will require funding, we need data to back our claim:

How much faster will the process be if we host our own DB?

How large is the information? If the size is small, maybe we can include it in the extension package and read it from there?

What if we create indexes?

Copyright issues?

See also https://github.com/sys-bio/vscode-antimony/issues/48
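One rough way to get numbers for the questions above (lookup speed, the effect of an index, and whether a bundled local database is viable) is a small local experiment. The sketch below uses SQLite from the Python standard library as a stand-in, which is an assumption and not the cloud database discussed in this issue. The rows are synthetic placeholders rather than real ChEBI data, and the function names are hypothetical.

```python
import sqlite3
import time


def build_db(rows, with_index):
    """Create an in-memory SQLite table of (name, id) pairs, optionally indexed on name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chebis (name TEXT, id TEXT)")
    conn.executemany("INSERT INTO chebis VALUES (?, ?)", rows)
    if with_index:
        conn.execute("CREATE INDEX index1 ON chebis (name)")
    conn.commit()
    return conn


def time_lookups(conn, names):
    """Return (elapsed seconds, ids found) for a series of exact-name lookups."""
    start = time.perf_counter()
    found = []
    for name in names:
        row = conn.execute(
            "SELECT id FROM chebis WHERE name = ?", (name,)
        ).fetchone()
        found.append(row[0] if row else None)
    return time.perf_counter() - start, found


# Synthetic placeholder rows, NOT real ChEBI entries.
rows = [("compound-{}".format(i), "CHEBI:{}".format(i)) for i in range(50000)]
queries = ["compound-{}".format(i) for i in range(0, 50000, 500)]

plain_time, plain_ids = time_lookups(build_db(rows, with_index=False), queries)
indexed_time, indexed_ids = time_lookups(build_db(rows, with_index=True), queries)

print("no index:   {:.4f} s".format(plain_time))
print("with index: {:.4f} s".format(indexed_time))
```

Running the same measurement against the real exported data, and comparing it with the current third-party lookup, would give figures to support or reject the hosting proposal. Writing the table to a file instead of `:memory:` would also show how large a bundled database would be.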

    BULK INSERT chebis
    FROM 'chebi.csv'
    WITH (
        ROWTERMINATOR = '0x0a',
        FIELDTERMINATOR = '~QwQ~',
        FIELDQUOTE = '!',
        DATA_SOURCE = 'blob2',
        FORMAT = 'CSV',
        CODEPAGE = 65001, -- UTF-8 encoding
        FIRSTROW = 1,
        TABLOCK
    );

    CREATE INDEX index1 ON [dbo].[chebis] (name);

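    import xmltodict

    # Parse the ChEBI RDF/XML export into nested dictionaries.
    with open("chebi.xml", encoding="utf-8") as source:
        ch = xmltodict.parse(source.read())

    chebis = ch["rdf:RDF"]["owl:Class"]

    # Write one line per class in the form !name!~QwQ~!id!
    # ('!' is the field quote and '~QwQ~' the field terminator used by the bulk insert).
    with open("chebi.csv", "w", encoding="utf-8", newline="\n") as out:
        for chebi in chebis:
            if "rdfs:label" in chebi and "oboI:id" in chebi:
                text = chebi["rdfs:label"]["#text"]
                chebi_id = chebi["oboI:id"]["#text"]
                out.write("!{}!~QwQ~!{}!\n".format(text, chebi_id))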
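To sanity-check the exported file before bulk loading, the line format written by the script above (`!name!~QwQ~!id!`) can be read back with a few lines of Python. This is a minimal sketch: `parse_line` and `unquote` are hypothetical helpers, and the sample values are only illustrative.

```python
FIELD_SEP = "~QwQ~"  # field terminator used by the export script
QUOTE = "!"          # field quote character used by the export script


def unquote(field):
    """Strip the surrounding '!' quote characters from one field."""
    if len(field) >= 2 and field.startswith(QUOTE) and field.endswith(QUOTE):
        return field[1:-1]
    raise ValueError("field is not quoted: {!r}".format(field))


def parse_line(line):
    """Split one exported line into a (name, chebi_id) tuple."""
    fields = line.rstrip("\n").split(FIELD_SEP)
    if len(fields) != 2:
        raise ValueError("expected 2 fields, got {}".format(len(fields)))
    name, chebi_id = (unquote(field) for field in fields)
    return name, chebi_id


# Illustrative sample line built the same way the export script builds it.
sample = "!{}!~QwQ~!{}!\n".format("water", "CHEBI:15377")
print(parse_line(sample))
```

A label that itself contains `~QwQ~` would make this check fail, which is one reason to run it over the whole file before loading.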

Hi Steve,

Here's a link to the OMEX metadata specification that I mentioned today: https://doi.org/10.1515/jib-2021-0020

If you jump to section 2.4.2 (Resources to use for composite semantic annotations), you'll find a list of recommended ontologies and databases.

Working from that list, I would recommend you support searches for the following SBML components like so:

For SBML compartments, support searching: Gene Ontology:cellular component; Cell Type Ontology; Foundational Model of Anatomy; Mouse Adult Gross Anatomy; Ontology for Biomedical Investigations.

For SBML species, support searching: ChEBI; Protein Ontology; UniProt.

For SBML reactions, support searching: GO:biological process; RHEA.

This recommendation is based on what's in the OMEX metadata specification, which was created with a particular user community in mind. Your target users might have other ontologies or databases that they want to be able to search. For example, some people like to use KEGG, but it's proprietary and so isn't included in the recommended resources in the OMEX metadata spec. Just something to keep in mind.

Hope this is helpful.

M
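The per-component recommendations in the message above could be captured as a simple lookup table that the extension consults when deciding which resources to search. The sketch below is one possible shape, not the extension's actual code. The dictionary keys and the `resources_for` function are hypothetical names, and the resource names are copied from the message.

```python
# Recommended search resources per SBML component type, as listed in the
# message above. Keys and function name are hypothetical.
RECOMMENDED_RESOURCES = {
    "compartment": [
        "Gene Ontology:cellular component",
        "Cell Type Ontology",
        "Foundational Model of Anatomy",
        "Mouse Adult Gross Anatomy",
        "Ontology for Biomedical Investigations",
    ],
    "species": ["ChEBI", "Protein Ontology", "UniProt"],
    "reaction": ["GO:biological process", "RHEA"],
}


def resources_for(component_type):
    """Return a copy of the recommended resources for an SBML component type."""
    key = component_type.strip().lower()
    if key not in RECOMMENDED_RESOURCES:
        raise KeyError("no recommendations for {!r}".format(component_type))
    return list(RECOMMENDED_RESOURCES[key])


print(resources_for("species"))
```

Keeping the table as plain data makes it easy to add resources that other user communities ask for without changing the search logic.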

Database recommendations based on OMEX metadata specification #27