ga4gh / ga4gh-schemas

Models and APIs for Genomic data. RETIRED 2018-01-24
http://ga4gh.org
Apache License 2.0

Ontology Source Identification #695

Open david4096 opened 8 years ago

david4096 commented 8 years ago

In https://github.com/ga4gh/schemas/issues/621 @mcourtot mentions

With respect to ontology source identification: at the moment, there is "Ontology source name - the name of ontology from which the term is obtained, e.g. 'Human Phenotype Ontology'". I think this may give rise to multiple values for the same resource, e.g. 'Human Phenotype Ontology', 'HP', 'HPO', which is IMO not desirable. A solution is to use a registry such as the ones @mellybelly mentions above. It also has the advantage that we can add pretty much whichever information we want to this registry: e.g. if the CURIE prefix expansion changes, we can update it seamlessly (say we have been using http://purl.bioontology.org/ontology/SNOMEDCT/{sctid} but then decide to move to the official SNOMED URIs as per http://doc.ihtsdo.org/download/doc_UriStandard_Current-en-US_INT_20140527.pdf, so the expansion should be updated to http://snomed.info/sct/{sctid}). We could then have: "Ontology source name - the name of the ontology from which the term is obtained, e.g. 'HP', as taken from the GA4GH resources registry".

This issue is meant to capture further discussion on Ontology Term representations, specifically how to control the vocabulary of source_name. @mbaudis @cmungall @mellybelly
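To make the quoted proposal concrete, here is a minimal sketch of a registry that maps the competing source names to one canonical prefix and expands CURIEs from a URI template that can be swapped later. The class names, the `{id}` placeholder, and the registry contents are all hypothetical; the HP template is the OBO PURL pattern and is included only as an illustration, while the two SNOMED templates are the ones quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    """One ontology source: canonical prefix, URI template, known aliases."""
    prefix: str
    uri_template: str  # must contain "{id}" (hypothetical placeholder name)
    aliases: set = field(default_factory=set)


class SourceRegistry:
    """Hypothetical in-memory registry; not an existing GA4GH component."""

    def __init__(self):
        self._entries = {}
        self._alias_index = {}

    def register(self, entry):
        self._entries[entry.prefix] = entry
        # Index the prefix and every alias case-insensitively.
        for name in {entry.prefix, *entry.aliases}:
            self._alias_index[name.lower()] = entry.prefix

    def canonical_prefix(self, source_name):
        """Map any known spelling of a source name to its canonical prefix."""
        key = source_name.strip().lower()
        if key not in self._alias_index:
            raise KeyError(f"unknown ontology source: {source_name!r}")
        return self._alias_index[key]

    def expand(self, curie):
        """Expand 'PREFIX:local_id' into a full URI."""
        prefix, sep, local_id = curie.partition(":")
        if not sep or not local_id:
            raise ValueError(f"not a CURIE: {curie!r}")
        entry = self._entries[self.canonical_prefix(prefix)]
        return entry.uri_template.format(id=local_id)

    def update_template(self, prefix, uri_template):
        """Change the expansion in one place; stored CURIEs stay untouched."""
        self._entries[prefix].uri_template = uri_template


registry = SourceRegistry()
registry.register(RegistryEntry(
    "HP",
    "http://purl.obolibrary.org/obo/HP_{id}",
    {"HPO", "Human Phenotype Ontology"},
))
registry.register(RegistryEntry(
    "SNOMEDCT",
    "http://purl.bioontology.org/ontology/SNOMEDCT/{id}",
))

before = registry.expand("SNOMEDCT:73211009")
registry.update_template("SNOMEDCT", "http://snomed.info/sct/{id}")
after = registry.expand("SNOMEDCT:73211009")
```

With this shape, records only ever store the canonical prefix (e.g. 'HP'), so 'HPO' and 'Human Phenotype Ontology' collapse to one value, and a change of expansion is a single registry update.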

mbaudis commented 8 years ago

I obviously support a mechanism for correct ontology source identification. However, it should be a general mechanism that is not bound to a specific registry (e.g. local implementations may have their own instances, and any given registry or service should be considered transient in the long run).

Suggestions, please?

mcourtot commented 8 years ago

We discussed this with @helenp, @cmungall and @simonjupp and would like to suggest relying on OLS services. OLS will keep a registry in sync with prefixcommons and we can leverage this without, as @mbaudis recommended, having to maintain a specific registry.

(note: this would also help address #165)
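The requirement raised earlier in the thread, that prefix lookup not be bound to one registry, could be sketched as a small interface with interchangeable backends. Everything named here (`PrefixSource`, `DictSource`, `ChainedSource`, the `LOCAL` prefix) is hypothetical, and the "shared" source is a plain dictionary standing in for a synced registry; no real OLS or prefixcommons call is made.

```python
from typing import Dict, Iterable, Optional, Protocol


class PrefixSource(Protocol):
    """Anything that can resolve a prefix to a URI template."""

    def lookup(self, prefix: str) -> Optional[str]:
        ...


class DictSource:
    """Backend holding prefix -> template pairs in memory."""

    def __init__(self, templates: Dict[str, str]):
        self._templates = dict(templates)

    def lookup(self, prefix: str) -> Optional[str]:
        return self._templates.get(prefix)


class ChainedSource:
    """Tries each backend in order and returns the first hit."""

    def __init__(self, sources: Iterable[PrefixSource]):
        self._sources = list(sources)

    def lookup(self, prefix: str) -> Optional[str]:
        for source in self._sources:
            template = source.lookup(prefix)
            if template is not None:
                return template
        return None


# A local instance is consulted first; the shared one is a fallback.
local = DictSource({"LOCAL": "http://example.org/local/{id}"})
shared = DictSource({"HP": "http://purl.obolibrary.org/obo/HP_{id}"})
resolver = ChainedSource([local, shared])
```

Because callers depend only on `lookup`, a backend that wraps a remote service could replace `shared` later without touching code that resolves prefixes.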