Closed jamesaoverton closed 4 years ago
Thanks @jamesaoverton I'll have a deeper look towards end of next week. At first glance, I would remove the non-entity roots under classes - maybe need to make sure there is an AP on the CINECA terms to indicate provenance.
I'm wondering what the right balance is. If we want to provide a minimal model, this seems too detailed to me. On the other hand, the CINECA-only file is too sparse. Is there an easy way for me to go through, indicate the lowest-level term to be kept in each branch, and then have you slim that down?
There also seem to be some wrong assertions, e.g. under blood, which is defined as a sample type, but then all subclasses are measurement values. Is it best for me to try and clean that up in OWL directly, or what would be easier on your side?
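To make the slimming request above concrete, here is a minimal sketch of one way it could work: mark the lowest term to keep in each branch, retain that term and all of its ancestors, and trim everything else. The term labels and the `PARENT` map are hypothetical and are not taken from gecko.owl; the sketch also assumes a single parent per term and that branches with no marked term are dropped entirely.

```python
# Hypothetical child -> parent links (illustrative labels only, not from gecko.owl).
PARENT = {
    "material entity": "entity",
    "specimen": "material entity",
    "blood specimen": "specimen",
    "plasma specimen": "blood specimen",
    "information entity": "entity",
    "questionnaire": "information entity",
}


def slim(parent, keep):
    """Return the marked terms plus every ancestor of each marked term."""
    retained = set()
    for term in keep:
        # Walk up towards the root, stopping early if this path was already visited.
        while term is not None and term not in retained:
            retained.add(term)
            term = parent.get(term)  # None once a root is reached
    return retained


def trimmed(parent, keep):
    """Return the terms that would be removed by the slim."""
    all_terms = set(parent) | set(parent.values())
    return all_terms - slim(parent, keep)


if __name__ == "__main__":
    marked = {"blood specimen"}  # lowest term to keep in its branch
    print(sorted(slim(PARENT, marked)))
    print(sorted(trimmed(PARENT, marked)))
```

With a list of marked terms in hand, the same selection could be applied to the real OWL file with whatever tooling the maintainers prefer; the sketch only illustrates the selection logic.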
Keep in mind that an ontology doesn't have to be finished to request an OBO ID. The project just has to demonstrate commitment to the OBO principles and best practices.
I agree about finding the right balance. Ultimately that's your decision. If you want to resolve those tensions by making a manual version of gecko.owl in Protege, please go ahead.
I'll try to explain the reasoning behind the current file:
We were tasked with harmonizing various data dictionaries under CINECA, none of which were designed with ontologies or ontological principles in mind. So we provided ontological interpretations of the various terms, and used them to build some really good mappings that take advantage of the ontological hierarchies and axioms.
But at the end of the day, CINECA is not an ontology, so we're left with a hybrid that combines the familiar groupings that CINECA provides with the ontological hierarchy we need to define them. When viewed together as a single tree, without that context, the mixture is confusing. When viewing just the CINECA tree or just the ontology tree, you lose the other perspective. I don't have a solution to suit all audiences.
To address your example: CINECA includes 'blood' under 'biosample'. But looking at how they define and use the term 'blood', we can see that they don't actually want to talk about the sample, they want to talk about the measurement value. So we used the CMO term for 'blood measurement', but since all CMO terms are supposed to be values rather than processes, we relabelled it as 'blood measurement value'. Then we asserted that 'blood measurement value' is equivalent to CINECA 'blood'. We could go a step further and eliminate CINECA 'blood'.
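As a rough illustration of the two options in the 'blood' example, the sketch below first resolves the CINECA term through an equivalence mapping, then shows the "step further" of rewriting data so the CINECA term is no longer referenced. The identifiers `CINECA:blood` and `CMO:blood_measurement_value`, the record layout, and the helper names are all hypothetical placeholders; the real IRIs are not given in this thread.

```python
# Hypothetical identifiers standing in for the real CINECA and CMO terms.
EQUIVALENT_TO = {
    "CINECA:blood": "CMO:blood_measurement_value",
}


def canonical(term, equivalents=EQUIVALENT_TO):
    """Resolve a term to its ontology equivalent, or return it unchanged."""
    return equivalents.get(term, term)


def eliminate_mapped_terms(records, equivalents=EQUIVALENT_TO):
    """Return copies of the records with mapped terms replaced by their equivalents."""
    return [{**record, "term": canonical(record["term"], equivalents)} for record in records]


if __name__ == "__main__":
    # Hypothetical data dictionary entries.
    records = [
        {"field": "sample_1", "term": "CINECA:blood"},
        {"field": "sample_2", "term": "CINECA:biosample"},
    ]
    for row in eliminate_mapped_terms(records):
        print(row)
```

Keeping the equivalence preserves the familiar CINECA grouping for users who expect it, while rewriting to the ontology term gives a cleaner single tree; which trade-off fits is the balance question raised earlier in the thread.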
@beckyjackson will do some clean up, and then we'll see where we stand.
Are you ok to proceed with submission, or should I do it, or are we awaiting further input?
The OBO request has been approved. I've opened PRs for adding the registry metadata (https://github.com/OBOFoundry/OBOFoundry.github.io/pull/1257) & PURL config (https://github.com/OBOFoundry/purl.obolibrary.org/pull/679).
I merged the registry and PURL entries, so this is done.
@mcourtot Please look at this "full" version of GECKO:
If you're happy with it, I will add the OWL to this repo, update OLS, and we can ask for an OBO IDSPACE. This is the information that OBO requires: