Define when it is OK to subclass terms in another ontology

cmungall commented 2 years ago

This is a companion issue to:

#1443

But that issue focuses on injection, which I define as adding axioms about terms in another ontology (this is clearly defined in that issue; don't bring that discussion back here).

This issue is about when it is OK to make axioms that are not about terms in another ontology but that reference them in subClassOf axioms, in particular subClassOf between named classes.

On the surface this should be OK: I am not altering the target ontology's axioms in any way. Indeed, some ontologies such as COB, BFO, and CARO are designed expressly with the intention that they be subclassed. To a certain extent Uberon is too, although only for species-specific subclasses.

However, subclassing terms from other ontologies is rampant in OBO, and this is actually harmful. It is poor modularity and it leads to confusion about scope. Users are not clear which ontology to go to in order to get a term or to request a term.

It is also terrible for maintainability. Suppose I maintain an ontology O1 containing class C1, and another ontology O2 starts making subclasses C1a, C1b, and so on. If I later need to introduce subclasses of C1 in O1, I first need to scan all of OBO to see who has made subclasses and coordinate with those ontologies. This is a large impediment to maintainability.

Here is an example of what I call a heavily chequered inter-ontology subclass pattern, where there is a lack of clarity (to an external user) about what belongs in STATO, OBI, or IAO:

subject	predicate	object	subject_label	predicate_label	object_label
STATO:0000002	rdfs:subClassOf	IAO:0000030	digital file	subClassOf	information content entity
STATO:0000003	rdfs:subClassOf	OBI:0500000	balanced design	subClassOf	study design
STATO:0000005	rdfs:subClassOf	OBI:0500000	single factor design	subClassOf	study design
STATO:0000007	rdfs:subClassOf	IAO:0000573	axis	subClassOf	line graph
STATO:0000010	rdfs:subClassOf	IAO:0000030	coordinate system	subClassOf	information content entity
STATO:0000026	rdfs:subClassOf	IAO:0000400	cartesian spatial coordinate origin	subClassOf	cartesian spatial coordinate datum
STATO:0000027	rdfs:subClassOf	OBI:0000673	test of association between categorical variables	subClassOf	statistical hypothesis test
STATO:0000028	rdfs:subClassOf	IAO:0000109	measure of variation	subClassOf	measurement datum
STATO:0000029	rdfs:subClassOf	IAO:0000109	measure of central tendency	subClassOf	measurement datum
STATO:0000031	rdfs:subClassOf	OBI:0200000	binary classification	subClassOf	data transformation
STATO:0000034	rdfs:subClassOf	IAO:0000027	model parameter	subClassOf	data item
STATO:0000036	rdfs:subClassOf	IAO:0000027	outlier	subClassOf	data item
STATO:0000038	rdfs:subClassOf	OBI:0000181	matched pair of subjects	subClassOf	population
STATO:0000039	rdfs:subClassOf	IAO:0000109	statistic	subClassOf	measurement datum
STATO:0000040	rdfs:subClassOf	IAO:0000184	MA plot	subClassOf	scatter plot
STATO:0000044	rdfs:subClassOf	OBI:0200201	one-way ANOVA	subClassOf	ANOVA
STATO:0000045	rdfs:subClassOf	OBI:0200201	two-way ANOVA	subClassOf	ANOVA
STATO:0000046	rdfs:subClassOf	OBI:0500000	block design	subClassOf	study design
STATO:0000047	rdfs:subClassOf	IAO:0000109	count	subClassOf	measurement datum
STATO:0000048	rdfs:subClassOf	OBI:0200201	multiway ANOVA	subClassOf	ANOVA
STATO:0000063	rdfs:subClassOf	IAO:0000027	genomic coordinate datum	subClassOf	data item
STATO:0000065	rdfs:subClassOf	IAO:0000030	hypothesis	subClassOf	information content entity
STATO:0000066	rdfs:subClassOf	IAO:0000037	Cleveland dot plot	subClassOf	dot plot
STATO:0000068	rdfs:subClassOf	IAO:0000027	skewness	subClassOf	data item

(truncated)

To replicate with OAK:

stato roots -p i --id-prefix STATO | stato relationships - -p i
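The striping in the table above can be summarized by counting, for each cross-ontology subClassOf edge, the namespace of the parent. The following is a minimal sketch in plain Python, not an OAK API call; the `prefix` and `parent_prefix_counts` helpers are hypothetical names introduced for illustration, and the edges are a handful of rows copied from the table.

```python
from collections import Counter

# A few (subject, object) pairs copied from the table above.
EDGES = [
    ("STATO:0000002", "IAO:0000030"),
    ("STATO:0000003", "OBI:0500000"),
    ("STATO:0000005", "OBI:0500000"),
    ("STATO:0000007", "IAO:0000573"),
    ("STATO:0000027", "OBI:0000673"),
    ("STATO:0000028", "IAO:0000109"),
]


def prefix(curie: str) -> str:
    """Return the namespace prefix of a CURIE such as 'STATO:0000002'."""
    return curie.split(":", 1)[0]


def parent_prefix_counts(edges):
    """Count subClassOf edges whose parent is in a different namespace than the child."""
    return Counter(
        prefix(parent) for child, parent in edges if prefix(child) != prefix(parent)
    )


counts = parent_prefix_counts(EDGES)
print(sorted(counts.items()))  # [('IAO', 3), ('OBI', 3)]
```

An even split like this between two parent ontologies is the pattern an external user would find hard to navigate.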

Proposal:

Ontologies MUST NOT create is-a children of classes from other ontologies in their own ontology unless permission is explicitly granted, on a per-term, per-branch, or per-ontology basis. This would be recorded in OBO metadata, e.g. for COB, BFO, CARO. OBI could choose to grant permission in this way, preferably with a link to some kind of documentation that states the relative scope of the two ontologies.
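To make the proposal concrete, here is a rough sketch of how such a check might look. Everything about the permission record is an assumption: no OBO metadata field for this exists yet, the `Permissions` class and its field names are hypothetical, and the `O1`/`O2` identifiers are made-up placeholders following the example earlier in this comment. The ancestor map stands in for whatever a reasoner or OAK would supply.

```python
from dataclasses import dataclass, field


@dataclass
class Permissions:
    """Hypothetical permission record; not an existing OBO metadata format."""
    open_ontologies: set = field(default_factory=set)  # per-ontology grants (prefixes)
    open_branches: set = field(default_factory=set)    # per-branch grants (branch root IDs)
    open_terms: set = field(default_factory=set)       # per-term grants (class IDs)


def prefix(curie: str) -> str:
    return curie.split(":", 1)[0]


def is_permitted(parent: str, ancestors: dict, perms: Permissions) -> bool:
    """True if subclassing `parent` from another ontology has been granted."""
    if prefix(parent) in perms.open_ontologies or parent in perms.open_terms:
        return True
    # Per-branch: the parent is a granted branch root or sits somewhere below one.
    return parent in perms.open_branches or bool(
        ancestors.get(parent, set()) & perms.open_branches
    )


def violations(edges, ancestors, perms):
    """Return cross-namespace subClassOf edges that lack permission."""
    return [
        (child, parent)
        for child, parent in edges
        if prefix(child) != prefix(parent) and not is_permitted(parent, ancestors, perms)
    ]


# Illustrative data only: the grants and the O1/O2 classes are invented.
perms = Permissions(open_ontologies={"COB", "BFO", "CARO"}, open_branches={"O1:ROOT"})
ancestors = {"O1:C1": {"O1:ROOT"}}
edges = [
    ("STATO:0000003", "OBI:0500000"),  # no grant recorded for OBI here
    ("O2:C1a", "O1:C1"),               # allowed: C1 is under the open branch
    ("O2:C2a", "O1:C2"),               # not under any open branch
    ("O2:X", "BFO:0000001"),           # allowed: BFO is open per-ontology
    ("O2:Y", "O2:X"),                  # same namespace, not in scope of the rule
]
found = violations(edges, ancestors, perms)
print(found)
```

With these made-up grants, the STATO edge and the `O2:C2a` edge are the two reported.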

cmungall commented 2 years ago

Here is a visual illustration of the problem:

[Image: stato-obi-iao]

I'm not sure how IAO/STATO/OBI coordinate which term goes where, but this is very confusing for a user who needs to select terms, and even more so if they need to figure out which issue tracker to go to in order to request new terms.

matentzn commented 2 years ago

I not only like this, I think it is very necessary and already reflected by the "Scope" principle (which is not very well fleshed out right now, https://obofoundry.org/principles/fp-005-delineated-content.html). This is how I would like to attack it:

1. All major branches are reflected in COB (data transformation, study design, measurement datum, disease, anatomical entity, etc.). COB metadata points (maps) to all branches in active OBO ontologies, which establishes the ontologies that have theoretical permission to host terms. For example, the DO, NCIT, and Mondo disease branches point to COB:disease and can all serve as hosts for new terms for now. (We probably have to document all current violations as exceptions for the time being and work them out one by one; think OMIT/BTO classes and application ontologies.)
2. We implement the rule you suggest (MUST NOT subclass), and add it to the OBO dashboard.
3. From that point on, subclassing a term from a different namespace (other than COB, RO, BFO) can only happen with a specific annotation property (like exclusion reason, but "subclass permission") which points to a resolvable issue tracker item that explains the exception.
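As a rough illustration of what the dashboard check in the last step might do, here is a sketch. The "subclass permission" property does not exist yet, so passing its value as a plain string is an assumption, as are the function names and the `O1`/`O2` and `example.org` placeholders. The sketch only checks that the value looks like an http(s) URL; actually resolving it would need a network call.

```python
from urllib.parse import urlparse

# Namespaces that may always be subclassed, per the rule above.
EXEMPT_PREFIXES = {"COB", "RO", "BFO"}


def looks_like_url(value) -> bool:
    """Syntactic check only; does not test that the link resolves."""
    parts = urlparse(value or "")
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_axiom(child: str, parent: str, permission=None):
    """Return None if the subClassOf axiom passes, otherwise an error message.

    `permission` is the value of the hypothetical "subclass permission"
    annotation on the axiom, if any.
    """
    child_ns = child.split(":", 1)[0]
    parent_ns = parent.split(":", 1)[0]
    if child_ns == parent_ns or parent_ns in EXEMPT_PREFIXES:
        return None
    if not looks_like_url(permission):
        return f"{child} subClassOf {parent}: missing 'subclass permission' link"
    return None


# Placeholder identifiers and URL, for illustration only.
print(check_axiom("O2:C1a", "O1:C1"))
print(check_axiom("O2:C1a", "O1:C1", "https://example.org/issues/1"))
print(check_axiom("O2:C1a", "BFO:0000001"))
```

Only the first call reports an error; the second carries a link and the third targets an exempt namespace.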

It is important to implement this rule independent of all existing violations. We have to improve this moving forward and not forever point to existing violations as reasons for not moving on.

dosumis commented 2 years ago

Unless permission explicitly granted, on a per-term, per-branch, or per-ontology basis.

This is critical. PCL defines subclasses of CL terms. Single species AOs subclass Uberon and CL...

Big ask to require this for every subClassOf axiom that breaks the rule:

From that point on, subclassing a term from a different namespace (other than COB, RO, BFO) can only happen with a specific annotation property (like exclusion reason, but "subclass permission") which points to a resolvable issue tracker items that explains the exception.

dosumis commented 2 years ago

And why are we folding COB into this issue? Isn't point 1 above more aspiration than reality for many ontology branches? (e.g. see issues around anatomical entities)

hoganwr commented 2 years ago

IAO would either have to be very permissive and/or grow to include domain-specific ICEs across numerous domains.

If the policy had been in place prior to STATO, how would things be better?



cmungall commented 2 years ago

Unless permission explicitly granted, on a per-term, per-branch, or per-ontology basis. This is critical. PCL defines subclasses of CL terms. Single species AOs subclass Uberon and CL...

Yes, I mentioned the Uberon case in the original comment. There would be an agreement that species-specific subclasses are OK by a blanket rule, but if you want to make a species-neutral subclass this should be agreed first. PCL and CL are a good example: there is obviously close coordination and clear scoping rules between these two ontologies. So there would be a pairwise agreement. But I don't think CL wants extra-ontology subclasses that are neither data-driven classifications nor species-specific, until new situations arise.

@hoganwr:

IAO would either have to be very permissive

There is nothing inherently wrong with this provided there is a simple process for adding new terms, for example, template-based with clear design patterns, and many people able to merge PRs. But see below for alternatives.

If the policy had been in place prior to STATO, how would things be better?

There would be clear delineation between the two ontologies. There are lots of ways to do this:

OBI has physical entities, IAO has information
IAO could itself be modularized into different domains:
1. core upper level, with guidelines on how to subclass
2. ICEs that shadow physical properties
3. statistical and mathematical concepts
4. bibliographic entities
5. legal and social entities

But simply having IAO have all information coupled with a simple process for adding new terms would be better than the current situation, with the striping between ontologies.

I am aware of some reasons why the current situation arose, and I am not criticizing past decisions, but we need to move beyond these and implement clear modularity and scoping.

alanruttenberg commented 1 year ago

Just became aware of this issue. I'll register a strong objection. Let's quit proposing rules that limit what developers can put in their ontology.

addiehl commented 1 year ago

Have to agree with Alan, as most of my group's ontologies build off other ontologies. In some cases we have requested new classes from appropriate ontologies, but in other cases our classes are probably too specific for inclusion in a higher-level domain ontology.

cmungall commented 1 year ago

@addiehl can you describe some of the processes you have put in place to avoid some of the issues highlighted here? It would be great to have documentation and SOPs on this and very much in the spirit of my original request!

addiehl commented 1 year ago

I have a number of examples to describe, but don't have time until next week to write this up.

OBOFoundry / OBOFoundry.github.io

Define when it is OK to subclass terms in another ontology #1991
