information-artifact-ontology / ontology-metadata

OBO Metadata Ontology
Creative Commons Zero v1.0 Universal

General principle: Avoid individuals in favour of vanilla IRIs #81

Open matentzn opened 2 years ago

matentzn commented 2 years ago

In this ticket, I would like to advocate for the use of regular untyped IRIs over individuals in OMO.

Right now, many ontologies using OBO format cannot use OMO obsoletion reasons because OBO format does not support these. This means that Mondo, Uberon and many other ontologies cannot even import OMO if they want to use obsoletion reasons!

This is how a diff looks after changing an obsoletion reason from a string to an individual:

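[image: https://user-images.githubusercontent.com/7070631/146355628-608cc0bd-7c20-4db5-aab2-02cbbd4cbab5.png]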

I would like to support the motion of generally not using individuals at all for non-logical, i.e. meta, modelling. Let's hear the avalanche of counterarguments, including, please, some idea of how to solve the problem with OBO format ontologies.

(btw, the OBO format problem also means that if you export your ontology to OBO and dump the owl-axioms header, which ODK and most other pipelines will do, you will lose your obsoletion reasons and potentially a lot of other axioms).

bpeters42 commented 2 years ago

In a different context, we had debated about the status of OBO format, and decided that we would use OWL as the canonical format for which we write validation code etc. and if someone wants to use OBO, they are responsible for doing a conversion to OWL first.

In the same spirit, I am not sure that it is a good idea to change our modeling conventions in OWL to enable representation in OBO. Why isn't the OBO format updated to be able to represent individuals instead? Will we forever be bound by backward compatibility to a standard that as I understand it is not actively maintained?

-- Bjoern Peters Professor La Jolla Institute for Immunology 9420 Athena Circle La Jolla, CA 92037, USA Tel: 858/752-6914 Fax: 858/752-6987 http://www.liai.org/pages/faculty-peters

matentzn commented 2 years ago

Maybe not actively maintained, but widely used. Too widely, I agree. Let's see what others say about this.

cmungall commented 2 years ago

All this will do is create a fork where many ontologies use an alternative.

But it's not just OBO. Individuals cause problems in multiple places, e.g. SLME. Many browsers still have issues. We have to balance ontological perfectionism against the pragmatic concern of limited resources.

Thankfully this is a red herring. There is no need for individuals in OMO. The existing obsoletion reason model was based on a philosophical paper afaicr. I have never found it useful or applicable. Let's first design a new obsoletion reason data model based on requirements, and if there are requirements to use individuals we can look at the tradeoffs.

matentzn commented 2 years ago

I think the basic obsoletion model in OMO is ok, basically just saying:

?term IAO:has_obsolescence_reason IAO:terms_merged.

Not that philosophical IMO, and I think standardising the obsolescence reasons is a good idea.

The idea for using individuals was so you can select them more easily in Protege and provide additional metadata on them, like a description. It's sort of analogous to what the GO world did with the subsets and synonym types.

W/o getting into individuals vs not: can you describe a bit what you think about the current IAO:has_obsolescence_reason model? What are the concrete problems and counter-proposals?

cmungall commented 2 years ago

> The idea for using individuals was so you can select them easier in protege and provide additional metadata on them, like a description. It's sort of analogous to what the GO world did to the subsets and synonym types.

Actually this doesn't work with the current structure - for this to work you would need to instantiate terms_merged.

cmungall commented 2 years ago

https://www.ebi.ac.uk/ols/ontologies/iao/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FIAO_0000225&viewMode=All&siblings=false

Equivalent to: {failed exploratory term, placeholder removed, terms merged, term imported, term split}

This is a very rigid enum: it explicitly forbids me from adding my own, and it is an odd mix of the very practical, such as 'terms merged' (which is great), and the oddly specific (the first two; the second isn't defined). I have never needed to use these two, and I have obsoleted a lot of terms in my time.

cmungall commented 2 years ago

So I recognize the general utility of enums in a data model like OMO

There are other ways to do it than permissible values as individuals though https://www.w3.org/TR/swbp-specified-values/

matentzn commented 2 years ago

I never saw this equivalence class axiom. Sorry, I was not really aware of all this modelling. What you are saying makes sense. Let's think in this direction!

alanruttenberg commented 2 years ago

> So I recognize the general utility of enums in a data model like OMO
>
> There are other ways to do it than permissible values as individuals though https://www.w3.org/TR/swbp-specified-values/

Which part of that document were you referring to? The method of using existentials? Is that easier in OBO format? Is it even plausible for marking obsolescence?

I'll add a +1 to Bjoern's comment. We shouldn't be adjusting OWL for the benefit of OBO. Doing this sort of thing will only become more expensive over time.

You can use only URLs instead of individuals with a minor loss of expressivity - the oneOf closed world constraints. That might not be a bad idea to avoid the premature closing that you point out.

Annotations can be to (or from) IRIs that aren't necessarily individuals. Basically it just means dropping the named individual assertion. The thing is that I don't know how to work with such things in Protege. Take the ontology below. If you open it in Protege and look for usage of p, it doesn't show the triple. I don't know how to author it in Protege either.

<?xml version="1.0"?>
<rdf:RDF 
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:example="http://example.com/">
    <owl:Ontology rdf:about="urn:lsw:ontology:foo"/>
    <owl:AnnotationProperty rdf:about="http://example.com/p"/>
    <rdf:Description rdf:about="http://example.com/s">
        <example:p rdf:resource="http://example.com/o"/>
    </rdf:Description>
</rdf:RDF>
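
To make the "bare IRI" point concrete, here is a minimal sketch in Python (standard library only) that reads the ontology above and lists which IRIs carry a type declaration and which appear only as the subject or object of an annotation. It is not a real RDF/XML parser: the helper name `typed_and_annotated` is made up for illustration, and it only handles the flat node-element shape used in this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# The example ontology from the comment above, unchanged.
DOC = """<?xml version="1.0"?>
<rdf:RDF
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:example="http://example.com/">
    <owl:Ontology rdf:about="urn:lsw:ontology:foo"/>
    <owl:AnnotationProperty rdf:about="http://example.com/p"/>
    <rdf:Description rdf:about="http://example.com/s">
        <example:p rdf:resource="http://example.com/o"/>
    </rdf:Description>
</rdf:RDF>
"""


def typed_and_annotated(xml_text):
    """Hypothetical helper: collect type declarations and IRI-valued statements."""
    root = ET.fromstring(xml_text)
    typed = {}        # IRI -> set of type IRIs declared for it
    annotations = []  # (subject, property, object) with IRI objects
    for node in root:
        subject = node.get(f"{{{RDF}}}about")
        if node.tag != f"{{{RDF}}}Description":
            # A typed node element such as <owl:AnnotationProperty rdf:about="...">
            # asserts rdf:type for its subject.
            ns, local = node.tag[1:].split("}")
            typed.setdefault(subject, set()).add(ns + local)
        for child in node:
            target = child.get(f"{{{RDF}}}resource")
            if target is not None:
                ns, local = child.tag[1:].split("}")
                annotations.append((subject, ns + local, target))
    return typed, annotations


typed, annotations = typed_and_annotated(DOC)
for s, p, o in annotations:
    # Neither s nor o is declared as a class or named individual in this file.
    print(s, p, o, "typed" if o in typed else "bare IRI")
```

In this file only the ontology and the property `p` are declared; both `s` and `o` are undeclared IRIs, which is the situation described above.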

matentzn commented 2 years ago

@alanruttenberg That is close to what I would suggest as well. Using IRIs instead of individuals. As long as the annotated entity is a class, the annotation will show up just fine in Protege, and we can use QC routines to ensure the IRIs are valid. The only issue is that if the individuals are actually in OMO, we cannot not use them. So every ontology seeking to export to OBO format, or even built using it, will have to import a special OMO module without these assertions, which is fine by me, but a bit inconvenient.
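
A QC routine of the kind mentioned here could look roughly like the following Python sketch. The property IRI (IAO_0000231) and the two reason IRIs (IAO_0000227 'terms merged', IAO_0000228 'term imported') are the ones cited in this thread; the terms X_1 to X_3, the function name and the PURL regular expression are assumptions made for illustration, not an existing ODK or ROBOT check.

```python
import re

# Assumed shape of an OBO PURL for a term; illustrative only.
OBO_PURL = re.compile(r"^http://purl\.obolibrary\.org/obo/[A-Za-z][A-Za-z0-9]*_\d+$")

HAS_OBSOLESCENCE_REASON = "http://purl.obolibrary.org/obo/IAO_0000231"
TERMS_MERGED = "http://purl.obolibrary.org/obo/IAO_0000227"
TERM_IMPORTED = "http://purl.obolibrary.org/obo/IAO_0000228"
KNOWN_REASONS = {TERMS_MERGED, TERM_IMPORTED}

# Hypothetical obsolete terms used as test data.
X1 = "http://purl.obolibrary.org/obo/X_1"
X2 = "http://purl.obolibrary.org/obo/X_2"
X3 = "http://purl.obolibrary.org/obo/X_3"
UNKNOWN = "http://purl.obolibrary.org/obo/IAO_9999999"  # made-up IRI

TRIPLES = [
    (X1, HAS_OBSOLESCENCE_REASON, TERMS_MERGED),  # fine
    (X2, HAS_OBSOLESCENCE_REASON, "terms merged"),  # a string, not an IRI
    (X3, HAS_OBSOLESCENCE_REASON, UNKNOWN),  # well-formed but not a known reason
]


def check_reason_values(triples, known_reasons):
    """Report obsolescence-reason values that are not known reason IRIs."""
    problems = []
    for subject, prop, value in triples:
        if prop != HAS_OBSOLESCENCE_REASON:
            continue
        if not OBO_PURL.match(value):
            problems.append((subject, value, "not an OBO PURL"))
        elif value not in known_reasons:
            problems.append((subject, value, "unknown obsolescence reason"))
    return problems


for problem in check_reason_values(TRIPLES, KNOWN_REASONS):
    print(problem)
```

The check never asks whether the value is typed as a named individual, so it works the same whether OMO declares the reasons as individuals, classes, or not at all.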

Generally, I feel like I keep getting pushed back and forth between the OBO/IAO culture and the GO culture - I am spending a lot of energy now to try and find generally acceptable ways to do things like synonyms, obsoletion, provenance and attribution, and I think the one thing we all have to do is step back and forget about current practices. Honestly - I do not care how we do it, individuals, IRIs, some strange ENUM syntax, whatever - I just care that we all do it the same way. But, the truth is, GO-culture does it one-way and OBI/IAO does it another, and neither one wants to change. @cmungall keeps telling me: "And who will pay for this? There are so many infrastructure concerns!" OBI/IAO keeps saying: "Why should we change? Our solution is great! That is what we want! Why should we do something worse?"

Maybe the one thing we need to figure out before we continue this exhausting quest for common meta-modelling, is this: does everyone agree that a common solution is better than divergent solutions even if it means that the common solution is 20% worse than the divergent? Because the common solution will be a compromise, it will be worse. If not, I will just leave this part of my engagement in OBO and keep focusing on other issues, which is fine by me - there are plenty.

Obviously, you can argue that this issue does not have to be solved on OBO Foundry-level either - an entirely different approach would be to simply allow everyone to do what they want and every camp defining rules to translate the external metadata into internal. This is fine for GO, and OBI/IAO - but it is not what I would expect from an Open Standards body like OBO Foundry.

@zhengj2007 and I have started preparing an OMO workshop for next year, and I would really appreciate at least a show of hands on what the attitude to the standardisation efforts is. Will we accept a slightly worse but common solution?

cmungall commented 2 years ago

The OBO format thing is a complete red herring. This roundtrips:

import: http://purl.obolibrary.org/obo/omo.owl

[Term]
id: X:1
name: x1
is_obsolete: true
relationship: foo IAO:0000227 ! terms merged

[Typedef]
id: foo
name: foo
is_metadata_tag: true
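
As a rough illustration of what the stanza above carries, this Python sketch pulls the `relationship:` line out of the [Term] stanza and expands the IDs using the usual OBO PURL pattern (`PREFIX:LOCAL` becomes `http://purl.obolibrary.org/obo/PREFIX_LOCAL`). It is a toy line reader, not an OBO format parser; the function names are invented for this example, and the unprefixed typedef ID `foo` is left as-is because its expansion depends on ontology-level settings not shown here.

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"

# The [Term] stanza from the example above.
STANZA = """[Term]
id: X:1
name: x1
is_obsolete: true
relationship: foo IAO:0000227 ! terms merged
"""


def expand_curie(curie):
    """Expand an OBO-style ID such as IAO:0000227 to its PURL."""
    prefix, local = curie.split(":", 1)
    return f"{OBO_BASE}{prefix}_{local}"


def relationships(stanza):
    """Return (subject IRI, relation ID, target IRI) for each relationship line."""
    term_id = None
    found = []
    for line in stanza.splitlines():
        line = line.split(" ! ", 1)[0].strip()  # drop the trailing "! label" comment
        if line.startswith("id: "):
            term_id = line[len("id: "):]
        elif line.startswith("relationship: "):
            relation, target = line[len("relationship: "):].split()
            found.append((expand_curie(term_id), relation, expand_curie(target)))
    return found


print(relationships(STANZA))
```

The target of the relationship is just an ID; nothing in the stanza says whether IAO:0000227 is an individual, a class, or an undeclared IRI.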

Nevertheless, my point still stands - individuals are poorly supported across the rest of the stack we work with, they don't work with SLME, there are outstanding OLS issues etc, so we have to weigh the benefits (in this case minimal) against the drawbacks

Using uncommitted IRIs is a potentially good possibility but as you say this has some undefined behavior with other parts of our stack

cmungall commented 2 years ago

Proposal 1:

  1. Model obsolescence reasons as classes
  2. Link obsolete terms to these classes via existing annotation property (has-obsolescence-reason http://purl.obolibrary.org/obo/IAO_0000231)

Advantages

  1. Allows a complete taxonomy of reasons that can be evolved and refined over time
  2. We have precedent for this pattern, e.g. our taxonomy of ontology module types: http://purl.obolibrary.org/obo/IAO_8000000
  3. Simple to query and to curate in Protege
  4. Not known to introduce any technical issues

Disadvantages

  1. entailments not within DL (i.e. SPARQL queries would need to explicitly include subClassOf* paths when querying obsolescence reasons)
  2. range constraints on AP not enforced in OWL-DL
  3. if we later want to include specific properties for specific reasons, this doesn't provide a way to restrict them (but this would likely need to be done in SHACL/LinkML anyway)
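
The first disadvantage can be illustrated with a small Python sketch. Because the link from an obsolete term to its reason is an annotation, a reasoner will not propagate it up the reason taxonomy, so the query side has to walk subClassOf itself (the SPARQL analogue would be a `rdfs:subClassOf*` property path). The taxonomy below is invented for illustration: 'terms merged' and 'term split' are labels from the thread, while 'merged as exact duplicate' and the terms X:1 to X:3 are hypothetical.

```python
# (child, parent) subClassOf edges in a hypothetical reason taxonomy.
SUBCLASS_OF = [
    ("terms merged", "obsolescence reason"),
    ("term split", "obsolescence reason"),
    ("merged as exact duplicate", "terms merged"),  # hypothetical refinement
]

# (obsolete term, asserted reason class) annotation pairs.
ANNOTATIONS = [
    ("X:1", "terms merged"),
    ("X:2", "merged as exact duplicate"),
    ("X:3", "term split"),
]


def descendants(root, subclass_of):
    """Return root plus every class below it (reflexive-transitive closure)."""
    children = {}
    for child, parent in subclass_of:
        children.setdefault(parent, set()).add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def obsoleted_for(reason, annotations, subclass_of):
    """Terms whose asserted reason is `reason` or any subclass of it."""
    wanted = descendants(reason, subclass_of)
    return sorted(term for term, asserted in annotations if asserted in wanted)


# A direct match alone would miss X:2; the closure picks it up.
print(obsoleted_for("terms merged", ANNOTATIONS, SUBCLASS_OF))
```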

Proposal 2:

  1. Create a different property for each reason
  2. on a case by case basis, the range would be either boolean or a link to further information

For example, consider the current case of http://purl.obolibrary.org/obo/IAO_0000228 'term imported' -- This is to be used when the original term has been replaced by a term imported from another ontology. An editor note should indicate what is the URI of the new term to use.

Instead, with this proposal we would have a property something like 'obsoleted because now imported from' with a range of an ontology IRI

Two variants of this proposal: DPs and APs. DPs come with the usual problems of punning, APs with the limitation that enforcement/entailment is outside DL

Advantages:

  1. Allows composition of reasons
  2. facilitates auto-QC. E.g. we want to enforce that obsoletions due to terms being imported from somewhere else include the IRI of the new term or the new ontology

Disadvantages

  1. potential proliferation of properties
  2. slightly more mouse clicks in Protege
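
A minimal Python sketch of how Proposal 2 might support auto-QC and composition of reasons. Everything named here is an assumption: the two property IRIs live under a made-up `http://example.org/omo-sketch/` namespace, and only the label 'obsoleted because now imported from' comes from the proposal itself.

```python
from urllib.parse import urlparse

EX = "http://example.org/omo-sketch/"  # hypothetical namespace
IMPORTED_FROM = EX + "obsoleted_because_now_imported_from"
TERMS_MERGED = EX + "obsoleted_because_terms_merged"

# Per-property range, decided case by case: a boolean flag or a link.
RANGES = {IMPORTED_FROM: "iri", TERMS_MERGED: "boolean"}


def is_iri(value):
    """Loose check: an http(s) URL with a host part."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_term(annotations):
    """Validate one obsolete term's (property, value) reason annotations."""
    errors = []
    for prop, value in annotations:
        kind = RANGES.get(prop)
        if kind == "iri" and not is_iri(value):
            errors.append(f"{prop}: expected an IRI, got {value!r}")
        elif kind == "boolean" and not isinstance(value, bool):
            errors.append(f"{prop}: expected a boolean, got {value!r}")
    return errors


# Two reasons composed on one term, both well-formed.
good = [(IMPORTED_FROM, "http://example.org/other-ontology.owl"), (TERMS_MERGED, True)]
# The replacement is only described in free text, which the check rejects.
bad = [(IMPORTED_FROM, "see editor note")]

print(check_term(good))
print(check_term(bad))
```

Such a check runs outside the reasoner, which matches the annotation-property variant where enforcement is outside DL.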

alanruttenberg commented 2 years ago

On Mon, Dec 20, 2021 at 11:01 AM Chris Mungall wrote:

> Proposal 1:
>
>   1. Model obsolescence reasons as classes
>   2. Link obsolete terms to these classes via existing annotation property (has-obsolescence-reason http://purl.obolibrary.org/obo/IAO_0000231)
>
> Advantages
>
>   1. Allows a complete taxonomy of reasons that can be evolved and refined over time
>   2. We have precedent for this pattern, e.g. our taxonomy of ontology module types: http://purl.obolibrary.org/obo/IAO_8000000
>   3. Simple to query and to curate in Protege
>   4. Not known to introduce any technical issues
>
> Disadvantages
>
>   1. entailments not within DL (i.e. SPARQL queries would need to explicitly include subClassOf* paths when querying obsolescence reasons)
>   2. range constraints on AP not enforced in OWL-DL
>   3. if we later want to include specific properties for specific reasons, this doesn't provide a way to restrict them (but this would likely need to be done in SHACL/LinkML anyway)

If we have a lot of reasons for obsoleting a term, seems to me like we're doing something wrong. Terms should rarely be obsoleted. The basic cases are that it doesn't refer - a scientific mistake discovered, or a duplicate has been added accidentally. In addition, ontologically, a reason by itself is an individual. Modeling it as a class is ... modeling. The sort of thing introduced in data modeling, which is not what we're doing.

If you really want a class, make it an actual class, with individuals that are distinguished in some way from each other. For instance, a proposition saying that adding term X to the ontology was a mistake is a class. Each proposition differs from the other because there's a different subject. Representing that collection as a class makes sense. But don't.

> Proposal 2:
>
>   1. Create a different property for each reason
>   2. on a case by case basis, the range would be either boolean or a link to further information

If there's an actual relation, maybe, but the boolean bit is another data modeling thing. But here's the thing. There's a reason to keep this stuff out of the logical assertions. Ideally everything we make logical assertions about should be sound and sensible from an ontological point of view. But 'has obsolescence reason' isn't. It's not that there are instances of the obsolete class, and even if that made sense they wouldn't be the subject. 'has obsolescence reason' answers the question "why was this obsoleted?". Obsoleting a term is a process. The reason is the reason for the process having happened.

If we're doing annotations then we have leeway for abuse like this. But that sort of thing should be minimized. We already have one bogus incursion into the logic - the obsolete terms are still declared classes (ideally they would be bare IRIs) and they are asserted subclass of obsolete class (IIRC). Ok, we held our noses and did that and it's a done deal. But let's not make things worse.

> For example, consider the current case of http://purl.obolibrary.org/obo/IAO_0000228 'term imported' -- This is to be used when the original term has been replaced by a term imported from another ontology. An editor note should indicate what is the URI of the new term to use.
>
> Instead, with this proposal we would have a property something like 'obsoleted because now imported from' with a range of an ontology IRI
>
> Two variants of this proposal: DPs and APs. DPs come with the usual problems of punning, APs with the limitation that enforcement/entailment is outside DL

I think enforcement outside DL is a better choice in this case, if this is used at all. There are already things like checks for things that you need to have stated (so called integrity constraints in the DL world) so there's precedent.

> Advantages:
>
>   1. Allows composition of reasons

too.many.reasons

The oneOf in this case wasn't entirely without thought. It's part of the philosophy behind the ontology: why we obsolete a term. There was reason to believe this was a small closed set.

To my mind any of these data modeling incursions into ontology are corrosive and breed more abuse. The least intrusive thing would be to use an annotation property and annotate to IRIs rather than individuals. If I had to do it again that's what I would do - live and learn. The project of building ontologies is distinctive only if there are principles behind it. Keeping to the principles is hard. Any distractions, any precedents for ignoring good ontology thinking, have a negative impact in the long term because they encourage more of the same.

bpeters42 commented 2 years ago

As I was the one who raised the concerns about basing modeling decisions on the OBO format, I want to say that I completely agree with Chris that it was a red herring. I reacted to the initial issue raised which seemed to say: 'we have to change this to work with OBO format', without understanding the technical details. But now I get that there are issues with individuals in general, and that the way obsoletion reasons were modeled in IAO was far from perfect. I am very much in favor of creating a solution in OMO that is technically light-weight and works for all.
