oborel / obo-relations

RO is an ontology of relations for use with biological ontologies
http://oborel.github.io/

New relation: some C 'has [increased, decreased] Q relative to' some (other) C #155

Open pbuttigieg opened 7 years ago

pbuttigieg commented 7 years ago

PATO has several relational qualities that refer to some norm, where the norm is also a quality. For example, consider increased mass density:

dense and (increased_in_magnitude_relative_to some normal)

Over at ENVO, we'd like to be able to express that some continuant (mostly material entities) has increased/decreased qualities relative to some other continuant. For example:

firn 'has increased mass density relative to' some 'powdery snow'

This entails quite a bit of PATO mirroring, but would really help us with axiomatisation. Is this something RO could handle? Perhaps there's a way to express this with existing PATO formulations?

cmungall commented 7 years ago

We could have a design pattern for shadowing PATO classes as RO OPs.

Note that class axioms may be weaker than you think:

firn SubClassOf 'has increased mass density relative to' some 'powdery snow'

means every firn has greater md than some powdery snow. In statistical terms you're talking about the entire distribution for the LHS relative to the extreme lower tail of the RHS. The truth value of the statement rests on outliers.
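To make the weakness concrete, here is a minimal Python sketch, not taken from the thread. The density values and the helper names `holds_some` and `holds_all` are illustrative assumptions. It evaluates the existential reading of the axiom ('has increased mass density relative to' some 'powdery snow') and shows that it reduces to a comparison of the two minima.

```python
# Hypothetical mass densities in kg/m^3; illustrative values, not measurements.
firn = [420.0, 550.0, 700.0]
powdery_snow = [40.0, 60.0, 450.0]  # 450.0 is a deliberate outlier


def holds_some(lhs, rhs):
    """Existential reading: every LHS value exceeds at least one RHS value."""
    return all(any(x > y for y in rhs) for x in lhs)


def holds_all(lhs, rhs):
    """Stronger ALL-ALL reading: every LHS value exceeds every RHS value."""
    return all(x > y for x in lhs for y in rhs)


# The existential reading is equivalent to comparing the two minima.
print(holds_some(firn, powdery_snow))   # True
print(min(firn) > min(powdery_snow))    # True
print(holds_all(firn, powdery_snow))    # False, because 420.0 < 450.0
```

A single very light powdery snow sample is enough to make the existential statement true, whatever the rest of the distribution looks like.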

cmungall commented 7 years ago

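Some other options:

  • Make these annotation properties (no OWL semantics) and build in the semantics into a separate (possibly statistical) validation or reasoning procedure
  • Introduce 'canonical individuals' or 'subclasses within the typical range'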

pbuttigieg commented 7 years ago

Note that class axioms may be weaker than you think:

firn SubClassOf 'has increased mass density relative to' some 'powdery snow'

means every firn has greater md than some powdery snow. In statistical terms you're talking about the entire distribution for the LHS relative to the extreme lower tail of the RHS. The truth value of the statement rests on outliers.

This I didn't realise, and it doesn't seem to capture that the effect is in the locations of each distribution along the Q. Is there a stronger way to assert this or something which pertains to the locations/averages?

Some other options:

  • Make these annotation properties (no OWL semantics) and build in the semantics into a separate (possibly statistical) validation or reasoning procedure
  • Introduce 'canonical individuals' or 'subclasses within the typical range'

While I think that the thresholds themselves should be on the data/information layer (sort of precluding option 2 for fear of mission creep), the knowledge layer should have some actionable content saying where one expects the (location of) the magnitude of a Q inhering in these entities to fall in relation to one another.

build in the semantics into a separate (possibly statistical) validation or reasoning procedure

Is there an example of this? If this is something we can manage within a normal release cycle, I'm not averse to researching and maintaining it.

cmungall commented 7 years ago

I agree that generally the thresholds belong in a separate layer (does this always apply? Take PM2.5 in ENVO, for example)

OWL has no notion of averages, but the canonical individual idea might get at that. We could say that for every class X we have an individual Xi related via an annotation property we define (this would be invisible in OWL), and we can relate these individuals via comparative relations.

Or, keeping it simpler we can go back to your original request and make reciprocal statements:

Which makes it slightly stronger... but is harder to maintain

It depends what we want out of this. It's probably not going to be so useful for querying, and it's hard to see how it could be useful for validation with OWL reasoning. Perhaps with a large enough network of comparative relations you can use properties like anti-symmetry to infer where cycles occur in these graphs, but it doesn't seem like it would occur so often.

build in the semantics into a separate (possibly statistical) validation or reasoning procedure

Is there an example of this? If this is something we can manage within a normal release cycle, I'm not averse to researching and maintaining it.

None (though we have some examples of simple AP assertions expanding into more complex OWL expressions). Simply creating the AP is not hard. We could define it according to your averages idea, something like:

avgQ(X) is the average value of all instances of Q that inheres in some instance of X

then for any relational quality CQ, where C is the comparator

X CQ Y =def
X and Y are classes,
avgQ(X) < avgQ(Y)  IF C='<'
avgQ(X) > avgQ(Y)  IF C='>'
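As a rough illustration of this definition, the following Python sketch computes avgQ from per-class observations and evaluates the comparison. The `observations` table, its values, and the function names `avg_q` and `cq` are assumptions made for the example; a real procedure would draw values from the data layer.

```python
from statistics import mean

# Hypothetical quality values per class, keyed by class label.
observations = {
    "firn": [420.0, 550.0, 700.0],
    "powdery snow": [40.0, 60.0, 450.0],
}


def avg_q(cls):
    """avgQ(X): mean of the quality values inhering in instances of X."""
    return mean(observations[cls])


def cq(x, y, comparator):
    """Evaluate X CQ Y under the averages reading; comparator is '<' or '>'."""
    if comparator == "<":
        return avg_q(x) < avg_q(y)
    if comparator == ">":
        return avg_q(x) > avg_q(y)
    raise ValueError(f"unsupported comparator: {comparator!r}")


print(cq("firn", "powdery snow", ">"))  # True
```

Unlike the existential OWL axiom, this reading compares the locations of the two distributions, so a single outlier shifts the result only slightly.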

Implementing a complete reasoner is very hard but not necessary. Some valid but incomplete procedures could be implemented in SPARQL. E.g.

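# :cq stands in for the comparative property; the prefix IRI is a placeholder
PREFIX : <http://example.org/>
SELECT DISTINCT ?x WHERE {
  ?x :cq+ ?x .
}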

would return problematic classes where there were cycles. This could be extended with rdfs:subClassOf etc. This only makes sense if you expect a large network of comparative relations.
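For readers without a triple store at hand, the same check can be sketched in plain Python. This is an assumed equivalent of the property-path query above, not an existing tool: `classes_on_cycles` and the toy edge list (which is deliberately inconsistent) are made up for illustration.

```python
def classes_on_cycles(edges):
    """Return every node x that reaches itself via one or more edges (x cq+ x)."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)

    def reaches_itself(start):
        # Depth-first search over the successors of `start`.
        stack, seen = list(graph.get(start, ())), set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        return False

    return {node for node in graph if reaches_itself(node)}


# Each pair (x, y) reads "x cq y"; the first three pairs form a cycle.
edges = [
    ("firn", "powdery snow"),
    ("powdery snow", "slush"),
    ("slush", "firn"),
    ("glacial ice", "firn"),
]
print(sorted(classes_on_cycles(edges)))  # ['firn', 'powdery snow', 'slush']
```

Extending the edge list with subclass links would correspond to the rdfs:subClassOf extension mentioned above.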

pbuttigieg commented 7 years ago

does this always apply? Take PM2.5 in ENVO, for example

You're right: in this case there's a class linked to a threshold defined in environmental monitoring regulation. This came from requests linked to SDGIO, which often insist on policy alignment. So we do have exceptions, which I'm not a fan of. Perhaps we can move that class to SDGIO and have the data-threshold-free superclass in ENVO.

It depends what we want out of this. It's probably not going to be so useful for querying, and it's hard to see how it could be useful for validation with OWL reasoning.

Perhaps not immediately, but it can be used to better coordinate different terminologies with conflicting ranking down the line. I could also imagine queries along the lines of "what sediment types have porosity greater than sedimentTypeX?" in science and engineering contexts. These sorts of queries could be helpful in planning field sampling or finding out which environmental samples to compare.

I agree that trying to set up something elaborate is premature, but solutions which support the SPARQL query you drafted would be useful. The reciprocal subclassOf statements seem quite straightforward, even if a bit of a burden.

nlharris commented 3 years ago

What's the status of this? Is there a request for a new relation(s)?

cmungall commented 3 years ago

Still open to it if really desired, but I think one of the other options is best. We're planning on using exemplar instances in CL

On Thu, Oct 15, 2020 at 1:28 PM Nomi Harris wrote:

What's the status of this? Is there a request for a new relation(s)?


nlharris commented 3 years ago

@pbuttigieg would one of the other options work for you?

pbuttigieg commented 3 years ago

I don't know @nlharris @cmungall

The need is simple and valid: we need a way to say that instances of one class typically have increased/decreased magnitudes of some quality relative to another class.

The above is all over the place. What's the most direct and implementable way to do that?

We're planning on using exemplar instances in CL

Great for CL - what's the OBO position on this?

matentzn commented 3 years ago

My 2 cents (although I would ignore them if I were you): Trying to model a "Sometimes ALL-ALL" (all instances of firn are denser than all instances of powdery snow - most of the time) situation is just not OWL - too much can go wrong here. I agree that at least capturing this can be very useful, so I would personally recommend capturing this using APs and then defining custom inference on top of that using SPARQL. Also, given a simple triple pattern with APs, we can trivially transform this to OWL axioms using SPARQL CONSTRUCT for OWL-based QC. So you could match the ?A 'has increased mass density relative to' ?B triple and rewrite it, with SPARQL CONSTRUCT, to a set of subclass axioms:

A SubClassOf RQ some B
B SubClassOf Inv(RQ) some A
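The rewrite can be sketched outside SPARQL as well. The Python below is only an illustration of the pattern, using plain strings instead of real OWL axioms; `expand_to_axioms` is a hypothetical helper, and `Inv(...)` is kept as informal notation for the inverse property.

```python
def expand_to_axioms(triples):
    """Rewrite each (A, RQ, B) triple into the two reciprocal subclass axioms."""
    axioms = []
    for a, rq, b in triples:
        axioms.append(f"{a} SubClassOf {rq} some {b}")
        axioms.append(f"{b} SubClassOf Inv({rq}) some {a}")
    return axioms


triples = [("firn", "'has increased mass density relative to'", "'powdery snow'")]
for axiom in expand_to_axioms(triples):
    print(axiom)
```

Keeping the AP triples as the source of truth and generating the axioms on demand means the OWL view can be regenerated or dropped without touching the curated assertions.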

In any case, not sure here.

wdduncan commented 3 years ago

@matentzn your 2 cents is worth a million dollars to me :)

If statistical outliers are a concern, I suppose you can use the word "typically" in the relation name; e.g. 'typically has increased mass density relative to'.

If that seems overly burdensome, I like the idea of using canonical/exemplar individuals. It allows for straightforward modeling such as:

:exemplar_firn 'has increased mass density relative to' :exemplar_powdery_snow

AFAIK, there isn't an OBO Foundry position on using exemplars. So, I wouldn't be too concerned. There are perhaps some interesting questions about the metaphysical status of canonical individuals (e.g., Are they concrete or abstract?), but waiting for metaphysical guidance on this may take years.

A relational quality may also be useful. E.g., (and I'm probably going to get the exact representation wrong ... sorry):

'has increased mass density relative to'  rdfs:subClassOf 'relational quality' .
'has increased mass density relative to' 'inheres in' some firn .
'has increased mass density relative to' towards some 'powdery snow' .

Using OWL rollification may be an option, but that rabbit hole might not be worth exploring.

cmungall commented 3 years ago

To echo what @matentzn says: the need may sound "simple and valid", but OWL is simply not a good system for this kind of knowledge.

There is a really good paper by Alan Rector, Stefan Schulz, Jean Marie Rodrigues, Chris Chute and @hsolbrig

On beyond Gruber: “Ontologies” in today’s biomedical information systems and the limits of OWL https://www.sciencedirect.com/science/article/pii/S2590177X19300010#b0440

Although the focus is biomedical/clinical systems, what they talk about is highly applicable here (defeasible statements). I recommend reading the whole paper. This is the combined wisdom of experts each with multiple decades of experience modeling using DLs and other systems.

You can skip to "5.1. Alternatives consistent with OWL semantics but which are unlikely to capture the intended meaning or scale well"

@wdduncan - Rector et al do talk about this approach of prepending the modality onto the relation, but find it wanting, at least for object properties

Ultimately there are no great solutions in OWL, but a number of imperfect solutions can be proposed, and the one to be selected depends heavily on your use case - not just your use case of what you as an ontologist want to say, but the use cases of how users will use the overall system.

I will post my proposal for a plan moving forward in the next comment

cmungall commented 3 years ago

Proposal:

We will add to RO two sets of properties shadowing a selected subset of PATO attributes: a set of annotation properties (AP) and a set of object properties (OP).

(we can include more user-friendly labels in most cases, e.g. denser-than)

The set AP is intended to be used only with classes, although this cannot be enforced in OWL. It has no semantics according to OWL.

The set OP is, by nature of being object properties, only applicable between individuals. Broadly speaking this permits 3 patterns of usage:

  1. between actual individuals in the realist sense, e.g. alice and bob; this portion of powdery snow and that portion of firn
  2. In punning, e.g. 'powdery snow' denser-than firn
  3. In TBox axioms, typically of the form C subClassOf OP some D

Note that use 1 lends itself well to triads of statements involving exemplars (see @wdduncan's response). I would consider this to be good usage.

I would recommend against 2 for now. Thus far the only ontology in OBO to deliberately use punning is UO, and it has caused problems. It may be the case that we can later work out and agree on best practice for punning across OBO, but for now no consensus exists on what the relation is between a class and its punned individual (OWL is of course silent on the semantic relation, and permits you to have the ENVO class for planet be a punned individual of type brain).

I would also very strongly recommend against 3 for the reasons above and in Rector et al.

So following my recommendations, for ENVO we could simply use the APs, or create exemplars and use the OPs. The APs will be the most straightforward.

With those cautions in mind, I suggest we go ahead and implement the new relations. The work needing done:

Do we agree this is the best compromise moving forward?

cmungall commented 3 years ago

Addendum to the above: we can define rules for inferring the APs from compositions of OPs and exemplar relations, but this of course cannot be done in OWL-DL
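One way such a rule could look, sketched in Python under stated assumptions: the `exemplar_of` mapping, the `op_to_ap` pairing of object properties with annotation properties, and all labels below are hypothetical, since the thread does not fix any of them.

```python
def infer_class_level(exemplar_of, op_to_ap, op_assertions):
    """If x OP y holds between exemplars of X and Y, infer the triple X AP Y."""
    inferred = set()
    for x, op, y in op_assertions:
        if x in exemplar_of and y in exemplar_of and op in op_to_ap:
            inferred.add((exemplar_of[x], op_to_ap[op], exemplar_of[y]))
    return inferred


# Hypothetical exemplar individuals and their classes.
exemplar_of = {
    ":exemplar_firn": "firn",
    ":exemplar_powdery_snow": "powdery snow",
}
# Hypothetical pairing of an OP with its class-level AP counterpart.
op_to_ap = {"denser-than": "typically-denser-than"}
op_assertions = [(":exemplar_firn", "denser-than", ":exemplar_powdery_snow")]

print(infer_class_level(exemplar_of, op_to_ap, op_assertions))
```

Assertions between ordinary (non-exemplar) individuals are ignored, so observations about one particular sample never leak into class-level statements.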

matentzn commented 3 years ago

I support this suggestion. I would suggest adding a section on whether, if at all, the typically_{increased,decreased}_ATTRIBUTE_compared_to relations should affect QC. For example you could define a test like this:

Given:

A SubClassOf: owl:Thing
B SubClassOf: owl:Thing
C SubClassOf: B
A typically_increased_density_compared_to B
C typically_increased_density_compared_to A

To produce a warning. So I would suggest to add to your list of action items to define the minimum work these relations should be doing for QC, unless of course merely stating them is enough for Pier.
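A minimal Python sketch of one possible version of this check follows. The rule it implements is an assumption, not an agreed QC rule: two assertions (a > b) and (c > d) are flagged when c is a subclass of b (or b itself) and a is a subclass of d (or d itself). The names `ancestors` and `qc_warnings` are made up for the example.

```python
def ancestors(cls, parents):
    """Return cls plus all of its superclasses, given a child -> parents map."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def qc_warnings(parents, comparisons):
    """Flag pairs of assertions (a > b), (c > d) with c under b and a under d."""
    warnings = []
    for i, (a, b) in enumerate(comparisons):
        # Start at i so each unordered pair of assertions is checked once.
        for c, d in comparisons[i:]:
            if b in ancestors(c, parents) and d in ancestors(a, parents):
                warnings.append(((a, b), (c, d)))
    return warnings


parents = {"C": ["B"]}                    # C SubClassOf: B
comparisons = [("A", "B"), ("C", "A")]    # A > B and C > A
print(qc_warnings(parents, comparisons))  # [(('A', 'B'), ('C', 'A'))]
```

The check treats a class-level comparison as inherited by subclasses, which is exactly the assumption a QC policy would need to state explicitly.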

cmungall commented 3 years ago

I agree that it is good to write down the formal rules.

This is actually not straightforward though, and your example gets at the heart of why this is hard, because OWL subClassOf is an invariant relation.

consider changing your example from density to size

A = brain
B = cell
A typically-larger-than B

From a common sense reading, this is completely fine. However, OWL deals with invariants, not common sense.

If we have

C = Chroococcus giganteus cell
C subClassOf B

then you would flag:

Chroococcus giganteus cell typically-larger-than brain

But in fact this statement is fine according to many readings of "typically". The majority of brains are insect brains, which in the majority of cases will be smaller than Chroococcus giganteus cells.

One reason to use annotation assertions is precisely to defer on any logical interpretation. But deferring of course brings the main questions to the foreground: what exactly is the use of these, and what are the curation guidelines for when to make them?

On Mon, May 3, 2021 at 2:04 AM Nico Matentzoglu wrote:

I support this suggestion. I would suggest adding a section on whether, if at all, the typically_{increased,decreased}_ATTRIBUTE_compared_to relations should affect QC. For example you could define a test like this:

Given:

A SubClassOf: owl:Thing
B SubClassOf: owl:Thing
C SubClassOf: B
A typically_increased_density_compared_to B
C typically_increased_density_compared_to A

To produce a warning. So I would suggest to add to your list of action items to define the minimum work these relations should be doing for QC, unless of course merely stating them is enough for Pier.


nlharris commented 2 years ago

I feel like versions of this have also come up in Translator.

wdduncan commented 2 years ago

@cmungall has a proposed plan for implementing this.

Have we agreed to implement it?