INCATools / relation-graph

Materialize OWL existential relations
MIT License
13 stars 4 forks source link

vocabulary for different kinds of redundancy/inference at graph and owl level #63

Open cmungall opened 2 years ago

cmungall commented 2 years ago

Capturing in this repo for now. The goal is to have a better categorization of different notions of directness or redundancy of triples in a Relation Graph. Even if these are not implemented by RG immediately, having this vocabulary will help us in discussions of what to include or exclude any given user-facing graph.

The overall goal is to be able to classify triples in a relation graph with categories that help inform users as to whether they are redundant and the nature of the redundancy. E.g. for some applications it is critical to be able to separate "one-hop" edges from edges that are redundant with a multi-hop path (classic Transitive Reduction).

Relation Graph Transforms

First we define the relation-graph transform (RGT) between an axiom pattern on the left and a triple on the right:

  1. A subClassOf B where A and B are named classes => A rdfs:subClassOf B
  2. A subClassOf R some B where A and B are named classes => A R B
  3. A equivalentClass B where A and B are named classes => A owl:equivalentClass B

(these are a subset of owlstar, with the interpretation triple excluded)

The axioms patterns are assumed to match entailed axioms in the input ontology. The original asserted axioms are labeled Oasserted.

A possible extension is to include properties and individuals in the graph, we omit this for now

The set of axioms in O that do not match either LHS pattern is called O' (e.g. RBox axioms, logical axioms with nesting or using other constructs)

We use the term "triple" when talking about the output RG triple, and "axiom" when talking about the input OWL ontology axiom. We assume only semantics at the OWL/axiom level and structure at the triple level.

Any triple in the RG must correspond to an axiom (asserted or entailed) in the input ontology according to the two transforms above

Each triple can be categorized according to one or more entailment/redundancy categories below

Additional assumption: the input OWL is coherent. Preprocessing may be applied to reach this state.

Triple categories

A triple t is asserted based on characteristics of it's corresponding axiom a, which matches one of the patterns above

asserted

The triple comes from an asserted axioms on the LHS, ie the triple matches an axiom in Oasserted

entailed

The triple comes from an entailed axiom on the LHS

note that entailed is not disjoint from asserted. Every asserted triple is necessarily an entailed triple

entailment trivially holds for all edges but we include here for completeness, e.g. we can use it to categorize unseen edges

An example of an entailed triple is A partOf C, given A partOf B, B partOf C, where partOf is transitive in O

(if partOf is not transitive, then the triple is not entailed and hence not in RG)

reflexive

iff A=B

Note that all reflexive triples SHOULD NOT be asserted for globally reflexive or locally reflexive properties, as the input ontology SHOULD NOT assert these. Presence of these indicates the input MAY come from incorrectly configured robot.

For non-reflexive properties, reflexive triples MAY be asserted. For example, from an asserted axiom neuron subClassOf connected-to some neuron

root-tip-tautological (aka taut)

Either A=Nothing or B=thing or P=topProperty or P=bottomProperty

A default configuration of RG MAY be to always exclude tautologies, and the nodes thing and nothing

entailment-direct

Given a triple t0 A R B, corresponding to axiom a0,

t0 is entailment-direct if there exists no triple t1 corresponding to axiom a1 such thatO-a1 |= a0

this has the effect of excluding both two-hops over transitive predicates, as well as one-hops with more general predicates

graph-onehop

A triple A R B is graph-direct if there exists no pair of triples A Z, Z B

Note this is predicate-blind. It should not be assumed non-onehops encode no useful information. E.g.

A sub C [2-hop] A partOf B B sub C

If the 2-hop is exclude it can exclude information that cannot be recapitulated elsehwere, and will impede operations such as inference of a category for a node by subClass traversal

graph-Nhop

Generalization of above

non-redundant predicate

a triple A R B (corresponding to axiom a0) is a nrpred iff there is no triple

t1 A R' B (corresponding to a1) such that R' != R and

O-a0 |= a0 and NOT: O-(a0 + a1) |= a0

Examples

Single hop subproperty example

input owl

A sub partOf some B
partOf sub overlaps
reflexive(partOf)

entailed owl

A sub partOf some B
partOf sub overlaps
overlaps sub topProperty
A sub Thing
B sub Thing
A sub A
B sub B
Nothing sub Nothing
Nothing sub A
Nothing sub B
bottomProperty sub partOf
bottomProperty sub overlaps
A sub partOf some A
B sub partOf some B
A sub overlaps some A
B sub overlaps some B
A sub topProperty some A
B sub topProperty some B
A sub overlaps some B
A sub topProperty some B

(we omit irrelevant patterns as the set is of infininite size, eg if we include unions)

triples in RG annotated with categories (omitting 'entailed' categorization)

A partOf B asserted, entailment-direct, graph-onehop, nrpred
A overlaps B, NON-entailment-direct, graph-onehop, redundant-pred
A sub Thing taut, entailment-direct, graph-onehop
B sub Thing taut, non-entailment-direct, non-onehop
Nothing sub A, taut, non-entailment-direct, non-onehop
Nothing sub B, entailment-direct, graph-direct
A partOf A, reflexive
B partOf B, reflexive
A sub A, reflexive
B sub B, reflexive
A overlaps A, reflexive
B overlaps B, reflexive
A topProp A, reflexive, taut
B topProp B, reflexive, taut

Note that if we select ONLY non-taut, entailment-direct from RG we get a single triple A partOf B

This is what a "typical user" might expect from the graph

Transitivity 2-hop example

input owl

A sub partOf some B
B sub partOf some C
transitive(partOf)

triples (excluding tauts and reflexive for brevity)

A partOf B: asserted, entailment-direct, graph-onehop
B partOf C: asserted, entailment-direct, graph-onehop
A partOf C: non-entailment-direct, graph-2hop

contrast with next example

Non-transitivity 2-hop example

input owl

A sub adjacentTo some B
B sub adjacentTo some C

triples (excluding tauts and reflexive for brevity)

A adjacentTo B: asserted, entailment-direct, graph-onehop
B adjacentTo C: asserted, entailment-direct, graph-onehop

Note this triple is NOT in RG:

A adjacentTo C: NON-ENTAILED, non-entailment-direct, graph-2hop

Property-chain example

input owl

A sub negReg some B
B sub negReg some C
negReg sub reg
posReg sub reg
negReg o negReg -> posRef

triples (excluding tauts and reflexive for brevity)

A negReg B: asserted, entailment-direct, graph-onehop
B negReg C: asserted, entailment-direct, graph-onehop

A reg B: non-entailment-direct, graph-onehop
B reg C: non-entailment-direct, graph-onehop

A reg C: non-entailment-direct, graph-2hop, redundant-pred
A posReg C: non-entailment-direct, graph-2hop, nrpred

The intuition here is that even though A posReg C is a 2-hop and can be entailed from existing triple axioms, it is in some sense interesting in that it is a non-redundant pred

cmungall commented 2 years ago

TODO: compare with @balhoff's pruning rules:

https://github.com/INCATools/ubergraph/blob/4f401ad8322c418447584c4edf6381f33ddd66c4/prune.dl#L29-L33

redundant(s, p, o) :- rdf(s, p, other), s != other, !equivalent(s, other), subClassOf(other, o), rdf(s, p, o).

what about if other=o?

e.g

rdf(finger,part_of,hand) subClassOf(hand,hand)

(unless subClassOf is properSubClassOf)?

redundant(s, p, o) :- rdf(s, p, o), subClassOf(s, other), rdf(other, p, o), other != o, !equivalent(other, o).

same comments as above

redundant(s, p, o) :- rdf(s, p, o), transitive(p), rdf(s, p, other), other != o, rdf(other, p, o).

do you not also have to check for other != s?

redundant(s, p, o) :- rdf(s, p, o), subPropertyOf(sub, p), rdf(s, sub, o).

this makes sense

redundant(s, RDFS_SUBCLASS_OF, s) :- rdf(s, RDFS_SUBCLASS_OF, s).

subset of reflexive case