owlcollab / oboformat

Automatically exported from code.google.com/p/oboformat

Split intersection_of into separate subClassOf axioms #92

Closed alanruttenberg closed 8 years ago

alanruttenberg commented 8 years ago

In OBO if you have something like:
 intersection_of: only_in_taxon NCBITaxon:9606 ! Homo sapiens
 intersection_of: PR:000000366 ! smad4 isoform 1

The translation is currently to
 SubClassOf(parent ObjectIntersectionOf(PR:000000366 ObjectSomeValuesFrom(only_in_taxon NCBITaxon:9606)))

It would be better to translate to
 SubClassOf(parent PR:000000366)
 SubClassOf(parent ObjectSomeValuesFrom(only_in_taxon NCBITaxon:9606))

The reason is SPARQL querying on (all too often) naive triple stores. To query for this under the current scheme you have to know how a list is encoded in RDF, as well as the order of the conjuncts in that list, which is annoying. The alternative is equivalent and easier to query.

Also, note: only_in_taxon should be used in an ObjectAllValuesFrom but is currently translated (at least as exhibited in PRO) as ObjectSomeValuesFrom.

cmungall commented 8 years ago

On 25 Jul 2016, at 10:54, Alan Ruttenberg wrote:

 In OBO if you have something like:
 intersection_of: only_in_taxon NCBITaxon:9606 ! Homo sapiens
 intersection_of: PR:000000366 ! smad4 isoform 1

 The translation is currently to
 SubClassOf(parent ObjectIntersectionOf(PR:000000366 ObjectSomeValuesFrom(only_in_taxon NCBITaxon:9606)))

correct.

 It would be better to translate to
 SubClassOf(parent PR:000000366)
 SubClassOf(parent ObjectSomeValuesFrom(only_in_taxon NCBITaxon:9606))

I'm not sure I understand the reason why, but in any case the translation for intersection_of has been stable since 2004 and is heavily used. Changing it would have catastrophic consequences for multiple stakeholders.

Note that if the above is the desired semantics, then this can be encoded in obo format using is_a and relationship tags:

 id: parent
 relationship: only_in_taxon NCBITaxon:9606 ! Homo sapiens
 is_a: PR:000000366 ! smad4 isoform 1

The reason is for sparql querying on (all too often) naive triple stores. To query for this using the current scheme you have to know how to encode a list in RDF, as well as the order of the conjuncts as expressed in the list, which is annoying. The alternative is equivalent and so easier to query.

I'm not following. Surely you want to first ensure that you have the intended semantics in OWL. If the intended semantics involve equivalence axioms and intersection expressions, then the RDF is naturally harder to query. But this has nothing to do with obo format.

Btw, if you are interested in querying over the structure that has equivalence axioms relaxed to subclass of axioms, I recommend doing a robot relax before loading into the triplestore.

 Also, note: only_in_taxon should be used in an ObjectAllValuesFrom but is currently translated (at least as exhibited in PRO) as ObjectSomeValuesFrom.

Yes, it's a shortcut relation.



alanruttenberg commented 8 years ago

I'm curious to know a single example of catastrophic failure with this change. The semantics are exactly the same. Note I should be clear that this is only to be done for top-level ObjectIntersectionOf expressions which are the object of SubClassOf axioms. So if your expression is (p some (a and b)) then this doesn't trigger.

The reason is that if you want to query a naive (or naive-ish) triple store, then with IntersectionOf you need to know that conjunctions are encoded as RDF lists, and you have to worry about the order of the members. With the intersection, a query for one of the conjuncts looks like:

 _:B2 owl:intersectionOf _:B3 .
 _:B3 rdf:rest _:B4 .
 _:B4 rdf:rest rdf:nil .
 _:B4 rdf:first _:B5 .
 _:B5 owl:someValuesFrom obo:NCBITaxon9606 .
 _:B5 owl:onProperty <http://purl.obolibrary.org/obo/pr#only_in_taxon> .
 _:B5 rdf:type owl:Restriction .
 _:B3 rdf:first ?proteinclass .
 _:B2 rdf:type owl:Class .

If split, the same query is done with:

 _:B2 owl:someValuesFrom obo:NCBITaxon9606 .
 _:B2 owl:onProperty <http://purl.obolibrary.org/obo/pr#only_in_taxon> .
 _:B2 rdf:type owl:Restriction .
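To make the difference concrete, here is a minimal Python sketch over a toy in-memory triple list. It is not how any real triple store works; the helper names, the `ex:parent` subject, the `_:R1` label and the shortened `pr:only_in_taxon` property name are all assumptions made up for illustration. It shows that the list-encoded form requires walking the `rdf:first`/`rdf:rest` chain, while the split form needs only a direct lookup.

```python
# Toy triple store: a list of (subject, predicate, object) string tuples.
# All labels and names below are hypothetical, chosen for illustration only.
LIST_FORM = [
    ("_:B2", "rdf:type", "owl:Class"),
    ("_:B2", "owl:intersectionOf", "_:B3"),
    ("_:B3", "rdf:first", "PR:000000366"),
    ("_:B3", "rdf:rest", "_:B4"),
    ("_:B4", "rdf:first", "_:B5"),
    ("_:B4", "rdf:rest", "rdf:nil"),
    ("_:B5", "rdf:type", "owl:Restriction"),
    ("_:B5", "owl:onProperty", "pr:only_in_taxon"),
    ("_:B5", "owl:someValuesFrom", "obo:NCBITaxon9606"),
]

SPLIT_FORM = [
    ("ex:parent", "rdfs:subClassOf", "PR:000000366"),
    ("ex:parent", "rdfs:subClassOf", "_:R1"),
    ("_:R1", "rdf:type", "owl:Restriction"),
    ("_:R1", "owl:onProperty", "pr:only_in_taxon"),
    ("_:R1", "owl:someValuesFrom", "obo:NCBITaxon9606"),
]


def objects(triples, s, p):
    """Return every o such that (s, p, o) is in the store."""
    return [o for (s2, p2, o) in triples if s2 == s and p2 == p]


def list_members(triples, head):
    """Walk an RDF collection from its head node; return members in order."""
    members, node = [], head
    while node != "rdf:nil":
        members.extend(objects(triples, node, "rdf:first"))
        rest = objects(triples, node, "rdf:rest")
        if not rest:  # malformed list: stop rather than loop forever
            break
        node = rest[0]
    return members


def is_human_restriction(triples, node):
    """True if node is the only_in_taxon restriction on the human taxon."""
    wanted = [
        ("rdf:type", "owl:Restriction"),
        ("owl:onProperty", "pr:only_in_taxon"),
        ("owl:someValuesFrom", "obo:NCBITaxon9606"),
    ]
    return all((node, p, o) in triples for p, o in wanted)


def conjuncts_beside_taxon(triples):
    """List form: find intersections containing the restriction, wherever it
    sits in the list, and return the other conjuncts."""
    found = []
    for (_, p, head) in triples:
        if p == "owl:intersectionOf":
            members = list_members(triples, head)
            if any(is_human_restriction(triples, m) for m in members):
                found.extend(m for m in members
                             if not is_human_restriction(triples, m))
    return found


def classes_with_taxon_split(triples):
    """Split form: a single direct lookup, no list traversal needed."""
    return [s for (s, p, o) in triples
            if p == "rdfs:subClassOf" and is_human_restriction(triples, o)]
```

The list walk is what a fixed triple pattern cannot express without assuming the position of each conjunct, which is the annoyance described above.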

Re: EquivalentClasses axioms, note that my request concerned SubClassOf axioms. Sorry if that wasn't clear. For SubClassOf axioms, the semantics of the result of the transformation are as intended. Note that Protégé does something similar in its interface: if you say c subclass of (a and b), then c will be rendered in the hierarchy below both a and b.

I'm not familiar with robot. I will have a look when I get a chance.

Thanks for the workaround. I will pass that on.

Are shortcut relations not automatically expanded when translating from OBO to OWL? Perhaps they should be, since the translation is more commonly used because an ontology is authored in OBO than because an ontology formerly edited in OBO will henceforth be edited in OWL.

alanruttenberg commented 8 years ago

Bleh. I see that intersection_of in a stanza translates to an EquivalentClasses axiom. So this won't be relevant in the case I care about at the moment, since the OBO translator does not seem to generate such SubClassOf(x IntersectionOf(.... axioms.

I looked at robot (documentation in examples folder) but it looked like relax does EquivalentClasses(a,expr) -> SubClassOf(a, expr)

But not SubClassOf(a, IntersectionOf(expr1,expr2)) -> SubClassOf(a,expr1) SubClassOf(a,expr2)

Which is what I was getting at.
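The two rewrites contrasted above can be sketched in a few lines of Python. This is only an illustration on a made-up tuple representation of axioms, not robot's implementation or any OWL API; the function names `relax` and `split` are assumptions. `relax` follows the EquivalentClasses-to-SubClassOf step as described in this comment, and `split` is the requested top-level-only splitting.

```python
# Axioms and class expressions as nested tuples (hypothetical representation):
#   ("SubClassOf", sub, sup), ("EquivalentClasses", cls, expr)
#   ("ObjectIntersectionOf", e1, ..., en), ("ObjectSomeValuesFrom", prop, filler)
# Named classes are plain strings.


def relax(axiom):
    """EquivalentClasses(a, expr) -> SubClassOf(a, expr); others unchanged."""
    if axiom[0] == "EquivalentClasses":
        _, cls, expr = axiom
        return ("SubClassOf", cls, expr)
    return axiom


def split(axiom):
    """SubClassOf(a, ObjectIntersectionOf(e1..en)) -> [SubClassOf(a, e1), ...].

    Only a top-level intersection in the superclass position is split;
    intersections nested inside other expressions are left alone.
    """
    if axiom[0] == "SubClassOf":
        _, sub, sup = axiom
        if isinstance(sup, tuple) and sup[0] == "ObjectIntersectionOf":
            return [("SubClassOf", sub, conjunct) for conjunct in sup[1:]]
    return [axiom]


taxon = ("ObjectSomeValuesFrom", "only_in_taxon", "NCBITaxon:9606")
equiv = ("EquivalentClasses", "parent",
         ("ObjectIntersectionOf", "PR:000000366", taxon))

# Relax first, then split the resulting SubClassOf axiom.
split_axioms = split(relax(equiv))

# (p some (a and b)) does not trigger the split.
nested = ("SubClassOf", "c",
          ("ObjectSomeValuesFrom", "p", ("ObjectIntersectionOf", "a", "b")))
```

Applying `split` after `relax` loses the equivalence, so a pipeline that wants to keep the original semantics would add the split axioms alongside the original axiom rather than replace it.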

Thanks

cmungall commented 8 years ago

Apologies, I misread the original ticket.

You said:

 In OBO if you have something like:
 intersection_of: only_in_taxon NCBITaxon:9606 ! Homo sapiens
 intersection_of: PR:000000366 ! smad4 isoform 1

 The translation is currently to
 SubClassOf(parent ObjectIntersectionOf(PR:000000366
 ObjectSomeValuesFrom(only_in_taxon NCBITaxon:9606)))

I said this was correct, but in fact it translates to an equivalence axiom, not a subClassOf axiom.

http://owlcollab.github.io/oboformat/doc/obo-syntax.html#5.2.1

So your proposal changes the existing semantics hence my resistance.
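As a rough illustration of why the two encodings differ, the following Python sketch translates a handful of OBO tag-value lines into OWL functional-syntax strings. It is a toy, not the oboformat translator: the function names are hypothetical, it handles only `intersection_of`, `is_a` and `relationship`, it treats every relationship as an existential restriction, and it ignores shortcut-relation expansion entirely.

```python
def parse_tag_lines(text):
    """Return (tag, value) pairs from OBO tag-value lines, dropping '! ...' comments."""
    pairs = []
    for line in text.strip().splitlines():
        line = line.split("!", 1)[0].strip()
        if not line:
            continue
        tag, value = line.split(":", 1)  # split on the first colon only
        pairs.append((tag.strip(), value.strip()))
    return pairs


def class_expr(value):
    """'REL FILLER' becomes an existential restriction; a bare ID stays as-is."""
    parts = value.split()
    if len(parts) == 2:
        return f"ObjectSomeValuesFrom({parts[0]} {parts[1]})"
    return parts[0]


def translate(term_id, pairs):
    """Toy translation: intersection_of tags give one EquivalentClasses axiom,
    is_a and relationship tags give one SubClassOf axiom each."""
    axioms = []
    conjuncts = [class_expr(v) for t, v in pairs if t == "intersection_of"]
    if conjuncts:
        body = " ".join(conjuncts)
        axioms.append(f"EquivalentClasses({term_id} ObjectIntersectionOf({body}))")
    for tag, value in pairs:
        if tag == "is_a":
            axioms.append(f"SubClassOf({term_id} {value})")
        elif tag == "relationship":
            axioms.append(f"SubClassOf({term_id} {class_expr(value)})")
    return axioms


genus_differentia = parse_tag_lines("""
intersection_of: only_in_taxon NCBITaxon:9606 ! Homo sapiens
intersection_of: PR:000000366 ! smad4 isoform 1
""")

necessary_only = parse_tag_lines("""
relationship: only_in_taxon NCBITaxon:9606 ! Homo sapiens
is_a: PR:000000366 ! smad4 isoform 1
""")

equivalence_axioms = translate("parent", genus_differentia)
subclass_axioms = translate("parent", necessary_only)
```

Under these assumptions the first stanza yields a single equivalence axiom and the second yields two independent SubClassOf axioms, which is the semantic difference at issue here.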

cmungall commented 8 years ago

snap!

alanruttenberg commented 8 years ago

Yup.

So it wouldn't change anything to add the subclass axioms rather than having them replace the equivalence axiom, though it would add bulk (unquantified) to the OWL file and so might not want to be the default. Instead it is of particular use when loading the OWL into triple stores. I could go either way depending on what the more common use for the OWL file is.

Probably easy enough to have PRO do the transformation you suggest not for distribution but instead in the pipeline to load their triple store.

Thanks