owlcs / owlapi

OWL API main repository

Shared class expression in axiom with annotations causes extra triples in output #1109

Open ignazio1977 opened 1 year ago

ignazio1977 commented 1 year ago

Test provided by @balhoff in https://github.com/ontodev/robot/issues/1129

//> using scala 2.13
//> using dep "net.sourceforge.owlapi:owlapi-distribution:4.5.25"

import org.semanticweb.owlapi.model._
import org.semanticweb.owlapi.apibinding.OWLManager
import org.semanticweb.owlapi.formats.TurtleDocumentFormat
import org.semanticweb.owlapi.formats.RioTurtleDocumentFormat
import scala.jdk.CollectionConverters._
import java.io.File

val factory = OWLManager.getOWLDataFactory()
val manager = OWLManager.createOWLOntologyManager()
val r = factory.getOWLObjectProperty(IRI.create("http://example.org/r"))
val A = factory.getOWLClass(IRI.create("http://example.org/A"))
val B = factory.getOWLClass(IRI.create("http://example.org/B"))
val C = factory.getOWLClass(IRI.create("http://example.org/C"))
val restriction = factory.getOWLObjectSomeValuesFrom(r, B)
val equiv = factory.getOWLEquivalentClassesAxiom(A, factory.getOWLObjectIntersectionOf(C, restriction))
val subClassOf = factory.getOWLSubClassOfAxiom(
    A, restriction,
    Set(factory.getOWLAnnotation(factory.getRDFSComment(), factory.getOWLLiteral("comment"))).asJava)
val ontology = manager.createOntology(Set[OWLAxiom](equiv, subClassOf).asJava)
manager.saveOntology(ontology, new TurtleDocumentFormat(), IRI.create(new File("test.ttl")))
manager.saveOntology(ontology, new RioTurtleDocumentFormat(), IRI.create(new File("test-rio.ttl")))

The shared node is referenced in three places in the output - two axioms, one of them with annotations. The annotation triggers reification, which is where the third reference appears.

The ontology looks like this:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://www.w3.org/2002/07/owl#> .
[ rdf:type owl:Ontology ] .
<http://www.semanticweb.org/owlapi/test#r> rdf:type owl:ObjectProperty .
<http://www.semanticweb.org/owlapi/test#A> rdf:type owl:Class ;
    owl:equivalentClass [ owl:intersectionOf ( <http://www.semanticweb.org/owlapi/test#C>
        _:genid4 rdf:type owl:Restriction ;
            owl:onProperty <http://www.semanticweb.org/owlapi/test#r> ;
            owl:someValuesFrom <http://www.semanticweb.org/owlapi/test#B>
        ) ;
        rdf:type owl:Class
        ] ;
rdfs:subClassOf _:genid4 .
_:genid4 rdf:type owl:Restriction ; owl:onProperty <http://www.semanticweb.org/owlapi/test#r> ; owl:someValuesFrom <http://www.semanticweb.org/owlapi/test#B> .
[ rdf:type owl:Axiom ;
   owl:annotatedSource <http://www.semanticweb.org/owlapi/test#A> ;
   owl:annotatedProperty rdfs:subClassOf ;
   owl:annotatedTarget _:genid4 ;
   rdfs:comment "comment"
 ] .
<http://www.semanticweb.org/owlapi/test#B> rdf:type owl:Class .
<http://www.semanticweb.org/owlapi/test#C> rdf:type owl:Class .

The node's triples are output twice, in Turtle as well as in RDF/XML. This is redundant but, at the RDF level, it is just a repetition of the same triples (setting aside for a moment the impact on serialized histories).
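To illustrate why the repetition is harmless at the RDF level, here is a minimal sketch that treats a graph as a set of triples. The `Triple` case class and the `ex:` prefixed names are assumptions made for the illustration; this is not how the OWL API models triples.

```scala
// Toy model only: an RDF graph as a set of (subject, predicate, object) strings.
object TripleSetSketch {
  final case class Triple(subject: String, predicate: String, obj: String)

  // The description of the shared restriction node.
  val restriction: List[Triple] = List(
    Triple("_:genid4", "rdf:type", "owl:Restriction"),
    Triple("_:genid4", "owl:onProperty", "ex:r"),
    Triple("_:genid4", "owl:someValuesFrom", "ex:B")
  )

  // The serializer writes the description twice...
  val written: List[Triple] = restriction ++ restriction

  // ...but a consumer that collects triples into a set ends up with one copy.
  val graph: Set[Triple] = written.toSet
}
```

Here `written` has six entries while `graph` has three, so the duplication only matters to a parser that rejects the syntax, not to the resulting graph.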

It shouldn't make a difference to the parser. In fact, the RDF/XML parser copes with it. However, the Turtle parser doesn't like it: it doesn't expect an inline description with an id. That is hard to change, as it's a JavaCC-generated parser.

However, the solution needs to be that the extra triples aren't output at all. The fact that they're output twice, and not three times, suggests that the mechanism for deciding whether the id gets generated and the one for deciding whether the triples are output are getting tangled.

ignazio1977 commented 1 year ago

The problem was not the desharing of nodes. When a node is referred to in multiple places but should be output only once, the renderer 'defers' it. The node in question, _:genid4, was being deshared and deferred correctly. However, the renderer was not checking for deferment when rendering list objects, i.e. the list of arguments to the intersectionOf expression.
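The behaviour described above can be sketched with a toy renderer. Everything in it is hypothetical and written for this illustration: the `Node` types, the `render` function, and the `checkLists` flag are not the OWL API's renderer classes, and the sketch does not handle shared nodes nested inside other shared nodes. With `checkLists = false` a shared node that appears as a list item is written inline together with its id (the bug); with `checkLists = true` only the id is written and the description is deferred to the top level.

```scala
object DeferralSketch {
  sealed trait Node
  final case class Iri(value: String) extends Node
  final case class BNode(id: String, props: List[(String, Node)]) extends Node
  final case class RdfList(items: List[Node]) extends Node

  /** Renders `subject` with `props`; ids in `shared` should be described once, at top level. */
  def render(subject: String, props: List[(String, Node)],
             shared: Set[String], checkLists: Boolean): String = {
    // Shared nodes whose description is postponed until after the main statement.
    val deferred = scala.collection.mutable.LinkedHashMap.empty[String, BNode]

    def body(ps: List[(String, Node)]): String =
      ps.map { case (p, o) => s"$p ${obj(o, inList = false)}" }.mkString(" ; ")

    def obj(n: Node, inList: Boolean): String = n match {
      case Iri(v)         => v
      case RdfList(items) => items.map(obj(_, inList = true)).mkString("( ", " ", " )")
      case b: BNode if shared(b.id) && (!inList || checkLists) =>
        deferred(b.id) = b               // write only the id here, describe the node later
        s"_:${b.id}"
      case b: BNode if shared(b.id) =>
        s"_:${b.id} ${body(b.props)}"    // buggy path: list item rendered inline with its id
      case b: BNode       => s"[ ${body(b.props)} ]"
    }

    val main = s"$subject ${body(props)} ."
    val tail = deferred.values.toList.map(b => s"_:${b.id} ${body(b.props)} .")
    (main :: tail).mkString("\n")
  }

  // The shape from the report: A equivalentTo (C and (r some B)), A subClassOf (r some B).
  val restriction: Node = BNode("genid4", List(
    "rdf:type" -> Iri("owl:Restriction"),
    "owl:onProperty" -> Iri("ex:r"),
    "owl:someValuesFrom" -> Iri("ex:B")))

  val aProps: List[(String, Node)] = List(
    "owl:equivalentClass" -> BNode("genid1", List(
      "owl:intersectionOf" -> RdfList(List(Iri("ex:C"), restriction)))),
    "rdfs:subClassOf" -> restriction)

  def main(args: Array[String]): Unit = {
    println(render("ex:A", aProps, Set("genid4"), checkLists = false))
    println(render("ex:A", aProps, Set("genid4"), checkLists = true))
  }
}
```

In this sketch the buggy variant writes the restriction's description twice, once inside the list and once at top level, which mirrors the output in the report. The variant that checks deferment for list items writes it once.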

So, we had a test covering this already, but it didn't cover shared nodes as part of lists.

The same issue exists in RDF/XML; however, the same fix doesn't seem to work there. The two renderers are almost structurally identical. Almost.