owlcs / owlapi

OWL API main repository

OWL-API produces wrong RDF if there is an annotation appended to the axiom with anonymous expressions #874

Closed sszuev closed 5 years ago

sszuev commented 5 years ago

The bug of parsers/readers

A problem arises when annotations are attached to a triple that has anonymous expressions as both subject and object. Consider the following test case (checked with v5.1.11):

public static void main(String... args) throws OWLOntologyCreationException {
    OWLOntologyManager m = OWLManager.createOWLOntologyManager();
    OWLDataFactory df = m.getOWLDataFactory();

    OWLClassExpression c1 = df.getOWLDataAllValuesFrom(df.getOWLDataProperty("d"), df.getBooleanOWLDatatype());
    OWLClassExpression c2 = df.getOWLObjectSomeValuesFrom(df.getOWLObjectProperty("o"), df.getOWLThing());

    OWLAxiom a = df.getOWLSubClassOfAxiom(c1, c2, Arrays.asList(
            df.getOWLAnnotation(df.getOWLAnnotationProperty("X"), df.getOWLLiteral("x")),
            df.getOWLAnnotation(df.getOWLAnnotationProperty("Y"), df.getOWLLiteral("y"))));

    OWLOntology o = m.createOntology();
    o.add(a);
    OWLDocumentFormat format = new RDFXMLDocumentFormat();
    String res = toString(o, format);
    System.out.println(res);

    // updated checking:
    OWLOntology o2 = fromString(OWLManager.createOWLOntologyManager(), res, format);
    System.out.println(toString(o2, new TurtleDocumentFormat()));
    List<OWLAxiom> actual = o2.axioms(AxiomType.SUBCLASS_OF)
            .peek(x -> System.out.println("A:::" + x)).collect(Collectors.toList());
    assertTrue(1 == actual.size());
    assertTrue(a.equals(actual.get(0)));
    assertTrue(o2.containsAxiom(a));
}

private static String toString(OWLOntology ontology, OWLDocumentFormat format) {
    try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        ontology.getOWLOntologyManager().saveOntology(ontology, format, out);
        return out.toString(StandardCharsets.UTF_8.name());
    } catch (OWLOntologyStorageException | IOException e) {
        throw new AssertionError(e);
    }
}

private static OWLOntology fromString(OWLOntologyManager m, String txt, OWLDocumentFormat format) {
    OWLOntologyDocumentSource src = new StringDocumentSource(txt, IRI.create("http:://fromStr"), format, null);
    try {
        return m.loadOntologyFromOntologyDocument(src);
    } catch (OWLOntologyCreationException e) {
        throw new AssertionError(e);
    }
}

private static void assertTrue(boolean ex) {
    if (ex) return;
    throw new AssertionError();
}

The possible root of the problem

If there is a triple s p o, then it is possible to annotate this triple as follows:

[
 a <type> ;
 owl:annotatedSource s ;
 owl:annotatedProperty p ;
 owl:annotatedTarget o ;
 ....
] .

where <type> is either owl:Axiom (for root) or owl:Annotation (for sub-annotations).
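
To make the pattern concrete, here is a minimal sketch in plain Java (no OWL-API or Jena involved) that produces the four structural triples of such a node for a given s p o. The class name ReificationSketch, the helper triple, and the blank node labels are assumptions made for illustration only; triples are modelled as three-element string lists.

```java
import java.util.ArrayList;
import java.util.List;

final class ReificationSketch {

    /** A triple is modelled as an immutable [subject, predicate, object] list (illustration only). */
    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    /**
     * Builds the structural triples of the reification node for the annotated triple (s, p, o).
     * nodeId is the (hypothetical) blank node label of the owl:Axiom or owl:Annotation node.
     */
    static List<List<String>> reify(String nodeId, String type, String s, String p, String o) {
        List<List<String>> out = new ArrayList<>();
        out.add(triple(nodeId, "rdf:type", type));
        out.add(triple(nodeId, "owl:annotatedSource", s));
        out.add(triple(nodeId, "owl:annotatedProperty", p));
        out.add(triple(nodeId, "owl:annotatedTarget", o));
        return out;
    }

    public static void main(String[] args) {
        // The annotation triples themselves (e.g. <X> "x") would be added to the same node.
        reify("_:x", "owl:Axiom", "_:b1", "rdfs:subClassOf", "_:b0").forEach(System.out::println);
    }
}
```

Note that s and o are passed through unchanged: whatever node label the annotated triple uses is the label that ends up in owl:annotatedSource and owl:annotatedTarget.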

What should happen if s or o is anonymous? Well, nothing: the rule must remain the same (see https://www.w3.org/TR/owl2-mapping-to-rdf/#Translation_of_Annotations). But OWL-API duplicates the b-node with an equivalent one (equivalent by OWLObjectImpl#equals(Object)). Instead of a reference to the b-node, there is a copy of the b-node in the object position of owl:annotatedTarget.

For example, the following snippet:

[ rdf:type owl:Axiom ;
  owl:annotatedSource [ rdf:type owl:Restriction ;
                        owl:onProperty <d> ;
                        owl:allValuesFrom xsd:boolean ;
                        rdfs:subClassOf [ rdf:type owl:Restriction ;
                                          owl:onProperty <o> ;
                                          owl:someValuesFrom owl:Thing
                                        ]
                      ] ;
  owl:annotatedProperty rdfs:subClassOf ;
  owl:annotatedTarget [ rdf:type owl:Restriction ;
                        owl:onProperty <o> ;
                        owl:someValuesFrom owl:Thing
                      ] ;
  <X> "x" ;
  <Y> "y"
] .

seems to be incorrect. Instead, there should be the following:

_:b0    a                   owl:Restriction ;
        owl:onProperty      <o> ;
        owl:someValuesFrom  owl:Thing .

[ a                      owl:Axiom ;
  <X>                    "x" ;
  <Y>                    "y" ;
  owl:annotatedProperty  rdfs:subClassOf ;
  owl:annotatedSource    [ a                  owl:Restriction ;
                           rdfs:subClassOf    _:b0 ;
                           owl:allValuesFrom  xsd:boolean ;
                           owl:onProperty     <d>
                         ] ;
  owl:annotatedTarget    _:b0
] .
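
The difference between the two snippets can be checked mechanically at the RDF level. The sketch below is plain Java with hypothetical names (AnnotationTargetCheck, graph, and the labels _:ax, _:b0, _:b1, _:b2). It only tests whether the triple named by owl:annotatedSource, owl:annotatedProperty and owl:annotatedTarget is itself present in the graph; it says nothing about how OWL-API or any other parser treats the result.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AnnotationTargetCheck {

    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    /** Returns the object of the first (subject, predicate, ?) triple, or null if there is none. */
    static String objectOf(Set<List<String>> g, String s, String p) {
        for (List<String> t : g) {
            if (t.get(0).equals(s) && t.get(1).equals(p)) {
                return t.get(2);
            }
        }
        return null;
    }

    /** True if the triple described by the reification node is itself present in the graph. */
    static boolean annotatedTripleExists(Set<List<String>> g, String axiomNode) {
        String s = objectOf(g, axiomNode, "owl:annotatedSource");
        String p = objectOf(g, axiomNode, "owl:annotatedProperty");
        String o = objectOf(g, axiomNode, "owl:annotatedTarget");
        return s != null && p != null && o != null && g.contains(triple(s, p, o));
    }

    /** Tiny graph: one subClassOf triple plus a reification node pointing at the given target. */
    static Set<List<String>> graph(String target) {
        Set<List<String>> g = new HashSet<>();
        g.add(triple("_:b1", "rdfs:subClassOf", "_:b0"));
        g.add(triple("_:ax", "owl:annotatedSource", "_:b1"));
        g.add(triple("_:ax", "owl:annotatedProperty", "rdfs:subClassOf"));
        g.add(triple("_:ax", "owl:annotatedTarget", target));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(annotatedTripleExists(graph("_:b0"), "_:ax")); // shared node
        System.out.println(annotatedTripleExists(graph("_:b2"), "_:ax")); // copied node
    }
}
```

With the shared node the check prints true; with a copied node (_:b2 standing in for the duplicated restriction) it prints false, which is the disconnect described above.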

ignazio1977 commented 5 years ago

Intuitively, this solution makes sense; however, when parsing, troubles come up.

(This is correct in RDF; it's the OWL mapping that causes problems)

The main problem is that here there's a set of triples with _:b0 as a subject, and it must be used in the construction of more than one axiom. But in the mapping definition, this is allowed for anonymous individuals, not for class or property expressions. For these expressions, each triple is supposed to be used once and then consumed. So, the result of the parsing is an error, because the second use of the expression identifier finds no triples to match.
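
The "used once and then consumed" rule can be pictured with a small sketch. This is plain Java and not the actual OWL-API parser; ConsumeOnceSketch and its methods are hypothetical names, and the real triple consumer is far more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConsumeOnceSketch {

    // Pending triples indexed by subject; a triple is a [subject, predicate, object] list.
    private final Map<String, List<List<String>>> bySubject = new HashMap<>();

    void add(String s, String p, String o) {
        bySubject.computeIfAbsent(s, k -> new ArrayList<>()).add(List.of(s, p, o));
    }

    /** Returns the triples describing the node and removes them, so each triple is used only once. */
    List<List<String>> consume(String node) {
        List<List<String>> found = bySubject.remove(node);
        return found == null ? List.of() : found;
    }

    public static void main(String[] args) {
        ConsumeOnceSketch parser = new ConsumeOnceSketch();
        parser.add("_:b0", "rdf:type", "owl:Restriction");
        parser.add("_:b0", "owl:onProperty", "<o>");
        parser.add("_:b0", "owl:someValuesFrom", "owl:Thing");

        // First use of _:b0 (say, as the object of rdfs:subClassOf) finds its triples.
        System.out.println(parser.consume("_:b0").size());
        // Second use (say, as owl:annotatedTarget) finds nothing left to match.
        System.out.println(parser.consume("_:b0").size());
    }
}
```

Under this assumption a shared _:b0 can only be turned into an expression once, which is why the second reference comes up empty.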

Details on the structure sharing limitations here: https://www.w3.org/TR/2004/NOTE-owl-parsing-20040121/#subsec-structure-sharing

There might be another solution here - I suspect another bug with parsing of annotated axioms. Unfortunately, this area of the code base has proven really complex to get right. I have not run the numbers, but I'm sure these are the classes with the largest number of recorded bugs in the whole library.

ignazio1977 commented 5 years ago

Running your code, I think this is related to another issue raised recently #875

If I change the annotation property IRI strings to be absolute IRIs, the test succeeds. So, it appears that the parsing of relative IRIs is not working properly, and I suspect this would fail the same way as undeclared annotation properties.

ignazio1977 commented 5 years ago

The error still occurs by leaving the annotation properties with full IRIs and using

OWLClassExpression c1 = df.getOWLDataAllValuesFrom(df.getOWLDataProperty("d"), df.getBooleanOWLDatatype());

I'll check that avenue first.

ignazio1977 commented 5 years ago

When a data property declared as <d> is parsed, the xml:base attribute is used to turn the relative IRI into an absolute IRI - this is expected and as dictated by the specs. See https://www.w3.org/TR/rdf11-concepts/#section-IRIs

The test fails because the axiom that is expected to be found has been created with the relative IRI, but the parsed ontology contains the absolute IRI.
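
The resolution step can be reproduced outside OWL-API with the JDK alone. In the sketch below, java.net.URI is used as a stand-in for IRI resolution, and the base value is an assumption chosen so that the result matches the absolute IRI used in the test method; RelativeIriSketch and resolve are hypothetical names.

```java
import java.net.URI;

final class RelativeIriSketch {

    /** Resolves a possibly relative reference against a base, as a parser would do with xml:base. */
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.w3.org/2002/07/owl"; // assumed xml:base, for illustration

        System.out.println(URI.create("d").isAbsolute());   // "d" has no scheme, so it is relative
        System.out.println(resolve(base, "d"));              // relative reference becomes absolute
        System.out.println(resolve(base, "urn:test:o"));     // already absolute, left unchanged
    }
}
```

The entity created in code with "d" and the entity read back with the resolved IRI are therefore different entities, so an axiom built from the former is not found in the parsed ontology.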

Test method below (for convenience this uses methods from the TestBase class)

@Test
public void testRelativeIRIs() throws OWLOntologyCreationException, OWLOntologyStorageException {

    OWLClassExpression c1 =
        df.getOWLDataAllValuesFrom(df.getOWLDataProperty("d"), df.getBooleanOWLDatatype());
    OWLClassExpression c2 =
        df.getOWLObjectSomeValuesFrom(df.getOWLObjectProperty("urn:test:o"), df.getOWLThing());

    OWLAxiom a = df.getOWLSubClassOfAxiom(c1, c2, Arrays.asList(
        df.getOWLAnnotation(df.getOWLAnnotationProperty("urn:test:X"), df.getOWLLiteral("x")),
        df.getOWLAnnotation(df.getOWLAnnotationProperty("urn:test:Y"), df.getOWLLiteral("y"))));

    OWLOntology o = m.createOntology();
    o.add(a);

    OWLClassExpression c3 = df.getOWLDataAllValuesFrom(
        df.getOWLDataProperty("http://www.w3.org/2002/07/d"), df.getBooleanOWLDatatype());
    OWLClassExpression c4 =
        df.getOWLObjectSomeValuesFrom(df.getOWLObjectProperty("urn:test:o"), df.getOWLThing());

    OWLAxiom a2 = df.getOWLSubClassOfAxiom(c3, c4, Arrays.asList(
        df.getOWLAnnotation(df.getOWLAnnotationProperty("urn:test:X"), df.getOWLLiteral("x")),
        df.getOWLAnnotation(df.getOWLAnnotationProperty("urn:test:Y"), df.getOWLLiteral("y"))));

    OWLDocumentFormat format = new RDFXMLDocumentFormat();
    OWLOntology o2 = roundTrip(o, format);

    // This is the axiom with absolute IRIs, test passes
    assertTrue(o2.containsAxiom(a2));
    // This is the axiom with relative IRIs, and fails
    assertTrue(o2.containsAxiom(a));
}

sszuev commented 5 years ago

Intuitively, this solution makes sense; however, when parsing, troubles come up.

This is the specification, not a solution. There is no other way to annotate the statement than to preserve a reference to that statement. If there is no reference (as a reified statement) in the owl:Axiom b-node to the annotated statement, how can these two things be bound together? Don't you see? This is RDF sense, not OWL or OWL-API magic. Or what do you mean?

Running your code, I think this is related to another issue

It is not. It has a different nature. Here we have chronic problems with reading RDF, which are architectural in nature. OWL-API does not use a single standard approach, as ONT-API does; there is no true graph underlying it. There are a lot of other known problems with reading RDF, such as custom []-lists from spin-ontologies. This problem just surfaced in a conversation. Also, please check other syntaxes with the same code, not only RDF/XML; I guess you will find more similar but different incorrect things, or, sometimes, correct ones.

PS. From your ref (https://www.w3.org/TR/owl-parsing/#subsec-structure-sharing):

In this case, we are not allowed to "re-use" the blank node,

They did not mean annotations. And this reference seems not to be about OWL 2. It is outdated.

And, although, it is offtopic, but that last proposition is rather suspicious. Don't know why they prohibit sharing, but it increases the size of RDF, and, therefore, complexity of computation. On the internet there are a lot of big ontologies that can be compressed at times (but usually due to another thing related to annotation assertions produced by Protege).

ignazio1977 commented 5 years ago

And, although, it is offtopic, but that last proposition is rather suspicious. Don't know why they prohibit sharing, but it increases the size of RDF, and, therefore, complexity of computation.

Look, let me describe exactly the problems I've found with the specs in the area of annotations and triple based syntaxes. It's off topic, but it doesn't matter. Comments are cheap.

The specification on how to annotate axioms is a total shambles. It is literally terrible. If the authors of the specification read this and feel this is a personal attack, I apologise. It is not personal.

The specification requires reification, and nested annotations require nested reification. Trying to read as humans, or to parse as parsers, the shambles, is a time sink. I've wasted weeks fixing bugs in this area to try to reconcile specs, usage in the wild and expectations of developers. Yes, wasted.

It is wasteful too, as you correctly observe. Looking at the size of the text required and the number of triples, one wonders if the waste was on purpose. One has to wonder.

"All right, so what's your solution?"

A reserved property and a collection of annotations, no different from how it's done for a number of other constructs.

Yes it's naive and I haven't worked out all the use cases. It is also simple to read and to parse, and it would soothe my headache. /end rant

ignazio1977 commented 5 years ago

There is no other way to annotate the statement than to preserve a reference to that statement. If there is no reference (as a reified statement) in the owl:Axiom b-node to the annotated statement, how can these two things be bound together? Don't you see?

Sorry, no, I don't see. As shown in the other comments, the annotation here is not lost in RDF/XML, and the axiom you expect to see in the roundtrip is not there because it's built with the wrong kind of IRI.

IRIs must be parsed as absolute. OWLAPI allows creating them as relative, which one could argue is a mistake, but one I cannot correct without consequences. There's an open issue for it - I haven't decided whether to fix that in version 6 or not. Personally, I'm inclined to yes.

I don't know if Turtle is supposed to allow relative IRIs or not; if yes, then there's a bug in parsing there. If not, then the bug is in rendering: the serializer should either make the IRI absolute or fail. But the structure sharing issue is unrelated.

sszuev commented 5 years ago

The test fails because the axiom that is expected to be found has been created with the relative IRI, but the parsed ontology contains the absolute IRI.

The test was written only to catch the generic problem. The RDF produced by OWL-API is incorrect in a general sense. Well, it is not surprising that in other circumstances it works OK. It is magic: when wrong RDF is given but the OWL view is correct, that should not happen. OWL must be RDF. Other tools, such as ONT-API, do not understand such magic.

The test fails because the axiom that is expected to be found has been created with the relative IRI, but the parsed ontology contains the absolute IRI.

The axiom cannot be found by any URI. Try to print out all axioms - no annotations for SubClassOf at all, but that SubClassOf is present.

Yes, the specification for annotations is ugly, cumbersome, etc. But it is correct in RDF sense.

ignazio1977 commented 5 years ago

OWL-API does not use a single standard approach, as ONT-API does; there is no true graph underlying it.

Yes. It's an OWL api. RDF is only one of the languages that can be used to write out ontologies. Did I claim otherwise?

ignazio1977 commented 5 years ago

it is correct in RDF sense.

Do we agree that the IRIs in RDF must be absolute?

sszuev commented 5 years ago

Yes. It's an OWL api. RDF is only one of the languages that can be used to write out ontologies. Did I claim otherwise?

OWL is RDF. RDF is not only an alternative language, it is a data structure for the same data. There should not be one different data for RDF and one different data for OWL. All these things must be about the same. Well, "OWL is RDF" is not always true for OWL-API parsers and readers, but these are bugs (and there are many such bugs, not only with annotations: OWL-API has no problems with the document data if the data is produced by OWL-API itself; consider any RDFS ontologies, and other bugs in the issues). But for ONT-API it is always true due to its architecture principles. In this case OWL-API produces incorrect data, which cannot be handled by other tools, including ONT-API. Just because the owl:Axiom and the statement are not bound to each other.

Do we agree that the IRIs in RDF must be absolute?

I agree, that testcase which I wrote to catch the problem works well if URIs are absolute.

But this is just my omission: I was trying to demonstrate the generic problem, but chose the wrong scenario. It's my mistake. Sorry. So this issue is about two bugs: the generic problem exists and needs to be fixed. Also, the test case in its original form should be handled somehow (this is a minor issue).

Or, you claim that RDF is not incorrect, and any other tools must handle such RDF with annotations in the same way as OWL-API, right? In other words, they must check for equality different b-nodes. Equality exactly in OWL-API sense.

sszuev commented 5 years ago

Do we agree that the IRIs in RDF must be absolute?

For RDF/XML there are restrictions from the XML specification, and the same for URLs. If I am not mistaken, in the general case the scheme part is not required, and single-letter IRIs are okay.

UPDATE: Well, I see it now (from the link given by you - https://www.w3.org/TR/rdf11-concepts/#section-IRIs): "IRIs in the RDF abstract syntax MUST be absolute, and MAY contain a fragment identifier." So I was wrong. Sorry about that. But it does not cancel most of what I said.

I admit, the provided test case demonstrates that annotations are lost, but it contains wrong IRIs. All axioms are restored, but without annotations. I don't know what can be done about it now. Jena throws an exception in such a case. By the way, for ManchesterSyntaxDocumentFormat it also does not work, even with absolute IRIs, but it seems to be a known issue?

But, the main question, for which I wrote this mistaken test case, remains: is it correct for owl:Axioms to have a copy of b-node instead of the same b-node?

Sorry, no, I don't see.

So, you don't agree, just because no losing annotations in that testcase when IRIs are fixed, and OWL-API as usual understands correctly the document created by OWL-API itself. And the rule that the same SPO must be as reified statement is not correct in OWL-API understanding of it, although there is no explicit specification about this strange behavior, and this definitely contradicts RDF understanding.

This is quite important: in my belief Protege and OWL-API produce wrong RDF in this use case (people are using such constructions, see https://stackoverflow.com/questions/57519885/how-to-load-axioms-inferred-from-protege-and-filter-them-by-using-owl-api). And one more belated thought (or an argument): there would be no point in such reification if SP(O*) were allowed instead of a fixed SPO. On the other hand, if such behavior is permissible, it deals a blow to ONT-API (including the Protege-like editor, which we have, and other dependent projects) and to any other tools which are built on top of the principle that OWL is RDF (well, in ONT-API I can create one more transformer to fix such cases, but this is additional computation without a good reason).

So, if anybody may help with understanding of this case or say something about this, that would be really good.

ignazio1977 commented 5 years ago

Yes. It's an OWL api. RDF is only one of the languages that can be used to write out ontologies. Did I claim otherwise?

OWL is RDF. RDF is not only an alternative language, it is a data structure for the same data. There should not be one different data for RDF and one different data for OWL. All these things must be about the same.

Nope. Sorry, that's incorrect. Example: OWL written in functional syntax has no need of RDF. Other example: there's a mapping between OWL and RDF. If one is the same as the other, why is there a mapping?

But this is a digression.

ignazio1977 commented 5 years ago

Or, you claim that RDF is not incorrect, and any other tools must handle such RDF with annotations in the same way as OWL-API, right? In other words, they must check for equality different b-nodes. Equality exactly in OWL-API sense.

I didn't claim that. I said that, in the example you presented, the problem is not with structure sharing. I say that because, changing the IRIs in the output from relative to absolute, I get back all the annotations. Hence, regardless of correctness of the serializers, the parsers, or in general the soundness of RDF and OWL mapping in OWLAPI, the problem you showed points at a problem parsing relative IRIs.

I'm not denying there are bugs, I'm saying the one you've shown here is not the one you're describing.

As for what other RDF tools must do, who am I to tell them? I'm barely coping with the OWL to RDF mapping, which is where OWLAPI stops and other standards pick up.

ignazio1977 commented 5 years ago

But, the main question, for which I wrote this mistaken test case, remains: is it correct for owl:Axioms to have a copy of b-node instead of the same b-node?

Sorry, no, I don't see.

So, you don't agree, just because no losing annotations in that testcase when IRIs are fixed, and OWL-API as usual understands correctly the document created by OWL-API itself.

No, "I don't see" means I have no evidence.

As for the axioms having copies of the data held in a node rather than references, the opinion is not mine. If the parsing rule is that each triple is matched only once in the parsing rules, then the shared structure leads to incomplete axioms; if we remove that rule, and therefore allow one triple to be matched by multiple rules, I don't know what the consequences are. I can only think of one, and that is that parsers will require more memory because they won't be able to discard triples once those have been matched to an axiom. I assume those who designed the mapping had their reasons for imposing the single match rule.

I'm more than happy to look at an RDF fragment that shows the problem - I think it would be better than using an example created with OWLAPI code, since that suffers from the exact problem you're pointing out.

ignazio1977 commented 5 years ago

And the rule that the same SPO must be as reified statement is not correct in OWL-API understanding of it, although there is no explicit specification about this strange behavior, and this definitely contradicts RDF understanding.

Sorry, I don't understand this sentence. SPO are reified to add annotations to axioms, that's not OWLAPI choice, it's the OWL to RDF mapping choice. Is this what you're referring to?

sszuev commented 5 years ago

Nope. Sorry, that's incorrect. Example: OWL written in functional syntax has no need of RDF. Other example: there's a mapping between OWL and RDF. If one is the same as the other, why is there a mapping?

Have a converse example: ONT-API, where OWL is RDF. In ONT-API this operation is not mapping, but rather reading. Similar, but different.

I'm barely coping with the OWL to RDF mapping

I am not. There are no problems with RDF in ONT-API, just because there OWL is RDF. By the way, Load-List-Save performance is also better for ONT-API. If OWL is not RDF, how is that possible? And MS, FS, OWL/XML are also supported, if you have not noticed. How is it possible to support Functional Syntax in an RDF-centric API, when the data is actually RDF?

I am afraid you will always have problems with the OWL to RDF mapping (and RDF to OWL), just because of the basic architecture principle. This results in contentious, cumbersome code with lots of repetitions and hacks. But it seems it is too late to change anything here drastically, and there is already ONT-API. These are all rhetorical questions.

And there is only one thing here for which I started the bug: OWL-API and Protege must NOT produce wrong data. And I see that the data IS actually wrong. As for me, it is not very important if OWL-API loads incorrect data, but when it creates incorrect data, that is a biggie. This is a really serious problem.

But this is a digression. I do not want to convince you that OWL is RDF, as I said, for OWL-API it is not true, it is enough that for ONT-API it is true, including all benefits (triple stores, SPARQL, etc) which this approach brings.

Sorry, I don't understand this sentence.

Well, that's why there are chronic problems with annotations.

SPO are reified to add annotations to axioms, that's not OWLAPI choice, it's the OWL to RDF mapping choice. Is this what you're referring to?

The OWLAPI choice is to use SP(O), where SPO is required. Reification is the obvious and direct way to connect a statement (SPO) with other bulk of information (annotation). You have to preserve the reference, there are no other ways. When you do not follow that rule, but just suppress it with similar, when instead of SPO, there is a SP(O), then you violate data connectivity and specification. In the main example, there is a similar but different object, O* instead of O, a b-node with different id, but with the same structure. The root axiom SPO is actually not connected to the annotation in any way. For an abstract RDF engine, which does the work correctly according to the specification, that means problems. But the OWL-API works fine with such broken RDFs that are produced by the OWL-API itself.

if we remove that rule, and therefore allow one triple to be matched by multiple rules, I don't know what the consequences are

I hardly understand what you mean, but RDF does not allow repetition; an SPO is unique within the document. And if you have the triple _:b0 rdfs:subClassOf _:b1, then you have it exactly once. Then there are no problems when you refer to this triple from annotations. But if you replace SPO with SP(O), where O is a different b-node, then you have problems like these: #877, #879. And apparently the sequence of bugs that rest on the same violation seems to be unlimited.

I'm more than happy to look at an RDF fragment that shows the problem - I think it would be better than using an example created with OWLAPI code, since that suffers from the exact problem you're pointing out.

Try the same code with ONT-API. It will produce correct RDF, while OWL-API will not. And, by the way, feel free to use ONT-API as a test suite: I regularly find bugs in OWL-API after its release, so it would not hurt. There are many tests that were partially taken from the OWL-API contract, for which I must say thank you.

I am sorry, since I am not a native speaker, sometimes it is difficult for me to maintain long discussions - sometimes I can't get the message across, and sometimes I don’t understand what you mean.

So, the same question one more time, yes or no: do you agree that OWL-API produces wrong RDF, replacing a b-node with a copy of it?

ignazio1977 commented 5 years ago

I do not want to convince you that OWL is RDF, as I said, for OWL-API it is not true, it is enough that for ONT-API it is true, including all benefits (triple stores, SPARQL, etc) which this approach brings.

Great to hear, but the specs of OWL and those of RDF say different things and allow expression of different things. If, in your library, you wish to work differently, and that fits your use cases, that is great. If it's faster, that's even better news. But its relevance here is restricted to those parts of RDF and those parts of OWL that overlap.

sszuev commented 5 years ago

So, the same question one more time, yes or no: do you agree that OWL-API produces wrong RDF, replacing a b-node with a copy of it?

ignazio1977 commented 5 years ago

Sorry, I don't understand this sentence.

Well, that's why there are chronic problems with annotations.

Sweeping statements only go so far.

SPO are reified to add annotations to axioms, that's not OWLAPI choice, it's the OWL to RDF mapping choice. Is this what you're referring to?

The OWLAPI choice is to use SP(O), where SPO is required. Reification is the obvious and direct way to connect a statement (SPO) with other bulk of information (annotation). You have to preserve the reference, there are no other ways. When you do not follow that rule, but just suppress it with similar, when instead of SPO, there is a SP(O), then you violate data connectivity and specification.

I have to ask for clarity.

Is this the same complaint reported in the other issue, where Turtle output has two [ ] where one shared _:genid should be used? If so, yes, bug, as already acknowledged.

All the rest is just entertaining to read.

Incidentally, what is SP(O*)?

ignazio1977 commented 5 years ago

So, the same question one more time, yes or no: do you agree that OWL-API produces wrong RDF, replacing a b-node with a copy of it?

I thought we discussed this one to death.

sszuev commented 5 years ago

This is wrong:


[ rdf:type owl:Axiom ;
  owl:annotatedSource [ rdf:type owl:Restriction ;
                        owl:onProperty <d> ;
                        owl:allValuesFrom xsd:boolean ;
                        rdfs:subClassOf [ rdf:type owl:Restriction ;
                                          owl:onProperty <o> ;
                                          owl:someValuesFrom owl:Thing
                                        ]
                      ] ;
  owl:annotatedProperty rdfs:subClassOf ;
  owl:annotatedTarget [ rdf:type owl:Restriction ;
                        owl:onProperty <o> ;
                        owl:someValuesFrom owl:Thing
                      ] ;
  <X> "x" ;
  <Y> "y"
] .

This is correct:

_:b0    a                   owl:Restriction ;
        owl:onProperty      <o> ;
        owl:someValuesFrom  owl:Thing .
[ a                      owl:Axiom ;
  <X>                    "x" ;
  <Y>                    "y" ;
  owl:annotatedProperty  rdfs:subClassOf ;
  owl:annotatedSource    [ a                  owl:Restriction ;
                           rdfs:subClassOf    _:b0 ;
                           owl:allValuesFrom  xsd:boolean ;
                           owl:onProperty     <d>
                         ] ;
  owl:annotatedTarget    _:b0
] .

ignazio1977 commented 5 years ago

I am sorry, since I am not a native speaker, sometimes it is difficult for me to maintain long discussions - sometimes I can't get the message across, and sometimes I don’t understand what you mean.

Neither am I, that's why I'd like examples instead of abstract, sweeping statements.

ignazio1977 commented 5 years ago

Your wrong example is not wrong according to the specs.

What negative consequences does that produce?

sszuev commented 5 years ago

Your wrong example is not wrong according to the specs.

A reference please.

Not this one - https://www.w3.org/TR/2004/NOTE-owl-parsing-20040121/#subsec-structure-sharing - it is outdated, and says nothing about annotations. And they are talking about cases which do not break RDF.

What negative consequences does that produce?

Protege produces wrong data that is spread over the Internet. I am afraid it is okay for you. For me it is not.

For an abstract RDF engine, which does the work correctly according to the specification, that means problems.

OWL-API uses SP(O) instead of SPO, where O is a different b-node, not the one required. If you do not preserve the reference, the spec (https://www.w3.org/TR/owl2-mapping-to-rdf/#Translation_of_Annotations) makes no sense. It makes sense only with the same SPO, only in this case.

Again: this whole turbid spec (https://www.w3.org/TR/owl2-mapping-to-rdf/#Translation_of_Annotations) makes sense if and only if you preserve the reference to the statement for which you add annotations. Otherwise, it would not make any sense.

Is this the same complaint reported in the other issue, where Turtle output has two [ ] where one shared _:genid should be used? If so, yes, bug, as already acknowledged.

All these things are about the same: a misunderstanding of the spec.

Incidentally, what is SP(O*)?

SPO is subject/predicate/object, a triple. SP(O*) is subject/predicate/{another object, which pretends to be O, but is not, since it is a different b-node with a different blank node identifier}.

If, in your library, you wish to work differently, and that fits your use cases, that is great. If it's faster, that's even better news.

I wish to work according to the specification. And I do. Also, I wish OWL-API worked according to the specification.

But its relevance here is restricted to those parts of RDF and those parts of OWL that overlap.

The RDF data produced by any correct OWL API should not have overlaps with OWL, because it must be OWL.

ignazio1977 commented 5 years ago

What negative consequences does that produce?

Protege produces wrong data that is spread over the Internet. I am afraid it is okay for you. For me it is not.

No. Let's not make assumptions about what is ok for other people.

You are saying starting with one expression and ending with two equivalent expressions is wrong.

But, in the OWL specs, class and property expressions cannot be referred from outside the axiom that defines them. In the OWL axioms, they must be repeated.

In RDF, these expressions are turned into blank nodes, and blank nodes have locally scoped identifiers, which means in the RDF file I /could/ reuse the expression. And I could write a parser to understand this and correctly parse from one physical occurrence of an expression to the two (or more) expressions in the actual OWL ontology. But, if I write it out the same way, I have to make assumptions as to which identical expressions come from a reused input and which are just axioms happening to use an equivalent expression. The OWL axioms don't have that information.

Ok, so let us suppose we want to reuse all identical expressions - efficient, right?

But. If I create my ontology with two identical expressions, for whatever reason, and they get synthesised into one - I can no longer modify these independently. I change one, the other will change as well. OWL does not support the same flexibility that RDF provides, in this instance. The workaround for reusing expressions is to name them - create an equivalent class/property axiom between a fresh IRI and the expression you wish to reuse, and use the IRI in place of the expression. The IRI fulfils the same role as the blank node identifier.

ignazio1977 commented 5 years ago

Also, I wish OWL-API works according to the specification.

We all do. But that's a lot more complex in practice than in theory, not to mention the fact that some aspects of the specs were not clear to the authors of the specs themselves. If they were surprised by the side effects of some of the requirements in there, it is unrealistic to expect perfect comprehension and perfect implementation. Or even that the specs are perfectly self consistent and work perfectly with each other (given that there are multiple specs involved in the conversation).

sszuev commented 5 years ago

That behavior even contradicts the spec https://www.w3.org/TR/owl-parsing/, which is outdated and does not concern annotations.

Having the axiom SubClassOf(Annotation(rdfs:comment "comm") ObjectComplementOf(<urn:test:A>) ObjectComplementOf(<urn:test:B>)), you produce a repetition for the expression B, but not for A (there is an SP(O*), whereas it should be either SPO or (S*)P(O*)). This issue https://github.com/owlcs/owlapi/issues/879 demonstrates the complete misunderstanding of any logic which underlies all these specs.

All these discussions are really annoying and upsetting. I have no more patience or motivation to continue. A demonstrative refusal to understand is not something I can deal with. Since such misunderstandings are chronic in OWL-API, I am not going to continue with this particular issue or with the whole repo.

Close the issue.

ignazio1977 commented 5 years ago

On Mon, 2 Sep 2019 at 17:29, ssz notifications@github.com wrote:

So, the same question one more time, yes or no: do you agree that OWL-API produces wrong RDF by replacing a b-node with a copy of it?

I thought we discussed this one to death.

  • Anonymous individuals: they should be stamped with an id and reused.

  • Class, property expressions: OWLAPI does not reuse those expressions, and you have not shown a use case where that's wrong, nor what are the negative consequences from doing so. OWLAPI is following the specs on this.

This is neither yes nor no. I need either a yes or a no (though I am not sure yet what can be done in case of 'no'...).

You cannot get a binary answer to a tripartite question.

You know better than me that blank nodes in RDF have been used by the OWL specs in three ways:

As connectors for lists, neither of us has a concern.

As individuals, we are in agreement.

As class expressions, we disagree. You believe you got the specs right, and so do I. Bring in someone else to break the tie.

As for your theoretical considerations about the possibility of reusing b-nodes: this bug is not about that, and never was. This is called substitution of concepts. I raised the bug only to demonstrate the wrong understanding of the owl:Axiom/owl:Annotation specification. I used the wrong example and I apologized for that. But you used it as an excuse to change the subject (to IRI validity).

I showed you that the bug in your example is not related to structure share but to relative IRI.

Then we concluded that was not the right example.

Then I made no other assumptions about this example.

After that, using my off-topic remark about a suspicious element in the outdated specification (https://www.w3.org/TR/owl-parsing/#subsec-structure-sharing ), you are again trying to change the subject.

That that is outdated is your opinion. OWL 1 and OWL 2 do not differ in this aspect of the specs.

Therefore, I have to emphasize this one more time: the bug is not about reusing b-nodes in general, it is about the particular case where the operator TANN is involved. No one disputes that different but equal expressions must be expressed as different b-nodes within an axiom and between axioms and ontologies. But this does not include the axiom's annotations, otherwise there would be no connection between an axiom and its annotations in RDF. Only correct reification helps here. But OWL-API does NOT produce valid RDF in this case. That is the bug. (And please note: in ONT-API there is no problem with the statement about reusing b-nodes. This is just for completeness of understanding.)

And that we have agreed on in bug #877, which has annotations and individuals, and disagreed on in #879, where we disagree on what axioms are represented there.

What negative consequences does that produce?

If you don't admit that this is a specification violation, there cannot be any negative consequences within the OWL-API/Protege environment alone, simply because the storers write wrong RDF, believing that it is correct, and the loaders read wrong RDF, again believing that it is correct.

Bring in someone else. We're stuck disagreeing and we're both sure we're reading the specs correctly.

There are negative consequences for other OWL tools and APIs which follow the same specification (including ONT-API, which just leaves wrong things unseen, so here the axiom appears naked). Though that is not enough for you.

Please re-read this comment #874 (comment) https://github.com/owlcs/owlapi/issues/874#issuecomment-527101466 one more time.


ignazio1977 commented 5 years ago

This issue #879 demonstrates the complete misunderstanding of any logic which underlies all these specs.

Let's try another approach.

Is the example in #879 a valid OWL ontology or not? A validation tool, not in the OWLAPI, would be a good way to decide.

If it's not a valid ontology (I believe that's what you said towards the end, because it has a missing reference?), then the bug is that the OWLAPI should throw an exception in strict mode parsing.

If it's a valid ontology then hopefully the validator also has a representation of what the axioms described are, and we can progress from there.

sszuev commented 5 years ago

If it's not a valid ontology (I believe that's what you said towards the end, because it has a missing reference?), then the bug is that the OWLAPI should throw an exception in strict mode parsing.

If it's a valid ontology then hopefully the validator also has a representation of what the axioms described are, and we can progress from there.

This ontology is invalid intentionally; it was just an attempt to show you where the wrong approach can lead, using an obvious case. But this case turned out to be less obvious for you than I expected. What you said there in the issue is wrong in my opinion, since the axiom is defined only by the main triple, not by other triples, including the parts of owl:Axiom.
But it does not matter, since you correctly pointed out that this is not the case for OWL-API (but it is the case for ONT-API, which can read anything, taking only the valid parts).

It was my mistake to create that issue. It just made things complicated. That ontology can be tested with a validation tool, but only after fixing it: replace P2 => P1 (or vice versa). And the tool must not be from the OWL-API or Protege family. Do you know such a tool, one which reflects modern OWL2 with its annotations? I don't.

If you don't mind, I am going to publish new arguments related to the official spec https://www.w3.org/TR/owl2-mapping-to-rdf. Unfortunately, I couldn't clearly articulate these arguments before, although I still do not admit that I am wrong. I have already deleted the answer containing them, because it was incorrect. The spec is really badly worded, although that does not mean it is incorrect.

sszuev commented 5 years ago

So, now let's try to play with the single official specification https://www.w3.org/TR/owl2-mapping-to-rdf/


1. Introduction

The spec defines only the operators T(E) and TANN(ann, y), where ann is Annotation( AP av ), E is an object and y is a b-node or an object. The spec also says: The definition of the operator T uses the operator TANN in order to translate annotations. The operations described in section 2.1 Translation of Axioms without Annotations and section 2.3 Translation of Axioms with Annotations have no names of their own. The operator TANN is defined in Table 2, section 2.2 Translation of Annotations, but it covers annotations of annotations, producing a b-node with the root triple _:x rdf:type owl:Annotation. The operator that creates top-level annotations with the root triple _:x rdf:type owl:Axiom is also described in the specification, but it does not have a proper name either. For the sake of demonstration, I am going to introduce a new name for this thing: ANN. Note: I am not writing my own spec, I am just trying to explain my view using the new abbreviation. In the general case this addition may not be correct, but for demonstration purposes I think it is OK.

Also, let's consider the axiom SubClassOf as an operator with two operands. It is described in Table 1 in the section 2.1 Translation of Axioms without Annotations like this:

SubClassOf( CE1 CE2 ) = T(CE1) rdfs:subClassOf T(CE2) .

Let's also consider an overloaded operator SubClassOf with two operands and an unlimited number of annotations. SubClassOf( CE1 CE2 annotations { n > 1} ) is defined in section 2.3.1 Axioms that Generate a Main Triple as follows:

s p xlt .
_:x rdf:type owl:Axiom .
_:x owl:annotatedSource s .
_:x owl:annotatedProperty p .
_:x owl:annotatedTarget xlt .
TANN(annotation1, _:x)
...
TANN(annotationm, _:x) 

For simplicity, let's focus on the case where there is only one top-level annotation. So that operator is SubClassOf( CE1, CE2, ann):

T(CE1) rdfs:subClassOf T(CE2) .
ANN(CE1, CE2, rdfs:subClassOf, ann) .

This is a new operator ANN, which is similar to TANN but accepts the two operands, an annotation, and a constant that defines the predicate. It produces the root triple _:x rdf:type owl:Axiom, and all its other triples are similar to the triples for the operator TANN in the example above.
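To make the shape of ANN concrete, here is a minimal Java sketch using only the standard library. It is an illustration under stated assumptions, not part of the spec, OWL-API or ONT-API: the class `AnnSketch`, the methods `ann` and `annotatedSubClassOf`, and the representation of a triple as a three-element list are all hypothetical, and only a single plain annotation is handled. Whether ANN should receive the same nodes as the main triple is exactly the point under dispute in this thread; the sketch simply passes them in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AnnSketch {

    /** A triple is modelled as a three-element list: subject, predicate, object. */
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Hypothetical ANN: reifies the triple (s, p, o) under the axiom node x and attaches one annotation. */
    static List<List<String>> ann(String x, String s, String p, String o,
                                  String annProperty, String annValue) {
        List<List<String>> out = new ArrayList<>();
        out.add(triple(x, "rdf:type", "owl:Axiom"));
        out.add(triple(x, "owl:annotatedSource", s));
        out.add(triple(x, "owl:annotatedProperty", p));
        out.add(triple(x, "owl:annotatedTarget", o));
        out.add(triple(x, annProperty, annValue)); // TANN(ann, _:x) for a plain annotation
        return out;
    }

    /** SubClassOf(CE1, CE2, ann), assuming T(CE1) and T(CE2) already produced the nodes s and o. */
    static List<List<String>> annotatedSubClassOf(String x, String s, String o,
                                                  String annProperty, String annValue) {
        List<List<String>> out = new ArrayList<>();
        out.add(triple(s, "rdfs:subClassOf", o)); // the main triple
        out.addAll(ann(x, s, "rdfs:subClassOf", o, annProperty, annValue));
        return out;
    }

    public static void main(String[] args) {
        for (List<String> t : annotatedSubClassOf("_:x", "_:c1", "_:c2", "rdfs:comment", "\"comm\"")) {
            System.out.println(String.join(" ", t) + " .");
        }
    }
}
```

Splitting the main triple from `ann` is a design choice of this sketch; it keeps the question "which nodes does ANN see?" visible as ordinary method arguments.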


2. An ontology without Annotations.

Now let's consider the example from the root of the issue, where the first operand is DataAllValuesFrom and the second is ObjectSomeValuesFrom: SubClassOf( DataAllValuesFrom( <d> xsd:boolean ) ObjectSomeValuesFrom( <o> owl:Thing ) ).

In TURTLE it would look like this:

<d>     a       owl:DatatypeProperty .
<o>     a       owl:ObjectProperty .
[ rdf:type owl:Restriction ;
  owl:onProperty <d> ;
  owl:allValuesFrom xsd:boolean ;
  rdfs:subClassOf [ rdf:type owl:Restriction ;
                    owl:onProperty <o> ;
                    owl:someValuesFrom owl:Thing
                  ]
] .

In ONT-API such turtle can be generated by the following code:

OntGraphModel m = OntModelFactory.createModel().setNsPrefixes(OntModelFactory.STANDARD);
m.createDataAllValuesFrom(m.createDataProperty("d"), m.getDatatype(XSD.xboolean))
    .addSuperClass(m.createObjectSomeValuesFrom(m.createObjectProperty("o"), 
    m.getOWLThing()));
m.write(System.out, "ttl");

Or the same ontology in NTRIPLES syntax:

<d> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#DatatypeProperty> .
<o> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
_:c1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> _:c2 .
_:c1 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:c1 <http://www.w3.org/2002/07/owl#onProperty> <d> .
_:c1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:c2 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:c2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:c2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .

So, the main triple (s p xlt) here is _:c1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> _:c2 . SubClassOf is an axiom that generates a main triple (see section 2.3.1 Axioms that Generate a Main Triple). The s p xlt is _:c1 rdfs:subClassOf _:c2, where s (the subject, DataAllValuesFrom( <d> xsd:boolean )) is _:c1, p (the predicate) is rdfs:subClassOf, and xlt (the object: a blank node, an IRI, or a literal; here ObjectSomeValuesFrom( <o> owl:Thing )) is _:c2.


3. Behavior of the operator T.

The spec says: In the mapping, each generated blank node (i.e., each blank node that does not correspond to an anonymous individual) is fresh in each application of a mapping rule. I believe this is only about the operator T. This statement roughly matches what is said in the OWL 1 spec, Parsing OWL, Structure Sharing: In practice, this means that blank nodes (i.e. those with no name) which are produced during the transformation and represent arbitrary expressions in the abstract syntax form should not be "re-used". In the ordinary case this is not a problem for either ONT-API or OWL-API; they behave similarly. The following code produces identical RDF for both OWL-API (default impl) and ONT-API (in the latter case the OWL-API interfaces are used):

OWLOntologyManager m = OntManagers.createONT();
OWLDataFactory df = m.getOWLDataFactory();
OWLClassExpression ce = df.getOWLObjectComplementOf(df.getOWLThing());
OWLOntology o = m.createOntology();            
o.add(df.getOWLSubClassOfAxiom(ce, ce));
o.saveOntology(OntFormat.TURTLE.createOwlFormat(), System.out);

For the two equal class expressions ObjectComplementOf( owl:Thing ), which are operands of SubClassOf( CE1, CE2 ), there would be two different b-nodes. So no one disputes the fact that there is no object sharing in OWL.
But, in my opinion, this must not be applied to the relationship between the axiom and its annotations; this is the case of the operator ANN, see the next paragraph.
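The "fresh blank node in each application" rule can be illustrated without any OWL library. The following is a toy Java sketch, not OWL-API or ONT-API code: the class `FreshNodeSketch`, its methods and the `_:bN` labels are hypothetical, and only ObjectComplementOf over a named class is translated. It shows that SubClassOf(CE, CE) with one and the same expression object still yields two distinct b-nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FreshNodeSketch {

    private int counter = 0;
    /** Triples collected so far, each a three-element list: subject, predicate, object. */
    final List<List<String>> triples = new ArrayList<>();

    /** Toy T for ObjectComplementOf(cls): allocates a fresh blank node on every call. */
    String translateComplementOf(String cls) {
        String node = "_:b" + (++counter);
        triples.add(Arrays.asList(node, "rdf:type", "owl:Class"));
        triples.add(Arrays.asList(node, "owl:complementOf", cls));
        return node;
    }

    /** SubClassOf(CE, CE) where both operands are ObjectComplementOf(cls); returns the main triple. */
    List<String> subClassOfSameComplement(String cls) {
        String s = translateComplementOf(cls); // first application of T
        String o = translateComplementOf(cls); // second application of T, new node
        List<String> main = Arrays.asList(s, "rdfs:subClassOf", o);
        triples.add(main);
        return main;
    }

    public static void main(String[] args) {
        FreshNodeSketch sketch = new FreshNodeSketch();
        sketch.subClassOfSameComplement("owl:Thing");
        for (List<String> t : sketch.triples) {
            System.out.println(String.join(" ", t) + " .");
        }
    }
}
```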


4.1 An annotated axiom that generates a main triple. Reification with SPO.

Now let's add an annotation Annotation( rdfs:comment "comm" ) to the SubClassOf( DataAllValuesFrom( <d> xsd:boolean ) ObjectSomeValuesFrom( <o> owl:Thing ) ) (see paragraph 2 above) in the manner that I think is the only correct one. The operator ANN(CE1, CE2, rdfs:subClassOf, ann) would generate the following ttl:

s p xlt .
_:x rdf:type owl:Axiom .
_:x owl:annotatedSource s .
_:x owl:annotatedProperty p .
_:x owl:annotatedTarget xlt .
TANN(ann, _:x)

Here, the s p xlt is the result of applying the operator SubClassOf(CE1, CE2). From the Table 2, section 2.2 Translation of Annotations, the operator TANN(Annotation( AP av ), _:x) for Annotation( rdfs:comment "comm"^^xsd:string ) will give the triple _:x rdfs:comment "comm"^^xsd:string ., so we have:

s p xlt .
_:x rdf:type owl:Axiom .
_:x owl:annotatedSource s .
_:x owl:annotatedProperty p .
_:x owl:annotatedTarget xlt .
_:x rdfs:comment "comm"^^xsd:string .

The s p xlt here is _:c1 rdfs:subClassOf _:c2 . (see paragraph 2); and finally I get the following annotated axiom:

_:c1 rdfs:subClassOf _:c2 .
_:x rdfs:comment "comm"^^xsd:string .
_:x rdf:type owl:Axiom .
_:x owl:annotatedSource _:c1 .
_:x owl:annotatedProperty rdfs:subClassOf .
_:x owl:annotatedTarget _:c2 .

The full ontology (without ontology id) would be in NTRIPLES syntax:

<o> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
<d> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#DatatypeProperty> .
_:x <http://www.w3.org/2000/01/rdf-schema#comment> "comm" .
_:x <http://www.w3.org/2002/07/owl#annotatedTarget> _:c2 .
_:x <http://www.w3.org/2002/07/owl#annotatedProperty> <http://www.w3.org/2000/01/rdf-schema#subClassOf> .
_:x <http://www.w3.org/2002/07/owl#annotatedSource> _:c1 .
_:x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Axiom> .
_:c2 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:c2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:c2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:c1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> _:c2 .
_:c1 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:c1 <http://www.w3.org/2002/07/owl#onProperty> <d> .
_:c1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .

Or the same in TURTLE:

@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

<o>     a       owl:ObjectProperty .

[ a                      owl:Axiom ;
  rdfs:comment           "comm" ;
  owl:annotatedProperty  rdfs:subClassOf ;
  owl:annotatedSource    [ a                  owl:Restriction ;
                           rdfs:subClassOf    _:c2 ;
                           owl:allValuesFrom  xsd:boolean ;
                           owl:onProperty     <d>
                         ] ;
  owl:annotatedTarget    _:c2
] .

<d>     a       owl:DatatypeProperty .

_:c2    a                   owl:Restriction ;
        owl:onProperty      <o> ;
        owl:someValuesFrom  owl:Thing .

The _:c1 rdfs:subClassOf _:c2 (SPO) is present in the graph and has its reification:

_:x owl:annotatedTarget _:c2 .
_:x owl:annotatedProperty rdfs:subClassOf .
_:x owl:annotatedSource _:c1 .

Note, that this ontology can be generated by the following code:

OntGraphModel m = OntModelFactory.createModel().setNsPrefixes(OntModelFactory.STANDARD);
m.createDataAllValuesFrom(m.createDataProperty("d"), m.getDatatype(XSD.xboolean))
        .addSubClassOfStatement(m.createObjectSomeValuesFrom(m.createObjectProperty("o"), m.getOWLThing()))
        .annotate(m.getRDFSComment(), "comm");
m.write(System.out, "ttl");
System.out.println(".......");
m.write(System.out, "nt");

4.2 An annotated axiom that generates a main triple. Reification with (S*)P(O*).

Well, the spec also says: In the mapping, each generated blank node (i.e., each blank node that does not correspond to an anonymous individual) is fresh in each application of a mapping rule. This is about the operator T, not about the operators TANN, SubClassOf(CE1, CE2) or SubClassOf(CE1, CE2, ann). But the SubClassOf operators consist of T and TANN, so they must also implicitly generate a blank node for each operand. As a reminder, the operator SubClassOf(CE1, CE2, ann) originally (see p.1) looks like the following:

T(CE1) rdfs:subClassOf T(CE2) .
ANN(CE1, CE2, rdfs:subClassOf, ann) .

But it is still not fully clear what should actually happen with its part, the operator ANN(CE1, CE2, rdfs:subClassOf, ann). Let's take your assumption that class expressions must not be shared even within a whole axiom, including its whole tree of annotations. This is definitely true for the operator SubClassOf(CE1, CE2), wrong for the operator TANN, and the subject of controversy for the operator ANN. But for the sake of the exercise, let's assume that the rule must also apply to the ANN operands. So SubClassOf(CE1, CE2, ann) is now defined as follows:

T(CE1) rdfs:subClassOf T(CE2) .
ANN(T(CE1), T(CE2), rdfs:subClassOf, ann) .

or

SubClassOf(CE1, CE2) .
ANN(T(CE1), T(CE2), rdfs:subClassOf, ann) .

The SubClassOf(CE1, CE2) will give the following NTRIPLES (see p.2):

<d> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#DatatypeProperty> .
<o> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
_:c2 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:c2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:c2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:c1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> _:c2 .
_:c1 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:c1 <http://www.w3.org/2002/07/owl#onProperty> <d> .
_:c1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .

Here, the b-node _:c1 corresponds to the class expression DataAllValuesFrom( <d> xsd:boolean ), and the b-node _:c2 corresponds to the ObjectSomeValuesFrom( <o> owl:Thing ).

Then we do T in ANN for the subject (the first operand):

_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:b1 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:b1 <http://www.w3.org/2002/07/owl#onProperty> <d> .

And for the object (the second operand):

_:b2 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:b2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .

And ANN itself:

_:x <http://www.w3.org/2000/01/rdf-schema#comment> "comm" .
_:x <http://www.w3.org/2002/07/owl#annotatedTarget> _:b2 .
_:x <http://www.w3.org/2002/07/owl#annotatedProperty> <http://www.w3.org/2000/01/rdf-schema#subClassOf> .
_:x <http://www.w3.org/2002/07/owl#annotatedSource> _:b1 .
_:x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Axiom> .

Notice that we now have fresh b-nodes for CE1 and CE2 (_:b1 and _:b2 respectively), and the annotation (_:x) refers to these two nodes. Inside the annotation graph structure there are _:b1 and _:b2, not _:c1 and _:c2, because we first apply the operator T to the input class expressions and only then pass the result into the operator ANN.

The full ontology would be as the following (just concatenate all parts above) (NTRIPLES):

<o> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
<d> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#DatatypeProperty> .
_:c2 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:c2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:c2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:c1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> _:c2 .
_:c1 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:c1 <http://www.w3.org/2002/07/owl#onProperty> <d> .
_:c1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:x <http://www.w3.org/2000/01/rdf-schema#comment> "comm" .
_:x <http://www.w3.org/2002/07/owl#annotatedTarget> _:b2 .
_:x <http://www.w3.org/2002/07/owl#annotatedProperty> <http://www.w3.org/2000/01/rdf-schema#subClassOf> .
_:x <http://www.w3.org/2002/07/owl#annotatedSource> _:b1 .
_:x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Axiom> .
_:b2 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:b2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:b1 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:b1 <http://www.w3.org/2002/07/owl#onProperty> <d> .

Or the same in TURTLE:

@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

<o>     a       owl:ObjectProperty .

[ a                  owl:Restriction ;
  rdfs:subClassOf    [ a                   owl:Restriction ;
                       owl:onProperty      <o> ;
                       owl:someValuesFrom  owl:Thing
                     ] ;
  owl:allValuesFrom  xsd:boolean ;
  owl:onProperty     <d>
] .

<d>     a       owl:DatatypeProperty .

[ a                      owl:Axiom ;
  rdfs:comment           "comm" ;
  owl:annotatedProperty  rdfs:subClassOf ;
  owl:annotatedSource    [ a                  owl:Restriction ;
                           owl:allValuesFrom  xsd:boolean ;
                           owl:onProperty     <d>
                         ] ;
  owl:annotatedTarget    [ a                   owl:Restriction ;
                           owl:onProperty      <o> ;
                           owl:someValuesFrom  owl:Thing
                         ]
] .

As you can see, the _:c1 rdfs:subClassOf _:c2 (SPO) is present in the graph, but has no reification. Instead, there is a reification for the triple _:b1 rdfs:subClassOf _:b2 ((S*)P(O*)), which does not exist in the graph:

_:x owl:annotatedTarget _:b2 .
_:x owl:annotatedProperty rdfs:subClassOf .
_:x owl:annotatedSource _:b1 .

Since the triple _:b1 rdfs:subClassOf _:b2 does not exist, in my opinion this exercise (paragraph 4.2) demonstrates invalid behavior.


4.3 An annotated axiom that generates a main triple by OWL-API. Reification with SP(O*).

But what does OWL-API do? The code to generate:

OWLOntologyManager man = OntManagers.createOWL();
OWLDataFactory df = man.getOWLDataFactory();
OWLAxiom a = df.getOWLSubClassOfAxiom(df.getOWLDataSomeValuesFrom(df.getOWLDataProperty("d"),
        df.getBooleanOWLDatatype()),
        df.getOWLObjectAllValuesFrom(df.getOWLObjectProperty("o"), df.getOWLThing()),
        Collections.singletonList(df.getRDFSComment("comm")));
OWLOntology o = man.createOntology();
o.add(a);
o.saveOntology(new TurtleDocumentFormat(), System.out);

NTRIPLES (note: the Ontology ID is excluded):

<o> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#ObjectProperty> .
<d> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#DatatypeProperty> .
_:u <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:u <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:u <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:x <http://www.w3.org/2000/01/rdf-schema#comment> "comm" .
_:x <http://www.w3.org/2002/07/owl#annotatedTarget> _:u .
_:x <http://www.w3.org/2002/07/owl#annotatedProperty> <http://www.w3.org/2000/01/rdf-schema#subClassOf> .
_:x <http://www.w3.org/2002/07/owl#annotatedSource> _:c1 .
_:x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Axiom> .
_:c1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> _:c2 .
_:c1 <http://www.w3.org/2002/07/owl#someValuesFrom> <http://www.w3.org/2001/XMLSchema#boolean> .
_:c1 <http://www.w3.org/2002/07/owl#onProperty> <d> .
_:c1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .
_:c2 <http://www.w3.org/2002/07/owl#allValuesFrom> <http://www.w3.org/2002/07/owl#Thing> .
_:c2 <http://www.w3.org/2002/07/owl#onProperty> <o> .
_:c2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Restriction> .

The original TURTLE:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://www.w3.org/2002/07/owl#> .

[ rdf:type owl:Ontology
 ] .

#################################################################
#    Object Properties
#################################################################

###  o
<o> rdf:type owl:ObjectProperty .

#################################################################
#    Data properties
#################################################################

###  d
<d> rdf:type owl:DatatypeProperty .

#################################################################
#    General axioms
#################################################################

[ rdf:type owl:Axiom ;
  owl:annotatedSource [ rdf:type owl:Restriction ;
                        owl:onProperty <d> ;
                        owl:someValuesFrom xsd:boolean ;
                        rdfs:subClassOf [ rdf:type owl:Restriction ;
                                          owl:onProperty <o> ;
                                          owl:allValuesFrom owl:Thing
                                        ]
                      ] ;
  owl:annotatedProperty rdfs:subClassOf ;
  owl:annotatedTarget [ rdf:type owl:Restriction ;
                        owl:onProperty <o> ;
                        owl:allValuesFrom owl:Thing
                      ] ;
  rdfs:comment "comm"
] .

###  Generated by the OWL API (version 5.1.11) https://github.com/owlcs/owlapi/

Reformatted TURTLE (note: the Ontology ID is excluded):

@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

<o>     a       owl:ObjectProperty .

<d>     a       owl:DatatypeProperty .

[ a                      owl:Axiom ;
  rdfs:comment           "comm" ;
  owl:annotatedProperty  rdfs:subClassOf ;
  owl:annotatedSource    [ a                   owl:Restriction ;
                           rdfs:subClassOf     [ a                  owl:Restriction ;
                                                 owl:allValuesFrom  owl:Thing ;
                                                 owl:onProperty     <o>
                                               ] ;
                           owl:onProperty      <d> ;
                           owl:someValuesFrom  xsd:boolean
                         ] ;
  owl:annotatedTarget    [ a                  owl:Restriction ;
                           owl:allValuesFrom  owl:Thing ;
                           owl:onProperty     <o>
                         ]
] .

As you can see, the triple _:c1 rdfs:subClassOf _:c2 (SPO) is present in the graph but has no reification, just like in the previous paragraph (p.4.2). Instead, there is a reification for the triple _:c1 rdfs:subClassOf _:u (SP(O*)), which does not exist in the graph:

_:x owl:annotatedTarget _:u .
_:x owl:annotatedProperty rdfs:subClassOf .
_:x owl:annotatedSource _:c1 .

Since the triple _:c1 rdfs:subClassOf _:u does not exist anywhere in the graph, this exercise (paragraph 4.3) also demonstrates wrong behavior. So, in my opinion, OWL-API does not produce correct RDF when an annotated axiom consists of anonymous expressions.
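The criterion used in paragraphs 4.1 to 4.3 (every owl:Axiom node must reify a triple that is actually present in the graph) can be checked mechanically. Below is a hedged Java sketch with the standard library only. The class `ReificationCheck` and its methods are hypothetical, triples are plain three-element lists of strings, and blank node labels are compared literally, which is only meaningful within a single document. It encodes the reading argued in this comment, which is the disputed point of the thread, not an established validation rule of OWL-API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReificationCheck {

    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** Returns the axiom nodes whose reified triple (source, property, target) is absent from the graph. */
    static List<String> danglingAxiomNodes(Collection<List<String>> graph) {
        Set<List<String>> all = new HashSet<>(graph);
        Map<String, String> source = new HashMap<>();
        Map<String, String> property = new HashMap<>();
        Map<String, String> target = new HashMap<>();
        for (List<String> t : graph) {
            switch (t.get(1)) {
                case "owl:annotatedSource": source.put(t.get(0), t.get(2)); break;
                case "owl:annotatedProperty": property.put(t.get(0), t.get(2)); break;
                case "owl:annotatedTarget": target.put(t.get(0), t.get(2)); break;
                default: break;
            }
        }
        List<String> dangling = new ArrayList<>();
        for (String x : new TreeSet<>(source.keySet())) {
            String p = property.get(x);
            String o = target.get(x);
            if (p == null || o == null || !all.contains(triple(source.get(x), p, o))) {
                dangling.add(x);
            }
        }
        return dangling;
    }

    /** Reification triples for axiom node _:x; the main triple is always _:c1 rdfs:subClassOf _:c2. */
    static List<List<String>> sample(String annotatedTarget) {
        return Arrays.asList(
                triple("_:c1", "rdfs:subClassOf", "_:c2"),
                triple("_:x", "rdf:type", "owl:Axiom"),
                triple("_:x", "owl:annotatedSource", "_:c1"),
                triple("_:x", "owl:annotatedProperty", "rdfs:subClassOf"),
                triple("_:x", "owl:annotatedTarget", annotatedTarget),
                triple("_:x", "rdfs:comment", "\"comm\""));
    }

    public static void main(String[] args) {
        System.out.println("4.1 shape, dangling: " + danglingAxiomNodes(sample("_:c2")));
        System.out.println("4.3 shape, dangling: " + danglingAxiomNodes(sample("_:u")));
    }
}
```

The `sample` helper keeps only the triples relevant to the check; the restriction triples for the class expressions are omitted because they do not affect whether the reified triple exists.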

sszuev commented 5 years ago

Side notes:

ignazio1977 commented 5 years ago

On Tue, 3 Sep 2019, 12:29 ssz, notifications@github.com wrote:

Side notes:

  • Testing using an external tool is a great idea; unfortunately I don't know which tool can be chosen. Obviously, it must not be based on OWL-API, but at the same time it must support OWL2 with its annotations. Do you know such a tool?

I don't know off the top of my head if there's a good tool to do this. The best approximation I can think of is the set of test cases for parsers on the W3C pages, but I don't know if any of those matches the pattern you used.

It's tricky to start with known bad data. The only correct response is to throw an exception; to interpret it and select only the understandable parts means that a complaint can always be raised about the assumptions made.

Owlapi does that (in default mode) and we do get complaints. The reason to do it is to help developers who want to work with existing bad data that they cannot change, or with non OWL data (skos is an example).

It is a compromise, no doubt. But it's the best we could do.

ignazio1977 commented 5 years ago

On Tue, 3 Sep 2019, 12:29 ssz, notifications@github.com wrote:

Side notes:

If I understand your project correctly, you could choose to leave the owlapi parsers out, or only use the Rio adapters. That way, you don't have to deal with the choices we made in parsing and saving.

It's also possible to build a replacement bundle with different parsers and propose it to protégé for adoption. As long as the format classes are the same, protégé could use it with no changes on their side.

sszuev commented 5 years ago

If I understand your project correctly, you could choose to leave the owlapi parsers out, or only use the Rio adapters. That way, you don't have to deal with the choices we made in parsing and saving. It's also possible to build a replacement bundle with different parsers and propose it to protégé for adoption. As long as the format classes are the same, protégé could use it with no changes on their side.

I said it in a general sense. Indeed, this particular bug has no impact on either ONT-API (the priority is Jena-riot; OWL-API parsers are optional) or our protege-like system that uses it. But there is an indirect impact: such an axiom with an annotation created by the official protege will be seen naked, without that wrong annotation. It can be easily handled by internal transformers, although I am not sure I will do it. Anyway, the official Protege must not produce wrong RDF. Then, I have to ask again: do you agree with this comment? If not, 1) why and 2) how about stackoverflow?

ignazio1977 commented 5 years ago

On Tue, 3 Sep 2019 at 15:58, ssz notifications@github.com wrote:

If I understand your project correctly, you could choose to leave the owlapi parsers out, or only use the Rio adapters. That way, you don't have to deal with the choices we made in parsing and saving. It's also possible to build a replacement bundle with different parsers and propose it to protégé for adoption. As long as the format classes are the same, protégé could use it with no changes on their side.

I said it in a general sense. Indeed, this particular bug has no impact on either ONT-API (the priority is Jena-riot; OWL-API parsers are optional) or our protege-like system that uses it. But there is an indirect impact: such an axiom with an annotation created by the official protege will be seen naked, without that wrong annotation. It can be easily handled by internal transformers, although I will do it. Anyway, the official Protege must not produce wrong RDF.

What I'm saying is that there are ways to get to your objective even if we do not reach an agreement on what needs to change.

ignazio1977 commented 5 years ago

https://github.com/owlcs/owlapi/issues/874#issuecomment-527399645 is pretty long, so I can't answer inline; it would become too confusing.

Hoping to synthesize - given that the issue is the RDF output, let me start with functional syntax as it's shorter and closest to the abstract syntax for OWL, so it's easier to write down.

As you remark, in the mapping rules each rule creates fresh _:x values and each annotation reuses the _:x value that is passed as input to it. This covers blank nodes that do not exist as such in the OWL abstract model, because in that model the only things with a blank node id are anonymous individuals. So, anonymous individual ids in the rules behave like constants, and the same individual referred to in two axioms maintains the same id. The same is true, trivially, when the individual needs to appear twice in the output because of reification for annotations.

So, some simple axioms, where a is an anonymous individual - I'm sticking annotations in without much attention to the actual syntax, and CEx are class expressions where the same number means the two class expressions are .equals() to each other (no matter if they're two copies of the same data or two uses of the same object):

ClassAssertion( CE1 a )
ClassAssertion( CE2 a )
ClassAssertion( CE3 a, annotation(comment "test for anon class in object position"))
ObjectPropertyAssertion( OP a1 a2, annotation(comment "test for anon individual in object position"))
ClassAssertion( CE1 b )
ClassAssertion( CE2 b )

a will become _:x1 and be referred to as such everywhere in the RDF. b will become _:x2 and again stay the same; same for a1 and a2. Correct?

Note: #877 amends the behaviour here, because a1 and a2 were translated correctly to _:genid4 and _:genid5 in that example, but it is necessary to write out the nodes twice when the axiom is annotated. That was missing from the code - when the node must be written more than once, the id must be written out, but the code was checking only individuals appearing in more than one axiom, not individuals appearing in annotated axioms. So, my example doesn't work in 5.1.11; it will work in the next release.

Now, about the class expressions.

Let's say CE1 is the same object used twice. From the specs, I understand that CE1 will be translated twice, to two fresh identifiers. However, in this example they only appear in non annotated axioms, and therefore they are outputted in the final Turtle (or RDF/XML, actually) once. Since the node is only outputted once, it does not need its id to be explicit. So, Turtle looks something like

_:x1 rdf:type [ whatever the triples required for CE1]
_:x2 rdf:type [ whatever the triples required for CE1]

_:x1 rdf:type [ whatever the triples required for CE2]
_:x2 rdf:type [ whatever the triples required for CE2]

Are we in agreement about this? To be explicit, this means that in the OWL abstract axioms there are four expressions, even if there were only two objects in the OWLOntology. There is no semantic difference between having two or four objects in the parsed model, except for the memory used.

CE3 here behaves differently because there are annotations. It needs a fresh id, and given the structure of the axiom the node needs to appear twice, so the id must be made explicit. So there will be an _:x3 value appearing in there, with the triples necessary for the expression attached, and used in the base triple (if one exists for the axiom, as is the case here) and in the reification for the annotation. Correct?
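To make the id rules so far concrete: anonymous individuals keep one id across axioms, while every translation of a class expression gets a fresh id, even when the expressions are equal. This is only a toy sketch of that bookkeeping; `BlankNodeIds` and its methods are hypothetical names and do not reflect how the OWL-API renderer is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the id rules discussed above; not OWL-API code.
class BlankNodeIds {
    private final Map<String, String> individualIds = new HashMap<>();
    private int counter = 0;

    private String fresh() {
        counter++;
        return "_:x" + counter;
    }

    // Anonymous individuals behave like constants: same individual, same id.
    String idForIndividual(String individual) {
        return individualIds.computeIfAbsent(individual, k -> fresh());
    }

    // Each translation of a class expression gets a fresh id,
    // even if an equal expression was translated before.
    String idForClassExpression(String expression) {
        return fresh();
    }

    public static void main(String[] args) {
        BlankNodeIds ids = new BlankNodeIds();
        String a1 = ids.idForIndividual("a");
        String a2 = ids.idForIndividual("a");
        String ce1 = ids.idForClassExpression("CE3");
        String ce2 = ids.idForClassExpression("CE3");
        System.out.println("individual a: " + a1 + " and " + a2);
        System.out.println("expression CE3: " + ce1 + " and " + ce2);
    }
}
```

Within a single annotated axiom, the id returned for CE3 would be used both in the base triple and in the reification, which is why it has to be written out explicitly.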

To make things interesting, let's add three axioms to the ontology:

ClassAssertion( CE3 a)
ClassAssertion( C4 a)
ClassAssertion( C4 a, annotation(comment "test for named class in object position"))

C4 is a named class, so there is no anonymous node for it.

The semantic value of the annotated and non annotated axioms is the same, but they do not equal each other, as per both the mapping and the abstract syntax.

But, tricky behaviour here: translating ClassAssertion( CE3 a) produces a new id for CE3 because it's a new application of the rules, and so generates a new set of triples: _:x rdf:type [more triples as above]

Translating the two axioms involving C4 follows the same behaviour as before, but with one exception: the first axiom is translated to just one triple _:x1 rdf:type C4, the second is translated to its base triple _:x1 rdf:type C4 plus the reification triples for the annotation. Thankfully, there's an IRI here so there's nothing to worry about node ids.

So what we have is two distinct axioms creating the same triple _:x1 rdf:type C4. We can't have the same triple twice - when parsing at the RDF level, we'll lose one.
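The collapse can be seen with nothing more than a set of triples. This is a hedged sketch in plain Java, not OWL-API code: the class `DuplicateBaseTriple`, the string-list representation of triples, and the reification node name `_:r1` are assumptions for illustration. The reification pattern follows the OWL 2 mapping to RDF (an owl:Axiom node with owl:annotatedSource, owl:annotatedProperty and owl:annotatedTarget, plus the annotation itself).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; a triple is a [subject, predicate, object] list.
class DuplicateBaseTriple {
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // ClassAssertion( C a ) without annotations: just the base triple.
    static List<List<String>> plain(String individual, String cls) {
        List<List<String>> out = new ArrayList<>();
        out.add(triple(individual, "rdf:type", cls));
        return out;
    }

    // The same axiom with one rdfs:comment annotation: base triple plus reification.
    static List<List<String>> annotated(String individual, String cls, String node, String comment) {
        List<List<String>> out = plain(individual, cls);
        out.add(triple(node, "rdf:type", "owl:Axiom"));
        out.add(triple(node, "owl:annotatedSource", individual));
        out.add(triple(node, "owl:annotatedProperty", "rdf:type"));
        out.add(triple(node, "owl:annotatedTarget", cls));
        out.add(triple(node, "rdfs:comment", comment));
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> emitted = new ArrayList<>();
        emitted.addAll(plain("_:x1", "C4"));
        emitted.addAll(annotated("_:x1", "C4", "_:r1", "\"test for named class in object position\""));
        // An RDF graph is a set, so the shared base triple is stored only once.
        Set<List<String>> graph = new LinkedHashSet<>(emitted);
        System.out.println(emitted.size() + " triples emitted, " + graph.size() + " distinct in the graph");
    }
}
```

Two axioms go in, but the graph holds a single `_:x1 rdf:type C4` triple, so a parser reading the graph back has only one base triple to assign.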

What this means is that we just found an ontology which cannot be written to RDF and parsed back in a form identical to the original ontology. As in section 3 of the document, the maximal possible subset of G must be matched. We only have two possibilities here: either the annotated axiom 'steals' the base triple and the unannotated axiom disappears completely (roundtripping failure) or the unannotated axiom steals the base triple and the annotated axiom cannot be parsed fully, because its pattern is missing the base triple (and so it should not be matched, the reified triples should be left unparsed, and the parsing should fail because one of the conditions for successful parsing is that all triples must be consumed).

So, when I said 'all OWL ontologies can be written to RDF', that statement needs to be qualified with 'but sometimes reading the ontology back from RDF won't give you the same ontology, annotation wise'. There's a similar scenario for EquivalentClasses and EquivalentProperties axioms with more than two members and annotations, but let's put that aside for the moment.

Is this correct?

I haven't written any code yet to check what OWLAPI does in these scenarios, because I want agreement on the expected results before adding any output to the picture - so I don't know yet whether I've described test cases for new bugs or test cases for correct code. Does this sound sensible?

ignazio1977 commented 5 years ago

874 (comment) is pretty long, so I can't answer inline; it would become too confusing.

Ironically, I've written another monster comment trying to synthesize the issues. Eh. It's fun, what am I going to do about it.

ignazio1977 commented 5 years ago

Fixed under #881