w3c / data-shapes

RDF Data Shapes WG repo

Proposal for removing the domain assertion for 'sh:declare' #131

Closed wouterbeek closed 3 years ago

wouterbeek commented 3 years ago

I'm proposing to remove the following triple from the SHACL vocabulary:

sh:declare rdfs:domain owl:Ontology.

While there is a good use case for asserting prefix declarations for OWL ontologies, there are also good use cases for asserting prefix declarations for things that are not OWL ontologies. Examples include (SPARQL) queries, (SPARQL) endpoints, and datasets that are not OWL ontologies.

The removal of this triple will also make the SHACL vocabulary easier to understand and apply in light of the following paragraph from the SHACL standard document, which explicitly allows sh:declare to be asserted for resources that are not OWL ontologies:

The recommended subject for values of sh:declare is the IRI of the named graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this is not required.

HolgerKnublauch commented 3 years ago

To summarize from https://github.com/w3c/data-shapes/issues/130, the main reasons for having sh:declare rdfs:domain owl:Ontology were

The main reason against this domain statement seems to be that it may cause an undesired inference of X rdf:type owl:Ontology if X is not (meant to be) an owl:Ontology.

I have no strong opinion, yet it seems the advantages outweigh the disadvantage. What specific problems does the inference of rdf:type owl:Ontology cause for anyone?

wouterbeek commented 3 years ago

What specific problems does the inference of rdf:type owl:Ontology cause to anyone?

One specific problem:

prefix sd: <http://www.w3.org/ns/sparql-service-description#>
prefix sh: <http://www.w3.org/ns/shacl#>

[ a sd:SPARQL11Query;
  sh:declare [ sh:prefix "..."; sh:namespace "..." ] ].

^ After reading the SHACL documentation I thought that this was OK. Later I discovered, to my surprise, that I was actually asserting that my queries are ontologies.
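The surprise described above follows from the RDFS entailment rule for rdfs:domain (rule rdfs2). The following is a minimal sketch in plain Python, assuming triples are modelled as tuples of prefixed names; the node labels `_:query` and `_:decl` and the helper `apply_domain_rule` are illustrative assumptions, not part of SHACL or of any RDF library.

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

def apply_domain_rule(triples):
    """Return the new rdf:type triples entailed by rdfs:domain (RDFS rule rdfs2)."""
    # Collect every (property, class) pair declared via rdfs:domain.
    domains = {(s, o) for (s, p, o) in triples if p == RDFS_DOMAIN}
    inferred = set()
    for (s, p, o) in triples:
        for (prop, cls) in domains:
            if p == prop:
                # The subject of any triple using the property gets the domain class.
                inferred.add((s, RDF_TYPE, cls))
    return inferred - set(triples)

# The triple from the SHACL vocabulary that this issue proposes to remove.
vocabulary = {("sh:declare", RDFS_DOMAIN, "owl:Ontology")}

# Hypothetical instance data: a query that carries a prefix declaration.
data = {
    ("_:query", RDF_TYPE, "sd:SPARQL11Query"),
    ("_:query", "sh:declare", "_:decl"),
}

inferred = apply_domain_rule(vocabulary | data)
print(inferred)
```

Under these assumptions the only new triple states that `_:query` is an owl:Ontology, even though the data never says so.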

The main reason against this domain statement seems to be that it may cause an undesired inference of X rdf:type owl:Ontology if X is not (meant to be) an owl:Ontology.

I disagree with this, but also believe this is a rehash of #130. I have responded to this sentence there, since that is a much broader discussion. I have done so to keep this present issue small in scope / only about sh:declare.

jaw111 commented 3 years ago

I suggest to use the weaker semantics of http://schema.org/domainIncludes in the definition of sh:declare:

sh:declare <http://schema.org/domainIncludes> owl:Ontology.

That should give the hint to editors that the sh:declare property MAY be used on a resource of type owl:Ontology without implying/entailing that any resource with sh:declare is of type owl:Ontology.

pmaria commented 3 years ago

In light of

These IRIs are often declared as an instance of owl:Ontology, but this is not required.

I agree that it would be best to remove the rdfs:domain statement and replace it with another hint, using a Property Shape or schema:domainIncludes.

white-gecko commented 3 years ago

I think a property shape would be better so that tools that already deal with SHACL don't need to add additional semantics. A suggestion: https://github.com/w3c/data-shapes/issues/130#issuecomment-760233786

jaw111 commented 3 years ago

That's harder than you might think. AFAIK there needs to be a NodeShape from which it hangs to do any validation. So you need something like:

ex:declareShape a sh:NodeShape ;
  sh:targetSubjectsOf sh:declare ;
  sh:property [
    sh:path rdf:type ;
    sh:hasValue owl:Ontology
  ] .

However, this is still not a hint, as it requires any resource that has the sh:declare property to be an owl:Ontology.
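To see why such a shape acts as a requirement rather than a hint, here is a rough sketch of the check it expresses, written in plain Python instead of being run through a SHACL engine. The function `check_declare_shape` and the `ex:` resources are hypothetical, and the sketch covers only the targetSubjectsOf and hasValue behaviour used by the shape above.

```python
RDF_TYPE = "rdf:type"

def check_declare_shape(triples):
    """Mimic the shape: every subject of sh:declare must have rdf:type owl:Ontology.

    Returns the sorted list of focus nodes that would violate the constraint.
    """
    # sh:targetSubjectsOf sh:declare selects these focus nodes.
    focus_nodes = {s for (s, p, o) in triples if p == "sh:declare"}
    # sh:path rdf:type with sh:hasValue owl:Ontology requires this explicit triple.
    typed = {s for (s, p, o) in triples if p == RDF_TYPE and o == "owl:Ontology"}
    return sorted(focus_nodes - typed)

# Hypothetical data: one ontology and one query, both with prefix declarations.
data = {
    ("ex:onto", RDF_TYPE, "owl:Ontology"),
    ("ex:onto", "sh:declare", "_:d1"),
    ("ex:query", RDF_TYPE, "sd:SPARQL11Query"),
    ("ex:query", "sh:declare", "_:d2"),
}

violations = check_declare_shape(data)
print(violations)  # ['ex:query']
```

The query is reported as a violation, so the shape forbids the usage instead of merely suggesting the typical one.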

Maybe there is another incantation that has the desired 'hint' effect.

TallTed commented 3 years ago

@wouterbeek -- It seems that you are asking that the SHACL ontology be revised because you misunderstood it.

Rather, it seems that SHACL has done its job if you processed your instance data with it, which you had intended to conform to the SHACL ontology, and found your misunderstanding was revealed as an issue in your data.

I would suggest that for the long term, you change your use of sh:declare.

For the immediate term, you might take a copy of the SHACL ontology and edit it for your own local use, changing your local instance of --

sh:declare rdfs:domain owl:Ontology .

-- to --

sh:declare schema:domainIncludes owl:Ontology .

-- which allows for an entity of any rdf:type to be the subject of an sh:declare triple, because in the Open World of RDF, anything unstated is unknown -- so while sh:declare schema:domainIncludes owl:Ontology tells us that some subjects of an sh:declare triple will be instances of the owl:Ontology class, it does not tell us anything else about other sh:declare subjects.
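The difference between the two predicates can be sketched in plain Python. This assumes a reasoner that applies only the RDFS domain rule (rdfs2); schema:domainIncludes has no entailment rule attached, so it is treated as documentation. The helper `entailed_types` and the resource `ex:endpoint` are hypothetical.

```python
RDF_TYPE = "rdf:type"

def entailed_types(vocabulary, data):
    """Apply only the rdfs:domain rule (rdfs2) to the data.

    Triples using schema:domainIncludes are ignored because no
    entailment rule is defined for that predicate.
    """
    domains = {(s, o) for (s, p, o) in vocabulary if p == "rdfs:domain"}
    return {(s, RDF_TYPE, cls)
            for (s, p, o) in data
            for (prop, cls) in domains
            if p == prop}

# Hypothetical instance data: an endpoint with a prefix declaration.
data = {("ex:endpoint", "sh:declare", "_:decl")}

strict = {("sh:declare", "rdfs:domain", "owl:Ontology")}
hint = {("sh:declare", "schema:domainIncludes", "owl:Ontology")}

with_domain = entailed_types(strict, data)
with_hint = entailed_types(hint, data)
print(with_domain)
print(with_hint)  # set()
```

With rdfs:domain the endpoint is inferred to be an owl:Ontology; with schema:domainIncludes nothing is inferred about it.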


It is perhaps worth noting that schema.org substantially revised their own ontology in recent times, replacing many instances of rdfs:range and rdfs:domain with schema:rangeIncludes and schema:domainIncludes, respectively, because that ontology had evolved over time to include multiple rdfs:range statements whose values were disjoint. This was problematic, because the defined semantics of rdfs:range and rdfs:domain are that when they have multiple values, all of those values must hold, not just any one of them.
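The conjunctive reading of multiple rdfs:range statements can be illustrated with a small Python sketch of RDFS rule rdfs3. The property `ex:author`, the classes `ex:Person` and `ex:Organization`, and the helper `apply_range_rule` are made up for illustration and are not taken from schema.org.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

def apply_range_rule(triples):
    """RDFS rule rdfs3: (p rdfs:range C) and (s p o) entail (o rdf:type C)."""
    ranges = [(s, o) for (s, p, o) in triples if p == RDFS_RANGE]
    return {(o, RDF_TYPE, cls)
            for (s, p, o) in triples
            for (prop, cls) in ranges
            if p == prop}

triples = {
    # Two range statements on the same (hypothetical) property.
    ("ex:author", RDFS_RANGE, "ex:Person"),
    ("ex:author", RDFS_RANGE, "ex:Organization"),
    # One use of the property.
    ("ex:book", "ex:author", "ex:acme"),
}

inferred = apply_range_rule(triples)
print(sorted(inferred))
```

Both types are inferred for `ex:acme` at once. If the two classes were additionally declared disjoint, an OWL reasoner would report an inconsistency, which is the kind of problem the "includes" predicates avoid.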

We might consider making changes along similar lines to the SHACL ontology, if there were a volume of requests with somewhat better justification than "we misunderstood it."

TallTed commented 3 years ago

One other thing.

The recommended subject for values of sh:declare is the IRI of the named graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this is not required.

The thing that is not required is explicit declaration that the IRIs are instances of owl:Ontology -- i.e., there is no requirement for :myontology rdf:type owl:Ontology -- because this rdf:type may be inferred based on these IRIs being the subjects for values of sh:declare.

That text does not "explicitly allow sh:declare to be asserted for resources that are not OWL ontologies". This is a misinterpretation of the text.

It might be sufficiently clarified by adding a couple words, as --

The recommended subject for values of sh:declare is the IRI of the named graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this rdf:type declaration is not required.

We have somewhat more leeway to add a clarifying note to the spec, to this effect, than we do to change the ontology. It's still a non-trivial change to make happen, but it may be worthwhile.

azaroth42 commented 3 years ago

As a W3C point of process, I believe that the ontology managed in /ns/ is not normative, and can be changed very easily. @iherman can you clarify?

The text in the specification would then not need changing, as not all subjects of sh:declare would be inferred to be owl:Ontologies ... just like the text says.

wouterbeek commented 3 years ago

That text does not "explicitly allow sh:declare to be asserted for resources that are not OWL ontologies". This is a misinterpretation of the text.

@TallTed Thanks for this clarification! I indeed misinterpreted the text :-/ This means that sh:declare is specified correctly in the SHACL vocabulary.

Authors who want to assert prefix declarations for things that are not OWL ontologies should not use sh:declare but roll their own or use tp:prefixDeclaration.

pfps commented 3 years ago

But it still remains that as far as the SHACL ontology is concerned the denotation of subject of triples with predicate sh:declare are OWL ontologies (unless SHACL is supposed to be violating the W3C Semantic Web standards).

TallTed commented 3 years ago

@pfps --

But it still remains that as far as the SHACL ontology is concerned the denotation of subject of triples with predicate sh:declare are OWL ontologies (unless SHACL is supposed to be violating the W3C Semantic Web standards).

Yes. Is that a problem from your perspective?

jaw111 commented 3 years ago

@wouterbeek a more generic mechanism to specify prefixes for things which are not ontologies (such as SPARQL endpoints, queries or datasets) seems useful and worthwhile. Perhaps it's an idea to propose an extension to the SPARQL service description vocabulary to subsume those terms from your vocabulary as part of the SPARQL 1.2 effort?

pfps commented 3 years ago

@TallTed It appears that some posters in these threads either do not believe this follows from the SHACL ontology or believe that it has significant counterexamples. I haven't looked into the second, but if it is true, the triple needs to be removed from the SHACL ontology.

wouterbeek commented 3 years ago

@jaw111 Good idea! Can you review it? It's over at https://github.com/w3c/sparql-12/issues/134.

(You may be aware of more prior work, or correct my English :-P)

iherman commented 3 years ago

As a W3C point of process, I believe that the ontology managed in /ns/ is not normative, and can be changed very easily. @iherman can you clarify?

I was not involved in the SHACL WG, and I do not know who maintains the specification itself. But, without knowing the details, indeed, the ontology file is not normative. The content of https://www.w3.org/ns/shacl-shacl is included in the specification, but there is a note saying that the /ns version may be updated and therefore the text in the spec could become out of sync.

Any such change would require some sort of a consensus of the community, obviously.

TallTed commented 3 years ago

@azaroth42

The text in the specification would then not need changing, as not all subjects of sh:declare would be inferred to be owl:Ontologies ... just like the text says.

But that's not what the text says.

The text is (emphasis added):

The recommended subject for values of sh:declare is the IRI of the named graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this is not required.

The second sentence is the focus.

That sentence says that the IRIs which are subjects of sh:declare need not be explicitly declared as instances of owl:Ontology.

The reason this need not be explicitly declared is that they are instances of owl:Ontology.

This may also be deduced through simple rdfs inference including the SHACL ontology.

Errors have been made, by at least one human who misunderstood that sentence, in the construction of their own instance data. That is unfortunate, but correctable, with no impact on anyone else who has used or included the SHACL ontology in their own efforts.

I am struggling to understand why changes to the SHACL ontology, which has been shown not to contain an error, continue to be discussed.

I believe this issue (#131) should be closed without other action, for the same reason as its precursor which went very far into the weeds (#130) already has:

The prose of the SHACL specification was simply misunderstood. Once that misunderstanding was addressed, the triple in the SHACL ontology which had previously been considered to be an error was shown to be in fact correct.

As I noted earlier, a couple of words could be added to that sentence of the spec to clarify the intended meaning, but (like the rdf:type declarations of which that sentence speaks) this is not and should not be required.

azaroth42 commented 3 years ago

Okay, then multiple people misunderstood that sentence, because I read it exactly the same way as @wouterbeek. An editorial erratum to make the domain of sh:declare more obvious would be appreciated :)

HolgerKnublauch commented 3 years ago

Having thought about this, I now also prefer to delete the rdfs:domain triple. I want (and always wanted) to make this part of the SHACL-SPARQL vocabulary as applicable as possible. If the rdfs:domain is seen as an obstacle to adoption, then by all means let's delete it. I thought its benefits outweighed the disadvantages, because I didn't see harm in the extra inference, but it seems to bother people. For those who want this triple, it is easy to add back; in RDF, however, it is very difficult to delete a triple from a graph that is not under one's own control.

TallTed commented 3 years ago

I would rather change the predicate in the triple from rdfs:domain to schema:domainIncludes than delete the triple entirely, but I won't block consensus.

HolgerKnublauch commented 3 years ago

This is fixed by this merge, as discussed and circulated. I will contact the W3C staff to see how this file can go live on its URL.

https://github.com/w3c/data-shapes/commit/6f9248970f4081c23e99d4d599b441d690f5403a