w3c / data-shapes

RDF Data Shapes WG repo

SHACL vocabulary is inconsistent #125

Open wouterbeek opened 3 years ago

wouterbeek commented 3 years ago

The SHACL vocabulary may be inconsistent, since it states that namespaces must be URIs:

sh:namespace rdfs:range xsd:anyURI.

but at the same time defines its own namespace with a string:

sh:
  sh:declare
    [ sh:namespace "http://www.w3.org/ns/shacl#" ].

The easiest fix here is probably to change the above literal to:

"http://www.w3.org/ns/shacl#"^^xsd:anyURI

HolgerKnublauch commented 3 years ago

Yes I agree this is a glitch. You have already outlined the fix. I don't think I am permitted to simply change that file because we are outside of a W3C Working Group. I guess once a SHACL 1.1 WG is created this would be an item to fix. The current Errata already links to this GitHub issues list, so it will be remembered until then.

https://www.w3.org/2017/shacl/errata

Meanwhile, the issue would only be reported as an inconsistency if someone defined a SHACL constraint on sh:namespace. rdfs:range by itself doesn't do any "harm".

This ticket will remain open.

ashleysommer commented 3 years ago

@HolgerKnublauch This issue just appeared to me in the form of a bug reported on PySHACL: https://github.com/RDFLib/pySHACL/issues/61

PySHACL implements the sh:declare feature with the rules defined in https://www.w3.org/TR/shacl/#sparql-prefixes

It states:

The values of sh:namespace are literals of datatype xsd:anyURI.

PySHACL has logic to throw an error when loading a SHACL constraint if that constraint has sh:prefixes that are invalid. Perhaps heavy-handedly, pySHACL considers a sh:namespace value that is not typed xsd:anyURI to be invalid.
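As a rough illustration of the strict rule described above, here is a minimal sketch in plain Python. It is not pySHACL's actual code: the `Literal` class, `InvalidPrefixDeclaration`, and `check_namespace_strict` are hypothetical names, and an RDF literal is modelled simply as a lexical form plus a datatype IRI.

```python
from dataclasses import dataclass

XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Toy stand-in for an RDF literal (hypothetical, not an rdflib class)."""
    lexical: str
    datatype: str = XSD_STRING  # a plain "..." literal is an xsd:string


class InvalidPrefixDeclaration(ValueError):
    """Raised when a sh:declare node has an unusable sh:namespace value."""


def check_namespace_strict(prefix: str, namespace: Literal) -> str:
    """Return the namespace IRI, or raise if it is not typed xsd:anyURI."""
    if namespace.datatype != XSD_ANYURI:
        raise InvalidPrefixDeclaration(
            f"sh:namespace for prefix {prefix!r} must be an xsd:anyURI "
            f"literal, got datatype <{namespace.datatype}>"
        )
    return namespace.lexical
```

Under this rule the declaration in shacl.ttl, whose sh:namespace is a plain string, is rejected, while the same value typed as xsd:anyURI is accepted.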

This bug has been around since 2017, but it hasn't been a problem because normally shacl.ttl isn't imported into the SHACL shapes graph.

It has come up in this case because:

1) Schema.org Shapes file owl:imports DASH shapes file
2) DASH Shapes file owl:imports shacl.ttl
3) shacl.ttl creates a sh:declare node, with sh:namespace being a string literal, not an xsd:anyURI
4) DASH Shapes file owl:imports TOSH Shapes file
5) TOSH Shapes file declares a SHACLFunction that references prefixes, including the broken sh prefix declaration from shacl.ttl

My question is: what do you suggest as the most logical solution for pySHACL?

1) Allow string literals without type xsd:anyURI on sh:namespace? (probably the easiest solution)
2) Ignore any sh:declare with prefix sh (because prefix sh: <https://www.w3.org/ns/shacl#> is always present when running SPARQL in PySHACL anyway).
3) When owl:imports https://www.w3.org/ns/shacl is encountered, silently replace it with a baked-in errata version of the file. (saves on HTTP requests too)
4) Don't allow owl:imports of https://www.w3.org/ns/shacl# from a SHACL Shapes file (probably too strict!)

HolgerKnublauch commented 3 years ago

We have probably made a couple of mistakes in being too strict with datatypes for SHACL itself. For example, many people use xsd:integer values for sh:order although the spec only allows xsd:decimal. I think the proper fix would be to relax the spec in a version 1.1 to allow sh:or. So for sh:namespace you could in principle relax that constraint to allow xsd:string too. You can of course also hard-code an exception for the sh: namespace. In the past I have also redirected owl:imports to local copies, e.g. you can prevent the buggy graph from being downloaded and instead always use your own copy.
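The idea of redirecting owl:imports to a local copy could look roughly like the sketch below. Everything here is an assumption made for illustration: the `LOCAL_COPIES` table, the bundled file path, the `resolve_import` helper, and the choice to treat `http`/`https` and a trailing `#` as equivalent when matching IRIs.

```python
from pathlib import Path
from typing import Dict, Optional, Union

# Hypothetical table of graph IRIs that should never be downloaded,
# mapped to corrected copies shipped with the tool (path is made up).
LOCAL_COPIES: Dict[str, Path] = {
    "http://www.w3.org/ns/shacl": Path("bundled/shacl-errata.ttl"),
}


def _normalize(iri: str) -> str:
    """Assumed matching rule: ignore a trailing '#' and the http/https split."""
    iri = iri.rstrip("#")
    if iri.startswith("https://"):
        iri = "http://" + iri[len("https://"):]
    return iri


def resolve_import(
    iri: str, local_copies: Optional[Dict[str, Path]] = None
) -> Union[Path, str]:
    """Return a local file for known-buggy graphs, else the IRI unchanged."""
    copies = LOCAL_COPIES if local_copies is None else local_copies
    return copies.get(_normalize(iri), iri)
```

A loader would call `resolve_import` on each owl:imports target and read from the returned path when one is given, so the published shacl.ttl is never fetched.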

akuckartz commented 3 years ago

This section in the W3C Process specification seems to be relevant: https://www.w3.org/2019/Process-20190301/#revised-rec

ashleysommer commented 3 years ago

@HolgerKnublauch Ok, I've gone with a combination of some of those ideas.

1) If the sh:declare has sh:prefix value of "sh", silently allow a string literal for sh:namespace.
2) In other cases, still allow string literals for sh:namespace but issue a warning to console.
3) Bake in a local errata copy of shacl.ttl to use for owl:imports. I was planning to do this anyway.
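The first two rules of that policy can be sketched in plain Python as follows. This is a hedged illustration rather than the code that went into pySHACL: `Literal` and `namespace_from_declaration` are hypothetical names, and rejecting datatypes other than xsd:anyURI and xsd:string is an assumption the thread does not state.

```python
import warnings
from dataclasses import dataclass

XSD_ANYURI = "http://www.w3.org/2001/XMLSchema#anyURI"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Toy stand-in for an RDF literal (hypothetical, not an rdflib class)."""
    lexical: str
    datatype: str = XSD_STRING  # a plain "..." literal is an xsd:string


def namespace_from_declaration(prefix: str, namespace: Literal) -> str:
    """Apply the relaxed rule to one sh:declare node's prefix and namespace."""
    if namespace.datatype == XSD_ANYURI:
        return namespace.lexical  # what the spec asks for
    if namespace.datatype == XSD_STRING:
        if prefix != "sh":
            # Rule 2: tolerate the string literal but tell the user.
            warnings.warn(
                f"sh:namespace for prefix {prefix!r} is a string literal, "
                "expected xsd:anyURI"
            )
        # Rule 1: the "sh" prefix from shacl.ttl passes silently.
        return namespace.lexical
    # Assumption: any other datatype is still treated as an error.
    raise ValueError(
        f"unsupported datatype <{namespace.datatype}> on sh:namespace"
    )
```

Keying the silent case on the prefix value keeps the exception as narrow as possible, so shape authors who make the same mistake in their own files still get a hint.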