RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0

ConstraintLoadError: sh:namespace value must be an RDF Literal with type xsd:anyURI. #61

Closed James-Hudson3010 closed 4 years ago

James-Hudson3010 commented 4 years ago

This may be related to the changes made for https://github.com/RDFLib/pySHACL/issues/59

Using the script below and the SHACL from http://datashapes.org/schema.ttl, I get the following error:

ConstraintLoadError: sh:namespace value must be an RDF Literal with type xsd:anyURI. https://www.w3.org/TR/shacl/#sparql-prefixes

However, running pyshacl from the command line appears to work correctly.

pyshacl -s ./schema_org_validation.ttl ./test_data.ttl

Validation Report
Conforms: False
Results (1):
Constraint Violation in ClassConstraintComponent (http://www.w3.org/ns/shacl#ClassConstraintComponent):
    Severity: sh:Violation
    Source Shape: schema:CommunicateAction-about
    Focus Node: ex:asdgjkj
    Value Node: [ rdf:type sch:GameServer ; sch:playersOnline Literal("42", datatype=xsd:integer) ]
    Result Path: schema:about
    Message: Value does not have class schema:Thing

(I am not including the schema.org schema, hence the validation error.)

Python script:


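import rdflib
from pyshacl import validate

# Inline Turtle data: a CommunicateAction whose sch:about value is a blank node.
data = """
@prefix ex: <http://example.org/> .
@prefix sch: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:asdgjkj a sch:CommunicateAction ;
    sch:about [ a sch:GameServer ;
            sch:playersOnline "42"^^xsd:integer ] .
"""

dataGraph = rdflib.Graph().parse( data = data, format = 'ttl' )

# serialize() returns bytes in rdflib 5 and str in rdflib 6+, so handle both.
serialized = dataGraph.serialize( format = 'ttl' )
print( serialized.decode( 'utf8' ) if isinstance( serialized, bytes ) else serialized )

# Load the SHACL shapes file into its own graph.
with open( "./schema_org_validation.ttl", "r" ) as shaclFile:
    shaclGraph = rdflib.Graph().parse( data = shaclFile.read(), format = 'ttl' )

report = validate( dataGraph, shacl_graph = shaclGraph, abort_on_error = False, meta_shacl = False, debug = False, advanced = True, do_owl_imports = True )

# validate() returns ( conforms, results_graph, results_text ); print the text report.
print( report[2] )

ashleysommer commented 4 years ago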

Ok, after some digging it seems this error is actually coming from the SHACL ontology file here: https://www.w3.org/ns/shacl.ttl

Normally that file is only used conceptually: its namespace is listed as a prefix at the top of every SHACL shapes file, but the file itself is normally not imported.

However, in this case:

1) The Schema.org Shapes file owl:imports the DASH Shapes file.
2) The DASH Shapes file owl:imports shacl.ttl.
3) shacl.ttl (on line 15) creates a sh:declare node, with sh:namespace being a string literal, not an xsd:anyURI.
4) The DASH Shapes file owl:imports the TOSH Shapes file.
5) The TOSH Shapes file declares a SPARQLFunction that references prefixes, including the broken sh prefix declaration from shacl.ttl.

So it's definitely a bug in the shacl.ttl file. I went to report the issue, but someone else had recently found the same error: https://github.com/w3c/data-shapes/issues/125

It looks like the bug cannot be fixed in shacl.ttl for historical reasons, so I will have to put a workaround in PySHACL.
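
The import chain described in this comment can be modelled as a small graph walk. This is only a minimal sketch, not pySHACL's actual import logic: the IMPORTS dictionary and the reachable helper are hypothetical names, and the labels simply stand in for the files named in the numbered steps.

```python
from collections import deque

# Hypothetical model of the owl:imports edges described in the steps above.
IMPORTS = {
    "schema.org shapes": ["DASH shapes"],
    "DASH shapes": ["shacl.ttl", "TOSH shapes"],
    "shacl.ttl": [],
    "TOSH shapes": [],
}


def reachable(start, follow_imports):
    """Return the set of documents loaded when starting from `start`."""
    loaded = {start}
    if not follow_imports:
        # Without owl:imports handling, only the starting file is loaded.
        return loaded
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in IMPORTS.get(current, []):
            if target not in loaded:
                loaded.add(target)
                queue.append(target)
    return loaded


with_imports = reachable("schema.org shapes", follow_imports=True)
without_imports = reachable("schema.org shapes", follow_imports=False)
print(sorted(with_imports))
print(sorted(without_imports))
```

In this toy model, shacl.ttl (and with it the untyped sh:namespace literal) only enters the picture when imports are followed.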

ashleysommer commented 4 years ago

@James-Hudson3010 A fix for this is in PySHACL v0.13.2

James-Hudson3010 commented 4 years ago

Confirmed. (I am curious why it seemed to work with the pySHACL CLI tool)

ashleysommer commented 4 years ago

It's because you didn't enable advanced mode or owl imports in the CLI invocation. The error is introduced by the SHACL Ontology file, which is imported from the DASH Shapes file, which is only imported if you have owl imports enabled (--imports in the CLI tool). Secondly, the error is triggered by a SPARQLFunction, and those are only used when Advanced Mode is enabled (-a in the CLI tool).

So it would've shown the same error if you executed the CLI tool like this:

pyshacl -s ./schema_org_validation.ttl -a --imports ./test_data.ttl
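
The two conditions explained above can be summarised as a toy truth table for versions before the v0.13.2 fix. This sketch assumes nothing about pySHACL internals: error_is_raised is a hypothetical helper whose parameters mirror the do_owl_imports and advanced keyword arguments from the script, which correspond to the --imports and -a CLI flags.

```python
from itertools import product


def error_is_raised(do_owl_imports, advanced):
    # Hypothetical helper, not part of pySHACL.
    # shacl.ttl is only loaded when owl:imports are followed, and the
    # SPARQLFunction that references its prefixes is only used in advanced mode.
    broken_prefix_loaded = do_owl_imports
    sparql_function_used = advanced
    return broken_prefix_loaded and sparql_function_used


for imports, advanced in product([False, True], repeat=2):
    result = error_is_raised(imports, advanced)
    print(f"imports={imports!s:5} advanced={advanced!s:5} -> error={result}")
```

Only the combination with both options enabled, as in the Python script, reproduces the ConstraintLoadError; the original CLI invocation had neither.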