RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0

Validation does not work for classes that are also node shapes #38

Closed mgberg closed 4 years ago

mgberg commented 4 years ago

If I run the following code:

# The snippets in this thread use these aliases
import rdflib as rdf
import pyshacl as sh

shapes = rdf.Graph()
shapes.parse(data="""
    @prefix sh: <http://www.w3.org/ns/shacl#> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix ex: <http://example.org/ns#> .

    ex:Person
          a owl:Class ;
          a sh:NodeShape ;
          sh:property ex:NameConstraint ;
    .

    ex:NameConstraint
          a sh:PropertyShape ;
          sh:path ex:name ;
          sh:minCount 1 ;
        .
""",format="ttl")

data = rdf.Graph()
data.parse(data="""
    @prefix ex: <http://example.org/ns#> .

    ex:Bob
          a ex:Person ;
    .
""",format="ttl")

# validate() returns a tuple: (conforms, results_graph, results_text)
r = sh.validate(data_graph=data, shacl_graph=shapes, inference='rdfs')
print(r[2])

no validation errors are reported. To force the violation to be reported, I have to explicitly declare ex:Person sh:targetClass ex:Person in the shapes graph, which shouldn't be necessary.

This is how TopQuadrant products represent classes and node shapes by default, so it would be great if pyshacl could support this.
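
For context, the behaviour requested here corresponds to what the SHACL specification calls an implicit class target: a node shape that is also declared as a class targets every instance of that class, without needing a sh:targetClass triple. Below is a minimal pure-Python sketch of that rule, not pySHACL's actual code. Triples are plain tuples of prefixed-name strings, `focus_nodes` is a hypothetical helper, and treating owl:Class the same as rdfs:Class is an assumption based on the behaviour asked for in this issue.

```python
RDF_TYPE = "rdf:type"
CLASS_MARKERS = {"rdfs:Class", "owl:Class"}  # assumption: owl:Class counts too


def focus_nodes(shape, shapes_triples, data_triples):
    """Return the data nodes that a node shape targets by class."""
    # Explicit targets: <shape> sh:targetClass <C>
    target_classes = {
        o for s, p, o in shapes_triples if s == shape and p == "sh:targetClass"
    }
    # Implicit target: the shape itself is declared as a class
    shape_types = {o for s, p, o in shapes_triples if s == shape and p == RDF_TYPE}
    if shape_types & CLASS_MARKERS:
        target_classes.add(shape)
    # Focus nodes are the direct instances of any target class
    return {s for s, p, o in data_triples if p == RDF_TYPE and o in target_classes}


shapes_triples = {
    ("ex:Person", RDF_TYPE, "owl:Class"),
    ("ex:Person", RDF_TYPE, "sh:NodeShape"),
    ("ex:Person", "sh:property", "ex:NameConstraint"),
}
data_triples = {("ex:Bob", RDF_TYPE, "ex:Person")}

print(focus_nodes("ex:Person", shapes_triples, data_triples))
```

With the implicit rule in place, ex:Bob becomes a focus node of ex:Person, so the sh:minCount constraint on ex:name would be checked against it.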

ashleysommer commented 4 years ago

Hi @mgberg. This looks similar to (or a duplicate of) the problem reported in issue #29 (https://github.com/RDFLib/pySHACL/issues/29). Since that bug was fixed, this example should work in all versions of PySHACL from v0.11.1 onwards.

Can you please tell me which version of PySHACL you are using?

ashleysommer commented 4 years ago

@mgberg I've run your example with pyshacl v0.11.3 (and the latest, v0.11.3.post1) and it works as expected, with this output:

Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
    Severity: sh:Violation
    Source Shape: ex:NameConstraint
    Focus Node: ex:Bob
    Result Path: ex:name

If you're using a version of PySHACL older than v0.11.1, please upgrade to v0.11.1 or newer (preferably always use the latest version).
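
One way to check which version is actually installed is sketched below, using only the standard library. The helper names `release_tuple` and `has_fix` are hypothetical, the threshold v0.11.1 is taken from this thread, and the parsing is deliberately simple: it looks only at the leading numeric parts and ignores suffixes such as post1.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_FIXED = (0, 11, 1)  # first release with the fix, per this thread


def release_tuple(version_string):
    """Parse the leading numeric parts, e.g. '0.11.3.post1' -> (0, 11, 3)."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post1' or 'rc1'
        parts.append(int(piece))
    return tuple(parts)


def has_fix(version_string):
    """True if the version is at least MIN_FIXED."""
    return release_tuple(version_string) >= MIN_FIXED


try:
    installed = version("pyshacl")
    print(installed, "OK" if has_fix(installed) else "needs upgrade")
except PackageNotFoundError:
    print("pyshacl is not installed in this environment")
```

For anything beyond a quick check, a proper version parser (for example the third-party `packaging` library) would be a more robust choice than this tuple comparison.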

If there is a good reason you're deliberately using an older version, let me know and I might be able to backport the owl:Class fix to it via a maintenance release.

mgberg commented 4 years ago

My computer thought v0.10 was the most up-to-date version. I upgraded and it works. Thanks!

mgberg commented 4 years ago

Hello again. If I run the following code:

import rdflib as rdf
import pyshacl as sh

shapes = rdf.Graph()
shapes.parse(data="""@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/ns#> .

ex:Person
        rdfs:label       "Person" ;
        rdf:type         owl:Class ;
        rdf:type         sh:NodeShape ;
        rdfs:subClassOf  owl:Thing ;
        sh:property      ex:Person-favoriteFood .

ex:Child
        rdfs:label       "Child" ;
        rdf:type         owl:Class ;
        rdf:type         sh:NodeShape ;
        rdfs:subClassOf  ex:Person .

ex:Person-favoriteFood
        rdf:type  sh:PropertyShape ;
        sh:path   ex:favoriteFood ;
        sh:class  ex:Food ;
        sh:name   "Favorite Food" .
""",format="ttl")

data = rdf.Graph()
data.parse(data="""@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/ns#> .

ex:Sally
        rdfs:label  "Sally" ;
        rdf:type    ex:Child ;
        ex:favoriteFood  ex:Sally .
""",format="ttl")

r = sh.validate(data_graph=data, shacl_graph=shapes, inference="both")
print(r[2])

no validation errors are reported. PySHACL is not inferring that the property shape applied to ex:Person should also be applied to ex:Child. If this is loaded in a TopQuadrant product, the error is reported correctly.

I have upgraded PySHACL to version 0.11.4.

Thanks!

ashleysommer commented 4 years ago

Hi @mgberg

You're not seeing those RDFS subclasses being applied to the data because the class relationships are declared in the SHACL Shapes file.

When you enable inferencing, the process applies only to the data graph. The class relationships declared in the SHACL file don't affect it.

This is why the "extra ontology graph" feature exists in pySHACL. You can give the validator a third graph containing your extra ontological class relationships. It is mixed in with the data graph before inferencing occurs, so the RDFS inferencer has all of the class information it needs to expand the data graph.
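
To illustrate why the mix-in matters, here is a minimal pure-Python sketch of RDFS subclass typing applied to a data graph, with and without an ontology merged in first. It is only a toy model under stated assumptions, not how pySHACL implements inferencing: triples are plain tuples of prefixed-name strings, `infer_types` is a hypothetical helper, and only the subclass typing rule is applied.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(data_triples, ont_triples=frozenset()):
    """Mix the ontology into the data, then apply RDFS subclass typing."""
    graph = set(data_triples) | set(ont_triples)
    changed = True
    while changed:  # repeat until no new rdf:type triples appear
        changed = False
        subclass_pairs = [(s, o) for s, p, o in graph if p == SUBCLASS_OF]
        typings = [(s, o) for s, p, o in graph if p == RDF_TYPE]
        for node, cls in typings:
            for sub, sup in subclass_pairs:
                new = (node, RDF_TYPE, sup)
                if cls == sub and new not in graph:
                    graph.add(new)
                    changed = True
    return graph


data = {("ex:Sally", RDF_TYPE, "ex:Child")}
ontology = {
    ("ex:Child", SUBCLASS_OF, "ex:Person"),
    ("ex:Person", SUBCLASS_OF, "owl:Thing"),
}

sally_is_person = ("ex:Sally", RDF_TYPE, "ex:Person")
print(sally_is_person in infer_types(data))            # data graph alone
print(sally_is_person in infer_types(data, ontology))  # ontology mixed in
```

Without the subclass triples in the graph being expanded, nothing says that ex:Sally is an ex:Person, so a shape targeting ex:Person never sees her.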

For now, one easy fix is to also pass your SHACL shapes graph as the ont_graph argument, like this:

r = sh.validate(data_graph=data, shacl_graph=shapes, ont_graph=shapes, inference="both")
print(r[2])

mgberg commented 4 years ago

I see, that does work. Thanks!

ashleysommer commented 4 years ago

@mgberg Thanks for letting me know.

Here's an example of splitting the class relationships out into a separate graph:

import rdflib as rdf
import pyshacl as sh

shapes = rdf.Graph()
shapes.parse(data="""@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/ns#> .

ex:Person
        rdfs:label       "Person" ;
        rdf:type         owl:Class ;
        rdf:type         sh:NodeShape ;
        rdfs:subClassOf  owl:Thing ;
        sh:property      ex:Person-favoriteFood .

ex:Child
        rdfs:label       "Child" ;
        rdf:type         owl:Class ;
        rdf:type         sh:NodeShape ;
        rdfs:subClassOf  ex:Person .

ex:Person-favoriteFood
        rdf:type  sh:PropertyShape ;
        sh:path   ex:favoriteFood ;
        sh:class  ex:Food ;
        sh:name   "Favorite Food" .
""",format="ttl")

extra = rdf.Graph()
extra.parse(data="""@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/ns#> .

ex:Person
        rdfs:label       "Person" ;
        rdf:type         owl:Class ;
        rdfs:subClassOf  owl:Thing .

ex:Child
        rdfs:label       "Child" ;
        rdf:type         owl:Class ;
        rdfs:subClassOf  ex:Person .

""",format="ttl")

data = rdf.Graph()
data.parse(data="""@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://example.org/ns#> .

ex:Sally
        rdfs:label  "Sally" ;
        rdf:type    ex:Child ;
        ex:favoriteFood  ex:Sally .
""",format="ttl")

r = sh.validate(data_graph=data, shacl_graph=shapes, ont_graph=extra, inference="both")
print(r[2])