RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0

Unable to get PyShacl to detect my node shapes #186

Closed by calummackervoy 1 year ago

calummackervoy commented 1 year ago

Reproducible by running the following in a python3 shell:

import json
from pyshacl import validate
from rdflib import Graph

shape_graph = {
  # ...
}
data_graph = {
  # ...
}

shape_graph = Graph().parse(data=json.dumps(shape_graph), format='json-ld')
data_graph = Graph().parse(data=json.dumps(data_graph), format='json-ld')

validate(data_graph, shape_graph=shape_graph, debug=True)

This results in the following output:

Running validation in-place, without modifying the DataGraph.
Found 0 SHACL Shapes defined with type sh:NodeShape.
Found 0 SHACL Shapes defined with type sh:PropertyShape.
Found 0 property paths to follow.
Found 0 implied SHACL Shapes based on their properties.
Found 0 implied SHACL Shapes used as values in shape-expecting constraints.
Cached 0 unique NodeShapes and 0 unique PropertyShapes.

From testing in the SHACL Playground, I know that the shape graph is valid and that the data graph should fail validation.

The shape I use is retrievable from this demo: https://github.com/Multi-User-Domain/vocab/blob/main/shapes/mudfantasy/vampires.ttl#L9. It's a fairly simple shape that asserts the object has a species of type "Vampire". In JSON-LD that looks like:

{
  "@context": {
    "sh": "http://www.w3.org/ns/shacl#",
    "schema": "http://schema.org/",
    "mud": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mud.ttl#"
  },
  "@graph": [
    {
      "@id": "_:g0",
      "sh:description": "Must have one species of vampire",
      "sh:in": {
        "@list": [
          {
            "@id": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudfantasy.ttl#Vampire"
          },
          "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudfantasy.ttl#Vampire"
        ]
      },
      "sh:minCount": {
        "@type": "http://www.w3.org/2001/XMLSchema#integer",
        "@value": "1"
      },
      "sh:name": "Species",
      "sh:path": {
        "@id": "mud:species"
      }
    },
    {
      "@id": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/shapes/mudfantasy/vampires.ttl#202304ParisVampire",
      "@type": "sh:NodeShape",
      "sh:property": {
        "@id": "_:g0"
      },
      "sh:targetClass": {
        "@id": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudchar.ttl#Character"
      }
    }
  ]
}

The data graph is as follows, and it fails (correctly) in the playground:

{
  "@context": {
    "mud": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mud.ttl#",
    "mudfantasy": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudfantasy.ttl#"
  },
  "@id": "https://example.com/John-Doe",
  "@type": "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudchar.ttl#Character",
  "mud:species": "Not a vampire"
}

calummackervoy commented 1 year ago

I know that it's been parsed correctly by RDFLib:

>>> shape_graph.serialize()
'@prefix mud: <https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mud.ttl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n<https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/shapes/mudfantasy/vampires.ttl#202304ParisVampire> a sh:NodeShape ;\n    sh:property [ sh:description "Must have one species of vampire" ;\n            sh:in ( <https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudfantasy.ttl#Vampire> "https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudfantasy.ttl#Vampire" ) ;\n            sh:minCount 1 ;\n            sh:name "Species" ;\n            sh:path mud:species ] ;\n    sh:targetClass <https://raw.githubusercontent.com/Multi-User-Domain/vocab/main/mudchar.ttl#Character> .\n\n'

ashleysommer commented 1 year ago

Hi @calummackervoy, I created a new test in the PySHACL test suite using your shapes file and data file as inputs, and it works as expected:

Running validation in-place, without modifying the DataGraph.
Found 1 SHACL Shapes defined with type sh:NodeShape.
Found 0 SHACL Shapes defined with type sh:PropertyShape.
Found 0 property paths to follow.
Found 1 implied SHACL Shapes based on their properties.
Found 1 implied SHACL Shapes used as values in shape-expecting constraints.
Cached 1 unique NodeShapes and 1 unique PropertyShapes.
Will run validation on 1 named graph/s.
...

See here for the working file: https://gist.github.com/ashleysommer/0f5a0613e44bb06874cfae0e1e65c665

calummackervoy commented 1 year ago

Hi @ashleysommer, thanks for helping me. Using inference="none" and correcting the name of the parameter passed to validate (shape_graph should have been shacl_graph) solved my problem.

ashleysommer commented 1 year ago

Ah right, I didn't even notice your typo (shape_graph). I have been banging my head trying to work out why my version works and yours doesn't.

calummackervoy commented 1 year ago

Oh, sorry about that! I wasted a couple of hours of my own time as well

calummackervoy commented 1 year ago

I suppose that the library could fail loudly when there's no graph to validate against. I could look into opening a PR for that if you agree

ashleysommer commented 1 year ago

In a lot of cases people mix their SHACL shapes and datagraph into the same graph (even the W3C SHACL SHT test suite does this for its test cases). So when you pass a datagraph but no SHACL Shapes graph, the validator assumes it needs to look in the datagraph for shapes. That is what is happening here.

I suppose it should emit a debugging message when that happens, for clarity.
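Until the library reports this case itself, a caller who always keeps shapes in a separate graph could guard against the fallback on their own side. The sketch below is one possible approach, not part of PySHACL: strict_validate and fake_validate are hypothetical names, and the validator is passed in as an argument (in practice pyshacl.validate) so the guard can be tried without the library installed. Because the wrapper declares its keywords explicitly and has no catch-all, a misspelled keyword such as shape_graph raises a TypeError instead of being silently accepted.

```python
def strict_validate(validate_fn, data_graph, *, shacl_graph, inference="none", debug=False):
    """Run validate_fn only when an explicit shapes graph is supplied.

    validate_fn is assumed to accept the same keywords as pyshacl.validate.
    """
    if shacl_graph is None:
        raise ValueError(
            "No SHACL shapes graph given; refusing to look for shapes in the data graph."
        )
    return validate_fn(data_graph, shacl_graph=shacl_graph, inference=inference, debug=debug)


def fake_validate(data_graph, **kwargs):
    # Stand-in validator that tolerates any keyword; used only for this demo.
    return ("validated", sorted(kwargs))


typo_error = None
try:
    # Misspelled keyword: rejected by Python before any validation runs.
    strict_validate(fake_validate, "data", shape_graph="shapes")
except TypeError as exc:
    typo_error = exc

print(type(typo_error).__name__)
print(strict_validate(fake_validate, "data", shacl_graph="shapes"))
```

The trade-off is that the wrapper rules out the mixed-graph use case described above, so it only suits projects that never embed shapes in their data graph.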