RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0

How to validate a shape graph? #65

Closed jbkoh closed 3 years ago

jbkoh commented 3 years ago

I'm not sure if it's within the scope of the pySHACL project, but I still wonder if I can get some advice.

What's the best way of validating a shape graph? I created a large shape graph and am trying to check whether the shape graph itself has any conflicts. For example, within a shape, there can be conflicting sh:property constraints with different cardinalities (e.g., minCount=3 but maxCount=2). I am trying to detect those at scale, and for now I just create random values for the shapes and then run pyshacl.validate(). However, there are just too many variations to cover when generating the random values.

Is there a better way of achieving the goal? I wish there were something like validate(shacl_graph, shacl_graph=shacl_graph).

ashleysommer commented 3 years ago

Hi @jbkoh, is the metashacl feature described in the README.md document what you're looking for?

jbkoh commented 3 years ago

It's similar, but I think it's different. Given a correctly formatted shape graph, I would like to validate whether its contents are consistent with each other. Here's an example shape graph that I would like to validate.

:shape1 a sh:NodeShape;
  sh:targetClass :SomeClass;
  sh:property [
    sh:path :somePath;
    sh:datatype xsd:string;
    sh:maxCount 1
  ];
  sh:property [
    sh:path :somePath;
    sh:datatype xsd:string;
    sh:minCount 3
  ].

Here, even though it's correctly formed (I think it conforms to the shacl-shacl constraints, but let me know otherwise), the two property shapes conflict with each other: with these cardinalities, no set of values can satisfy both.
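This kind of cardinality conflict can be detected statically by comparing bounds per path, without generating any data. The following is a minimal sketch, assuming the property shapes have already been extracted from the shape graph into plain Python records. `PropertyShape` and `find_cardinality_conflicts` are hypothetical names for illustration, not part of pySHACL, and the sketch only considers simple predicate paths with sh:minCount and sh:maxCount.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class PropertyShape:
    """Hypothetical, simplified stand-in for one sh:property block."""
    path: str
    min_count: int = 0                # no sh:minCount means no lower bound
    max_count: Optional[int] = None   # None means no sh:maxCount


def find_cardinality_conflicts(shapes):
    """Return {path: (required_min, allowed_max)} for every path whose
    combined bounds leave no valid number of values."""
    by_path = defaultdict(list)
    for shape in shapes:
        by_path[shape.path].append(shape)

    conflicts = {}
    for path, group in by_path.items():
        # All property shapes on a path apply at once, so take the
        # tightest lower bound and the tightest upper bound.
        lowest = max(s.min_count for s in group)
        uppers = [s.max_count for s in group if s.max_count is not None]
        highest = min(uppers) if uppers else None
        if highest is not None and lowest > highest:
            conflicts[path] = (lowest, highest)
    return conflicts


# The two property shapes of :shape1 from the example above.
shape1 = [
    PropertyShape(":somePath", max_count=1),
    PropertyShape(":somePath", min_count=3),
]
print(find_cardinality_conflicts(shape1))  # {':somePath': (3, 1)}
```

A check like this only catches direct bound conflicts on identical paths; it does not reason about sh:or, sh:not, or other logical constraints.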

Another use case is

:shape2 a sh:NodeShape;
  sh:targetClass :SomeClass;
  sh:property [
    sh:path :somePath;
    sh:or ([sh:class :ClassA] [sh:class :ClassB]);
  ];
  sh:property [
    sh:path :somePath;
    sh:class :ClassA
  ].

If a ClassB value is allowed by the first property shape, it cannot satisfy the other property shape.

Those are common mistakes I've seen in my shape designs, and I wonder if there is an automated way of detecting such things.

Let me know if meta-shacl is still the right solution. (I just tried it out but maybe I was using it incorrectly.)

JoelBender commented 3 years ago

meta-shacl is about testing the shape graph to see if it is well formed. I think you are looking for a test of whether a particular shape is satisfiable or unsatisfiable, where unsatisfiable means that there does not exist a data graph in which the shape will validate. The two examples you gave are actually quite different: the first is unsatisfiable, while the second is satisfiable because the two sh:property statements are not contradictory. There's some very old brain cell kicking around that says that the existence of sh:not makes the problem of checking a collection of statements very hard (something about being NP-Hard), but it's been too long!
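The difference between the two examples can be illustrated with a small bounded search for a witness, that is, a candidate node that satisfies every constraint of a shape. This is only a sketch under strong assumptions: a node is modelled purely as a tuple of the classes of its values on one path, each value has exactly one class, sh:datatype is ignored, and `CLASSES` and `find_witness` are hypothetical names rather than anything provided by pySHACL. Because the search is bounded, a `None` result only means that no witness was found up to the bound.

```python
from itertools import combinations_with_replacement

# Hypothetical universe of classes a value on the path may have.
CLASSES = ("ClassA", "ClassB", "ClassC")


def find_witness(constraints, max_values=4):
    """Return the first tuple of value classes that satisfies every
    constraint, or None if none exists with at most max_values values."""
    for n in range(max_values + 1):
        for values in combinations_with_replacement(CLASSES, n):
            if all(check(values) for check in constraints):
                return values
    return None


# shape1: sh:maxCount 1 and sh:minCount 3 on the same path.
shape1 = [
    lambda vs: len(vs) <= 1,
    lambda vs: len(vs) >= 3,
]

# shape2: sh:or (ClassA ClassB) and sh:class ClassA on the same path.
shape2 = [
    lambda vs: all(v in ("ClassA", "ClassB") for v in vs),
    lambda vs: all(v == "ClassA" for v in vs),
]

print(find_witness(shape1))  # None: no value count is both <= 1 and >= 3
print(find_witness(shape2))  # (): a node with no values already conforms
```

In this model, shape2 is satisfied even by a node with no values on the path, since neither property shape sets a minimum count; a node whose values are all ClassA also conforms.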

ashleysommer commented 3 years ago

Yeah, that kind of higher-level logic checking of the Shapes graph is definitely out of the scope of pyshacl, sorry. I don't know if any of the big-name SHACL validators would be able to do that kind of sanity checking. I'd guess that writing a tool to check the sanity and satisfiability of the Shapes graph would be harder than writing the SHACL validator itself.

jbkoh commented 3 years ago

@JoelBender Thanks for the insight. I should've taken discrete logic classes to give a better description like yours :)

@ashleysommer Thanks for the confirmation. I will stick to the logic I've been using.

I'm closing this issue.