sh:inversePath does not appear to be interpreted by current pyshacl

ajnelson-nist commented 2 years ago

I'm trying to design some rules to review domains of properties. It seemed natural to try to use sh:targetSubjectsOf and sh:inversePath to accomplish domain review. When designing some XFAIL tests, I wasn't able to trigger some XFAIL conditions I was expecting. So, the domain-review use case ends up being a bit out of scope of a potential deeper pySHACL issue, but I'll use it a bit more to describe what I suspect is the bug.

Say we have a property ex:propertyOfA:

ex:propertyOfA
  a owl:DatatypeProperty ;
  rdfs:domain ex:ThingA ;
  .

My use cases involve settings that will not be using RDFS inferencing, so I need this to pass SHACL validation ...

kb:thing-a-1
  rdf:type ex:ThingA ;
  ex:propertyOfA "1" ;
  .

...and this to fail:

kb:thing-b-1
  rdf:type ex:ThingB ;
  ex:propertyOfA "1" ;
  .

(Assume ex:ThingA have no ex:ThingB sub- or equivalent-class relationship.)

I would expect this shape to do the "domain checking" I'm interested in, but the sh:inversePath doesn't seem to be firing:

ex:propertyOfA-shape
  rdf:type sh:PropertyShape ;
  sh:class ex:ThingA ;
  sh:path [
    sh:inversePath ex:propertyOfA ;
  ] ;
  sh:targetSubjectsOf ex:propertyOfA ;
  .

Am I mixing up something in the SHACL syntax, or is there a bug in pySHACL?

A unit test is coming in a Pull Request in a moment, I just need to post this so I can get an issue number.

ajnelson-nist commented 2 years ago

Please see PR 161 for the implementation of the above scenario.

ashleysommer commented 2 years ago

Hi @ajnelson-nist Thanks for the bug report.

I've done some testing, and sh:inversePath is operating correctly in PySHACL. It passes the W3C SHACL test suite, as well as the TopQuadrant DASH SHACL Shape Test suite, both of which contain multiple tests on the functionality of sh:inversePath (it is a requirement for the classification of PySHACL to be a recognised SHACL Validation engine).

The problem you are seeing is due to your use of sh:targetSubjectsOf. In your example, it doesn't make sense to use the targetSubjectsOf targeting method as well as the sh:inversePath path selector. Both are trying to do the same thing, and cancel each other out.

Using your example:

kb:thing-a-1  ex:propertyOfA  "1"
kb:thing-b-1  ex:propertyOfA  "1"
^subject^      ^predicate^    ^object^

Using the selector sh:targetSubjectsOf with your property ex:propertyOfA, your set of focus nodes for this Shape becomes (kb:thing-a-1, kb:thing-b-1). Here you have already achieved your goal, you do not need to use inversePath in order to validate their class type.

Looking at it another way, you can still use sh:inversePath if you change your selector type, to sh:targetObjectsOf.

Using the selector sh:targetObjectsOf with your property ex:propertyOfA, your set of focus nodes for this Shape becomes ("1", "1").

Typically sh:path routes go subject -> path -> object where subject is the Focus Node, all objects are the value nodes. However with sh:inversePath, it goes object -> path -> subject where object is the Focus Node, and the subjects are the value nodes.

In this case, the focus node is "1", and set of value nodes are (kb:thing-a-1, kb:thing-b-1) that again allows this shape validation to work as expected.

To recap, to get this shape working, use one of these two forms:

ex:propertyOfA-shape
  rdf:type sh:NodeShape ;
  sh:class ex:ThingA ;
  sh:targetSubjectsOf ex:propertyOfA ;
  .

or

ex:propertyOfA-shape
  rdf:type sh:PropertyShape ;
  sh:class ex:ThingA ;
  sh:path [
    sh:inversePath ex:propertyOfA ;
  ] ;
  sh:targetObjectsOf ex:propertyOfA ;
  .

Using sh:inversePath at the same time as using targetsSubjectsOf (with the same property on both) is not required, and in this case leads to unexpected behaviour.

ajnelson-nist commented 2 years ago

@ashleysommer , thank you so much for looking this over.

I think what had gotten me hung up was this chain of thinking:

Well, I'm focusing on a property, so I better make it a sh:PropertyShape.
How to select the node...I want to focus on the triples' Subjects, so sh:targetSubjectsOf looks right.
Right! sh:PropertyShape requires sh:path. What'll go into that?

You were certainly right the behavior was unexpected. I'd pictured sh:path would start at the subject, make a "First hop" to the blank-node predicate, and then follow the sh:inversePath back. You're right, there's no "First hop."

I implemented both of your suggestions, including the sh:focusNode/sh:value reversal, and got the test passing. I had to make one extra change, though - your first form has to be a sh:NodeShape, because an sh:PropertyShape needs an sh:path.

This was good practice with sh:NodeShape. Thank you. I'm happy with this issue being closed with PR 161 being merged, if you think the test is helpful.

ashleysommer commented 1 year ago

@ajnelson-nist Yep, I see now how you got stuck.

Another way to look at this, is to consider that you will very rarely need to define a top-level shape to be a PropertyShape. PropertyShapes are almost always a blank node that is the value of a sh:property constraint, and should never have targeting instructions baked in. Targeting methods like sh:targetSubjectsOf simply don't make sense on a property shape.

Eg, instead of this:

ex:myShape
  rdf:type sh:PropertyShape ;
  sh:targetClass ex:Cat ;
  sh:path ex:numLeg;
  sh:maxInclusive 4;
  .

I find it easier to understand the shape by writing it as a NodeShape:

ex:myShape
  rdf:type sh:NodeShape ;
  sh:targetClass ex:Cat ;
  sh:property [
    sh:path ex:numLeg;
    sh:maxInclusive 4;
  ] ;
  .

That way, the you do not need to define the blank shape as a PropertyShape, or give it targeting information. It gets passed the focusNodes from the NodeShape.

To rewrite the targetObjectsOf example from the previous comments, it would look like this:

ex:propertyOfA-shape
  rdf:type sh:NodeShape ;
  sh:targetObjectsOf ex:propertyOfA ;
  sh:property [
    sh:path [
      sh:inversePath ex:propertyOfA ;
    ] ;
    sh:class ex:ThingA ;
  ] ;
  .

RDFLib / pySHACL

sh:inversePath does not appear to be interpreted by current pyshacl #160