RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0

Specify focus node as cli parameter #53

Closed michielhildebrand closed 3 years ago

michielhildebrand commented 4 years ago

The SHACL recommendation lists the following option for defining a focus node: "specified as explicit input to the SHACL processor for validating a specific RDF term against a shape" (https://www.w3.org/TR/shacl/#focusNodes).

It would be great to have this feature.

BTW, great project! It got me started with SHACL in minutes.

ashleysommer commented 4 years ago

Hi @michielhildebrand. Sorry I missed this issue thread; I am just going through past email notifications now and found it.

Specifying a focus node has always been something the library can do, but I never considered that a user would want to do it via the CLI. I can see it might be useful.

How do you envisage it working?

Would you want to specify a SHACL Shapes graph, a Data Graph, and the URI for a focus node?

Something like this?

    pyshacl -s myshapes.ttl mydata.ttl <http://example.org#myNode>

Where <http://example.org#myNode> is a node within the mydata.ttl graph? What about specifying more than one focus node? Should that be possible?

vangoghworldwide commented 4 years ago

Hi, your proposal to add the focusNode on the command line would partially work for me. My use case seems to be more complicated than I initially thought.

The use case is as follows. I started with a nodeShape with a targetClass:

    vgw:Artwork a sh:NodeShape ;
        sh:targetClass crm:E22_Human-Made_Object ;
        ...

The problem is that the graph contains more resources of type crm:E22_Human-Made_Object and I only want to test a specific one. I changed the shape to use a targetNode:

    vgw:Artwork a sh:NodeShape ;
        sh:targetNode <http://vangoghmuseum.nl/data/artwork/d0005V1962> ;

Now this works fine. The problem is that other users of this shapes graph need to change it and add their own URI in the targetNode.

Specifying the focusNode on the command line (or in the library API) would make this more convenient. The complication is that the focusNode is then no longer "bound" to the vgw:Artwork node shape. I could put back the "sh:targetClass crm:E22_Human-Made_Object", but then I am back at the problem that this shape will be applied to other resources as well. In particular, there are resources of this type related to my focusNode that I also want to test, but then through sh:node.

So it sounds like I need to call the validator with both a focusNode and a nodeShape. Does this make any sense? Or did I miss something?

ashleysommer commented 4 years ago

Hi @vangoghworldwide

"Does this make any sense? Or did I miss something?"

Yes, I believe I understand what you are saying, and it does bring up an obvious problem with the solution I proposed: when you specify a focusNode to validate, how does the validator know which SHACL shapes to apply?

Internally, PySHACL goes through all of the known SHACL shapes in the shapes graph, and for each shape it determines which nodes in the data graph to apply that shape to. Those are the focus nodes. It cannot go the other way around: there is no mechanism in PySHACL that, given a focusNode, finds the SHACL shapes which apply to it. Without that, even your example using sh:targetClass wouldn't work.
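To make the direction of that lookup concrete, here is a rough sketch in plain Python. It is an illustration only and not PySHACL's actual code: graphs are modelled as sets of (subject, predicate, object) string triples, the function names are made up for this example, only sh:targetNode and sh:targetClass are handled, and subclass reasoning for sh:targetClass is ignored.

```python
# Simplified illustration; not PySHACL's real implementation.
# A "graph" here is just a set of (subject, predicate, object) string triples.
RDF_TYPE = "rdf:type"
SH_NODE_SHAPE = "sh:NodeShape"
SH_TARGET_NODE = "sh:targetNode"
SH_TARGET_CLASS = "sh:targetClass"


def focus_nodes_for_shape(shape, shapes_graph, data_graph):
    """Return the data-graph nodes targeted by one shape (hypothetical helper)."""
    focus = set()
    for s, p, o in shapes_graph:
        if s != shape:
            continue
        if p == SH_TARGET_NODE:
            # An explicit target: the node itself is the focus node.
            focus.add(o)
        elif p == SH_TARGET_CLASS:
            # A class target: every instance of the class is a focus node.
            focus.update(n for n, dp, cls in data_graph
                         if dp == RDF_TYPE and cls == o)
    return focus


def resolve_all_targets(shapes_graph, data_graph):
    """Shape-driven direction: start from each shape, then find its focus nodes."""
    shapes = {s for s, p, o in shapes_graph
              if p == RDF_TYPE and o == SH_NODE_SHAPE}
    return {shape: focus_nodes_for_shape(shape, shapes_graph, data_graph)
            for shape in shapes}


shapes_graph = {
    ("vgw:Artwork", RDF_TYPE, SH_NODE_SHAPE),
    ("vgw:Artwork", SH_TARGET_CLASS, "crm:E22_Human-Made_Object"),
}
data_graph = {
    ("ex:artwork1", RDF_TYPE, "crm:E22_Human-Made_Object"),
    ("ex:artwork2", RDF_TYPE, "crm:E22_Human-Made_Object"),
    ("ex:person1", RDF_TYPE, "crm:E21_Person"),
}

targets = resolve_all_targets(shapes_graph, data_graph)
print(targets)
```

With sh:targetClass, both artworks end up as focus nodes of vgw:Artwork, which is exactly the situation described earlier in the thread. Nothing in this loop starts from a node and asks which shapes apply to it.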

I will need to think about this a bit more. Perhaps the solution would be to implement an alternate mode of operation for PySHACL, in which it is given a focusNode and a NodeShape to use and does not run the normal PySHACL algorithm at all.
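As a thought experiment, such a mode might look roughly like the sketch below. Everything in it is hypothetical: the function name, the representation of a shape as a plain set of required properties, and the example data are assumptions made for illustration, not an existing PySHACL API. The point is only that the caller supplies the (focus node, shape) pair directly, so no target resolution takes place.

```python
# Hypothetical sketch of an "explicit focus node + explicit shape" mode.
# A shape is reduced to a set of properties the node must have at least once.
def validate_node_against_shape(focus_node, required_properties, data_graph):
    """Check one explicitly chosen node against one explicitly chosen shape."""
    # Collect the properties actually present on the focus node.
    present = {p for s, p, _ in data_graph if s == focus_node}
    missing = sorted(set(required_properties) - present)
    return len(missing) == 0, missing


# Illustrative stand-in for the vgw:Artwork node shape (assumed constraints).
artwork_shape = {"dc:title", "dc:creator"}

data_graph = {
    ("ex:artwork1", "rdf:type", "crm:E22_Human-Made_Object"),
    ("ex:artwork1", "dc:title", "Sunflowers"),
    ("ex:artwork2", "rdf:type", "crm:E22_Human-Made_Object"),
}

# Only ex:artwork1 is checked; ex:artwork2 is never visited, even though it
# has the same rdf:type.
conforms, missing = validate_node_against_shape(
    "ex:artwork1", artwork_shape, data_graph
)
print(conforms, missing)
```

The shapes graph would then not need a sh:targetNode or sh:targetClass at all, so different users could share it unchanged and pass their own URI at call time.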

michielhildebrand commented 4 years ago

I understand. Thanks for explaining. We are exploring alternative solutions to define the targets better. I think we will manage in that way. You can park this issue if you like.

ashleysommer commented 4 years ago

No problems. We'll put this on hold for now, and think about it further down the track.

nicholascar commented 3 years ago

Just a note that Cheka allows for this in a roundabout way. There you can indicate a node to be validated by including a conformance claim in the data, i.e. node X claims conformance to some profile, and the profile lists all the Shapes that are defined for conformance testing against that profile.