Closed (jbkoh closed 3 years ago)
Hi @jbkoh I'm going to attempt a replication and investigation of this now. I'll let you know what I find.
Ok, I've looked into it, and it looks like it's not a bug in PySHACL. WARNING: long reply! This will probably contain minor errors, and I will probably go back and fix some paragraphs later.
For sanity checking, I ran the same validation through the online SHACL Playground, which uses a completely independent SHACL engine called shacl-js, and it gives the same result that PySHACL does.
The main problem is to do with how PropertyShapes interact with logical constraints like sh:not and sh:or. In short, it is very difficult to invert the result of a PropertyShape (using sh:not) and get the outcome you are expecting.
Stepping through it, the PropertyShape with sh:path rdf:type generates a focusNode of ex:aaa and valueNodes of (ex:AAA, ex:BBB, ex:CCC).
sh:or is what is termed a "shape-expecting constraint". That means it contains a set of shapes, and executes each of those shapes for every valueNode that is passed into it.
So these value nodes then get passed into the sh:or constraint to be tested against the sh:hasValue constraints.
It goes like this:
ex:AAA sh:hasValue ex:BBB -> False
ex:AAA sh:hasValue ex:CCC -> False
Both of those OR'd together = False.

ex:BBB sh:hasValue ex:BBB -> True
ex:BBB sh:hasValue ex:CCC -> False
Both of those OR'd together = True.

ex:CCC sh:hasValue ex:BBB -> False
ex:CCC sh:hasValue ex:CCC -> True
Both of those OR'd together = True.
That means that even though the second and third valueNodes pass the sh:or check, the first one never does, so the overall validation result for the PropertyShape itself is False.
Then the non-conformant PropertyShape gets inverted by sh:not and the whole NodeShape passes as valid. That is the result we are seeing in both PySHACL and the SHACL Playground.
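The evaluation order described above can be mimicked in a few lines of plain Python. This is only a toy model of the semantics, not PySHACL's actual code; every function name in it (`has_value_on_node`, `or_constraint`, `property_shape_conforms`, `original_shape_conforms`) is made up for illustration.

```python
# Toy model of how the original shape is evaluated (not PySHACL code).
# All function names here are hypothetical.

def has_value_on_node(value_node, expected):
    # sh:hasValue on a shape without sh:path: the node itself is compared.
    return value_node == expected

def or_constraint(value_node, alternatives):
    # sh:or passes for ONE value node if any nested shape passes for it.
    return any(has_value_on_node(value_node, alt) for alt in alternatives)

def property_shape_conforms(value_nodes, alternatives):
    # The property shape conforms only if EVERY value node passes sh:or.
    return all(or_constraint(node, alternatives) for node in value_nodes)

def original_shape_conforms(value_nodes, alternatives):
    # sh:not inverts the result of the property shape.
    return not property_shape_conforms(value_nodes, alternatives)

types_of_aaa = ["ex:AAA", "ex:BBB", "ex:CCC"]
alternatives = ["ex:BBB", "ex:CCC"]

# Result of sh:or for each value node individually.
per_node = {node: or_constraint(node, alternatives) for node in types_of_aaa}
print(per_node)
print(original_shape_conforms(types_of_aaa, alternatives))
```

Because ex:AAA is itself one of the value nodes and never equals ex:BBB or ex:CCC, the inner shape always fails for any instance of ex:AAA, and the sh:not always turns that failure into a pass.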
Hacking in some debugging output into the PySHACL code, I generated this output which might help to illustrate the problem:
[<Shape p=False node=https://example.com#Shape>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL15C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL15C13>] Fails
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL16C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL16C13>] Fails
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL15C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL15C13>] Fails
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL16C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL16C13>] Passes
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL15C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL15C13>] Passes
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL16C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>, <OrConstraintComponent>, <Shape p=False node=ub1bL16C13>] Fails
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=True node=ub1bL11C12>] Fails
[<Shape p=False node=https://example.com#Shape>] Passes
You can see the sh:hasValue constraint is executed 6 times (twice for each valueNode of sh:path). This is not what we want.
There are however a couple of different ways we can achieve what you're trying to do:
1) Simplest change - Move the property shapes into the sh:or
ex:Shape a sh:NodeShape ;
    sh:targetClass ex:AAA ;
    sh:not [
        sh:or (
            [ sh:path rdf:type ; sh:hasValue ex:BBB ; ]
            [ sh:path rdf:type ; sh:hasValue ex:CCC ; ]
        )
    ] .
sh:hasValue is not a "shape-expecting constraint". It has no child shapes.
In this case, having the sh:hasValue on the same shape as sh:path means it can take all of the valueNodes at the same time and produce a single result:
(ex:AAA, ex:BBB, ex:CCC) sh:hasValue ex:BBB = True
(ex:AAA, ex:BBB, ex:CCC) sh:hasValue ex:CCC = True
Both of these OR'd together = True
Invert that with NOT and you get False, which is what we are looking for.
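As a rough illustration of why this rewrite behaves differently, here is a toy Python model in which sh:hasValue sits on the same shape as sh:path and therefore sees the whole set of value nodes at once. The function names `has_value_on_path` and `rewritten_shape_conforms` are hypothetical, and this is a sketch of the semantics rather than PySHACL's implementation.

```python
# Toy model of option 1 (not PySHACL code); function names are hypothetical.

def has_value_on_path(value_nodes, expected):
    # sh:hasValue on a property shape: passes if the expected value
    # is among the value nodes reached via sh:path.
    return expected in value_nodes

def rewritten_shape_conforms(value_nodes, alternatives):
    # sh:or over the two property shapes, then inverted by sh:not.
    any_present = any(has_value_on_path(value_nodes, alt) for alt in alternatives)
    return not any_present

overlapping = {"ex:AAA", "ex:BBB", "ex:CCC"}
only_aaa = {"ex:AAA"}

print(rewritten_shape_conforms(overlapping, ["ex:BBB", "ex:CCC"]))
print(rewritten_shape_conforms(only_aaa, ["ex:BBB", "ex:CCC"]))
```

Here each sh:hasValue check runs once per alternative rather than once per value node, so the presence of ex:AAA among the types no longer forces a failure.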
This time the debug output looks like this:
[<Shape p=False node=https://example.com#Shape>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=False node=ub1bL21C12>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=False node=ub1bL21C12>, <OrConstraintComponent>, <Shape p=True node=ub1bL24C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=False node=ub1bL21C12>, <OrConstraintComponent>, <Shape p=True node=ub1bL24C13>] Passes
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=False node=ub1bL21C12>, <OrConstraintComponent>, <Shape p=True node=ub1bL23C13>] Start
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=False node=ub1bL21C12>, <OrConstraintComponent>, <Shape p=True node=ub1bL23C13>] Passes
[<Shape p=False node=https://example.com#Shape>, <NotConstraintComponent>, <Shape p=False node=ub1bL21C12>] Passes
[<Shape p=False node=https://example.com#Shape>] Fails
You can see this time sh:hasValue is only executed twice, which is all that is needed to determine conformance.
2) Similar to the above, but flipping some logic around. Move the sh:not into the sh:or, but change sh:or to sh:and.
ex:Shape a sh:NodeShape ;
    sh:targetClass ex:AAA ;
    sh:and (
        [ sh:not [ sh:path rdf:type ; sh:hasValue ex:BBB ; ] ]
        [ sh:not [ sh:path rdf:type ; sh:hasValue ex:CCC ; ] ]
    ) .
This is exactly the same logic as the above, and gives the same result, but because sh:and and sh:not are flipped, it is executed differently within PySHACL.
[<Shape p=False node=https://example.com#Shape>] Start
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL15C5>] Start
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL15C5>, <NotConstraintComponent>, <Shape p=True node=ub1bL15C13>] Start
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL15C5>, <NotConstraintComponent>, <Shape p=True node=ub1bL15C13>] Passes
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL15C5>] Fails
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL12C5>] Start
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL12C5>, <NotConstraintComponent>, <Shape p=True node=ub1bL12C13>] Start
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL12C5>, <NotConstraintComponent>, <Shape p=True node=ub1bL12C13>] Passes
[<Shape p=False node=https://example.com#Shape>, <AndConstraintComponent>, <Shape p=False node=ub1bL12C5>] Fails
[<Shape p=False node=https://example.com#Shape>] Fails
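The statement that options 1 and 2 express the same logic is De Morgan's law: NOT (A OR B) equals (NOT A) AND (NOT B). A small sketch that checks this over every combination, using the hypothetical names `option_1` and `option_2` and booleans standing in for "the node has type ex:BBB" and "the node has type ex:CCC":

```python
from itertools import product

# Hypothetical stand-ins for the two shape layouts; each boolean says
# whether the focus node has the given rdf:type.

def option_1(has_bbb, has_ccc):
    # sh:not [ sh:or ( BBB CCC ) ]
    return not (has_bbb or has_ccc)

def option_2(has_bbb, has_ccc):
    # sh:and ( [ sh:not BBB ] [ sh:not CCC ] )
    return (not has_bbb) and (not has_ccc)

# One row per combination: (has_bbb, has_ccc, option 1 result, option 2 result).
truth_table = [
    (b, c, option_1(b, c), option_2(b, c))
    for b, c in product([False, True], repeat=2)
]

for row in truth_table:
    print(row)
```

The two result columns agree on every row, so only the execution order inside the validator differs, not the outcome.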
3) Use sh:class instead of a PropertyShape and path.
I get that there is probably a good reason you're using sh:path rdf:type, but there is the built-in sh:class mechanism in PySHACL that will do this (and it also checks one level of subclass for free too).
ex:Shape a sh:NodeShape ;
    sh:targetClass ex:AAA ;
    sh:and (
        [ sh:not [ sh:class ex:BBB ] ]
        [ sh:not [ sh:class ex:CCC ] ]
    ) .
You can see this still uses the sh:and and sh:not pattern, same as above, but removes the need to have any PropertyShapes.
4) Complete the loop
Now that we're using sh:class we don't have a PropertyShape, so we can go back to using the original sh:not and sh:or setup:
ex:Shape a sh:NodeShape ;
    sh:targetClass ex:AAA ;
    sh:not [
        sh:or (
            [ sh:class ex:BBB ]
            [ sh:class ex:CCC ]
        )
    ] .
This is probably the form I'd use in this situation.
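For anyone who wants to try this form, below is a minimal sketch of running it through PySHACL with rdflib. The `ex:` namespace IRI is taken from the debug output above; the data triples, the node `ex:only_aaa`, and the helper name `conforms` are assumptions for illustration, since the original data graph is not reproduced in this thread.

```python
# Option 4 shape graph. The prefix declarations are added so the
# snippet parses on its own.
SHAPES_TTL = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <https://example.com#> .

ex:Shape a sh:NodeShape ;
    sh:targetClass ex:AAA ;
    sh:not [
        sh:or (
            [ sh:class ex:BBB ]
            [ sh:class ex:CCC ]
        )
    ] .
"""

DATA_PREFIX = "@prefix ex: <https://example.com#> .\n"
# Assumed test data: one node typed with all three classes, one with only ex:AAA.
OVERLAPPING_DATA = DATA_PREFIX + "ex:aaa a ex:AAA, ex:BBB, ex:CCC .\n"
CLEAN_DATA = DATA_PREFIX + "ex:only_aaa a ex:AAA .\n"

def conforms(data_ttl, shapes_ttl=SHAPES_TTL):
    """Hypothetical helper: return True if the data graph conforms to the shapes."""
    # Third-party imports kept local so the constants above can be
    # inspected without rdflib/pyshacl installed.
    from rdflib import Graph
    from pyshacl import validate

    data_graph = Graph().parse(data=data_ttl, format="turtle")
    shapes_graph = Graph().parse(data=shapes_ttl, format="turtle")
    # validate() returns (conforms, results_graph, results_text).
    result, _results_graph, _results_text = validate(
        data_graph, shacl_graph=shapes_graph
    )
    return result
```

Calling `conforms(OVERLAPPING_DATA)` should report a violation for ex:aaa, while `conforms(CLEAN_DATA)` should pass, which is the disjointness behaviour the question was after.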
Hope this helped!
Thanks a lot for the detailed answers! I learned a lot about the logic and alternatives. The key point seems to be that the rdf:type ex:AAA is not treated specially in the sh:or logic, which makes sense. Your option 4 is the most attractive to me as it's semantically what I want to represent. I just didn't know sh:class can be applied to a NodeShape as well.
Again, thanks a lot for the quick and detailed response!
Hi there, I'm trying to implement validation like owl:disjointWith in SHACL. To do that, I'm applying sh:not with sh:or as in the below example.
shape graph
However, the data graph with an instance whose type is both AAA and BBB is not detected as not conforming.
The validation code with data graph:
This is failing by not generating any validation errors. The shape works without sh:or, so I'm curious whether I misunderstood the concept of sh:or or there is a possible bug.
Any input would be appreciated. Thanks a lot.