Closed tduval-unifylogic closed 1 year ago
Hi @tduval-unifylogic
PySHACL only validates data against SHACL constraints. rdfs:range is helpful for describing an RDFS schema or ontology, but it is not a SHACL constraint, and PySHACL ignores it in this example.
To achieve what you want in your example here, you need the SHACL ClassConstraintComponent (sh:class); it validates that each value of a given property is an instance of Person or an instance of a subclass of Person.
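As a rough illustration of what "instance of Person or instance of a subclass of Person" means, here is a minimal pure-Python sketch of a class check over triples stored as tuples. It is not PySHACL's implementation; the helper names, the ex: individuals, and the ex:Student class are hypothetical and exist only for this example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def superclasses(graph, cls):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def is_instance_of(graph, node, cls):
    """True if node has an rdf:type that is cls or a (transitive) subclass of cls."""
    return any(
        s == node and p == RDF_TYPE and cls in superclasses(graph, o)
        for s, p, o in graph
    )


def class_violations(graph, path, cls):
    """Return (focus, value) pairs where the value of `path` is not an instance of cls."""
    return sorted(
        (s, o) for s, p, o in graph
        if p == path and not is_instance_of(graph, o, cls)
    )


# Hypothetical data: ex:Student is an assumed subclass, used only for illustration.
graph = {
    ("ex:event1", "schema:attendee", "ex:alice"),
    ("ex:event1", "schema:attendee", "ex:acme"),
    ("ex:alice", RDF_TYPE, "ex:Student"),
    ("ex:Student", SUBCLASS_OF, "schema:Person"),
    ("ex:acme", RDF_TYPE, "schema:Organization"),
}

violations = class_violations(graph, "schema:attendee", "schema:Person")
print(violations)
```

ex:alice passes through the subclass link, while ex:acme is reported because none of its types lead to schema:Person.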
A working example shapes file would look something like this:
@prefix schema: <http://schema.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
schema:Organization a owl:Class .
schema:attendee a owl:ObjectProperty .
schema:Event a owl:Class, sh:NodeShape ;
    sh:name "AttendeeIsPerson" ;
    sh:description "Validates that an event's attendees are instances of class Person." ;
    sh:property [
        sh:path schema:attendee ;
        sh:class schema:Person
    ] .
schema:Person a owl:Class, sh:NodeShape ;
    sh:name "AttendeeOnEvent" ;
    sh:description "Validates that this person was an attendee of an instance of Event." ;
    sh:property [
        sh:path [ sh:inversePath schema:attendee ] ;
        sh:class schema:Event
    ] .
Or, if you want to more accurately match the semantics of rdfs:domain and rdfs:range, instead of targeting the classes, you can target using the attendee property itself, like this:
@prefix schema: <http://schema.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
schema:Organization a owl:Class .
schema:Person a owl:Class .
schema:Event a owl:Class .
schema:attendee a owl:ObjectProperty .
schema:attendeeIsPerson a sh:NodeShape ;
    sh:targetObjectsOf schema:attendee ;
    sh:description "Validates that an event's attendees are instances of class Person." ;
    sh:class schema:Person .
schema:attendeeOnEvent a sh:NodeShape ;
    sh:targetSubjectsOf schema:attendee ;
    sh:description "Validates that this person was an attendee of an instance of Event." ;
    sh:class schema:Event .
@ashleysommer : I think @tduval-unifylogic 's example isn't ignored by SHACL. There were three things that look to me like they interact and cause the "incorrect" (to eyes) passing validation result.
The three things are these two triples:
schema:attendee rdfs:range schema:Person .
:event1 schema:attendee :organization1 .
and the inference='both' parameter on the call to validate.
If inference were none, the example would have raised a validation error. But with inference='both', an additional triple is inferred/expanded/entailed, sufficiently from either RDFS entailment or OWL entailment (IIRC, either entailment scheme would cause this; at least RDFS does):
:organization1 a schema:Person .
So, the SHACL property shape from @tduval-unifylogic is satisfied after RDFS (and/or OWL) entailment has occurred.
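The interaction described above can be sketched in plain Python. This is only a toy model of the RDFS range rule (rdfs3) applied before a direct type check; it is not how PySHACL or its reasoner is implemented, the function names are made up for illustration, and the type check deliberately ignores subclasses to stay short.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def apply_rdfs_range(graph):
    """Return graph plus triples entailed by rdfs3:
    (p rdfs:range C) and (s p o)  =>  (o rdf:type C)."""
    ranges = [(s, o) for s, p, o in graph if p == RDFS_RANGE]
    inferred = {
        (o, RDF_TYPE, cls)
        for prop, cls in ranges
        for s, p, o in graph
        if p == prop
    }
    return set(graph) | inferred


def class_violations(graph, path, cls):
    """Values of `path` that lack a direct rdf:type of cls (subclasses ignored)."""
    return sorted(
        o for s, p, o in graph
        if p == path and (o, RDF_TYPE, cls) not in graph
    )


# The two triples from the discussion, plus the asserted type of the organization.
data = {
    ("schema:attendee", RDFS_RANGE, "schema:Person"),
    (":event1", "schema:attendee", ":organization1"),
    (":organization1", RDF_TYPE, "schema:Organization"),
}

before = class_violations(data, "schema:attendee", "schema:Person")
expanded = apply_rdfs_range(data)
after = class_violations(expanded, "schema:attendee", "schema:Person")

print(before)  # the organization is flagged without inference
print(after)   # nothing is flagged once the range rule has fired
```

Without the range rule the organization fails the class check; once the rule adds `:organization1 a schema:Person`, the same check finds nothing to report.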
(This next comment is just extra fun on top of the diagnostics.)
I poked around the schema.org documentation, and found that while they have an OWL "render" of the schema.org vocabulary available (see this page, end of "Experimental" section), it includes no OWL disjointness statements. So, it is OWL-consistent with schema.org that you could have a thing that is both a schema:Person and schema:Organization.
For further reference, here is the OWL definition of schema:attendee from their schema, version 15.0.
schema:attendee
    a owl:ObjectProperty ;
    rdfs:label "attendee"@en ;
    rdfs:comment "A person or organization attending the event."@en ;
    rdfs:domain [
        a owl:Class ;
        owl:unionOf (
            schema:Event
        ) ;
    ] ;
    rdfs:isDefinedBy schema:attendee ;
    rdfs:range [
        a owl:Class ;
        owl:unionOf (
            schema:Organization
            schema:Person
            schema:Text
            schema:URL
            schema:Role
        ) ;
    ] ;
    .
Note that there is a difference in rdfs:range between what schema.org provides and what was in the initial example. (And, hm, there also appears to be a bug somewhere that mixed in a few classes in the range.) The non-OWL definition of schema:attendee avoids the rdfs:range conflict by using schema:rangeIncludes instead, which avoids RDFS entailment issues, but means schema.org needs to define their own entailment system:
schema:attendee
    a rdf:Property ;
    rdfs:label "attendee" ;
    rdfs:comment "A person or organization attending the event." ;
    schema:domainIncludes schema:Event ;
    schema:rangeIncludes
        schema:Organization,
        schema:Person
        ;
    .
One last point on disjointness and RDFS expansion: schema.org lacking owl:disjointWith statements entirely means it is also OWL-consistent with the schema.org vocabulary that you could have a thing that is both a schema:Person and schema:AMRadioChannel. rdfs:domain and rdfs:range could somehow end up causing such an inference. That is one reason to choose among methods for how to encode your properties (rdfs:range? schema:rangeIncludes? http://purl.org/dc/dcam/rangeIncludes?), and whether your model should include ontological practices that include some foundational disjoint classes, as well as a mechanism for detecting OWL consistency (namely that no individual is a member of two disjoint classes).
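The consistency check mentioned above (no individual is a member of two disjoint classes) can be sketched in a few lines of plain Python. This is a simplified illustration rather than a real OWL reasoner, and the owl:disjointWith axiom in the sample data is an assumption added for the example; schema.org does not declare it.

```python
RDF_TYPE = "rdf:type"
DISJOINT_WITH = "owl:disjointWith"


def disjointness_conflicts(graph):
    """Return (individual, (classA, classB)) for every individual typed
    with both members of a declared owl:disjointWith pair."""
    disjoint_pairs = {
        frozenset((s, o)) for s, p, o in graph
        if p == DISJOINT_WITH and s != o
    }
    types = {}
    for s, p, o in graph:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    conflicts = [
        (node, tuple(sorted(pair)))
        for node, classes in types.items()
        for pair in disjoint_pairs
        if pair <= classes
    ]
    return sorted(conflicts)


# Types as they would look after the range inference discussed in this thread.
typed_data = {
    (":organization1", RDF_TYPE, "schema:Organization"),
    (":organization1", RDF_TYPE, "schema:Person"),
}

# Hypothetical axiom: NOT part of schema.org, added only for this sketch.
axiom = {("schema:Person", DISJOINT_WITH, "schema:Organization")}

without_axiom = disjointness_conflicts(typed_data)
with_axiom = disjointness_conflicts(typed_data | axiom)

print(without_axiom)
print(with_axiom)
```

With no disjointness axiom the doubly typed individual goes unnoticed; adding a foundational disjoint pair is what makes the unintended inference detectable.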
@ajnelson-nist thanks so much for your explanation! It makes total sense now and I am getting the result I expect...
Shameless plug (since you folks are into reasoners): Please check out the recently published N3 Builtin Functions Documentation: https://domel.github.io/n3builtins/specification/
@tduval-unifylogic you're welcome.
Meanwhile, I've filed a bug on the range expansion issue that came up in discussion, here.
@ajnelson-nist @tduval-unifylogic
There were three things that look to me like they interact and cause the "incorrect" (to eyes) passing validation result. If inference were none, the example would have raised a validation error.
I agree with you that there is an issue with rdfs:range and rdfs:domain in schema.org, and it is something they need to address.
However, it doesn't change my original answer. There are only two SHACL Shapes in the example given. One is on schema:Event and the other is on schema:Person. The first shape has only one constraint, it is the sh:property constraint, and it triggers the second shape schema:Person, that has no SHACL constraints defined on it. The rdfs:range and rdfs:domain declarations are not SHACL constraints, and are not used by PySHACL.
PySHACL sees no constraints to run, so the result is a passing validation result. Whether 'rdfs' inferencing is used or not is irrelevant, because when running this example in a Python debugger and stepping through the code, it is easy to see the validator exits early with a passing validation result after finding no constraints to test.
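To make the "no constraints, so it passes" point concrete, here is a toy model in plain Python. It is not PySHACL's code path: shapes are modelled as lists of hypothetical check functions, and a node conforms to a shape when every check passes, so a shape with an empty list conforms vacuously.

```python
RDF_TYPE = "rdf:type"


def conforms(node, shape_name, shapes, graph):
    """A node conforms when all constraints of the shape hold.
    all() over an empty list is True, so an empty shape always passes."""
    return all(check(node, shapes, graph) for check in shapes[shape_name])


def node_constraint(path, target_shape):
    """Toy stand-in for sh:node on a property: each value must conform to target_shape."""
    def check(focus, shapes, graph):
        values = [o for s, p, o in graph if s == focus and p == path]
        return all(conforms(v, target_shape, shapes, graph) for v in values)
    return check


def class_constraint(path, cls):
    """Toy stand-in for sh:class on a property: each value must be typed cls
    (direct types only, to keep the sketch short)."""
    def check(focus, shapes, graph):
        values = [o for s, p, o in graph if s == focus and p == path]
        return all((v, RDF_TYPE, cls) in graph for v in values)
    return check


graph = {
    (":event1", "schema:attendee", ":organization1"),
    (":organization1", RDF_TYPE, "schema:Organization"),
}

# Shapes as in the original report: sh:node pointing at a shape with no constraints.
shapes_with_node = {
    "schema:Event": [node_constraint("schema:attendee", "schema:Person")],
    "schema:Person": [],
}

# Shapes using sh:class instead, as suggested earlier in the thread.
shapes_with_class = {
    "schema:Event": [class_constraint("schema:attendee", "schema:Person")],
    "schema:Person": [],
}

result_node = conforms(":event1", "schema:Event", shapes_with_node, graph)
result_class = conforms(":event1", "schema:Event", shapes_with_class, graph)

print(result_node, result_class)
```

The sh:node variant passes regardless of what the attendee is, because the referenced shape has nothing to test; the sh:class variant rejects the organization.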
@ashleysommer : Ah! You're right. I saw this:
schema:Event a owl:Class, sh:NodeShape ;
    sh:property [
        sh:path schema:attendee ;
        sh:node schema:Person
    ] .
and missed that the PropertyShape is using sh:node. I thought it said sh:class. So, that was a misread on my part, and the basis for the rest of my remark. Sorry for the confusion.
Greetings. I really like pySHACL, but I have a small issue. It's likely I'm doing something incorrect. For some reason, I looked at issue #40 which looks like it has range validation in it, and attempted to recreate with a simple ontology and data graph, but I am not getting a validation error on :event2's range?
Any insight would be helpful. Thank you.
Here is my code:
Python 3.9.6 rdflib 6.1.1 pySHACL 0.20.0
Here is the result I get: