ISAITB / shacl-validator

Web and command-line application for the validation of RDF data.
https://joinup.ec.europa.eu/collection/interoperability-test-bed-repository/solution/rdf-validator
European Union Public License 1.2

Doing validation with query fails for properties with a sh:class restriction #1

Closed by Roensby 2 years ago

Roensby commented 2 years ago

I haven't fully wrapped my head around the SHACL validator yet, so this may just be a simple misunderstanding of what it can do.

Let's say that I load a triplestore with the schema.org models.

I then create a schema:Person, say

<http://example.org/bob> schema:name "Bob" .
<http://example.org/bob> rdf:type schema:Person .

Then I create a simple SHACL shape for schema:Event that restricts the schema:organizer to a schema:Person:

schema:Event
  a rdfs:Class ;
  a sh:NodeShape ;
  rdfs:label "Event" ;
  sh:property [
    sh:path schema:organizer ;
    sh:class schema:Person ;
  ] ;
.

I have hooked up the SHACL validator to the triplestore and I now use the SHACL query validation tool with a CONSTRUCT query to validate a schema:Event, referencing Bob as organizer:

CONSTRUCT { 
    <http://example.org/my-event> rdf:type <https://schema.org/Event> . 
    <http://example.org/my-event> schema:organizer ?newValueForOrganizer . 
} WHERE {
    VALUES ( ?newValueForOrganizer ) { ( <http://example.org/bob> ) } .
}

I would expect this to pass validation, since the triplestore has Bob stored as a schema:Person. However, it fails:

 Value must be an instance of schema:Person
Location:[Focus node] - [http://example.org/my-event] - [Result path] - [https://schema.org/organizer]
Test:[Value] - [http://example.org/bob]

Am I expecting too much in terms of the SHACL validator resolving <http://example.org/bob> and discovering its rdf:type via the connection to the triplestore?

In any case, thank you for your work!

Roensby commented 2 years ago

It seems that if I include a reference in the CONSTRUCT query to the type of the value, it validates successfully. I'll close it, since it seems to solve the issue.

Does not validate, because the validator cannot resolve the rdf:type of Bob:

CONSTRUCT { 
    <http://example.org/my-event> rdf:type schema:Event . 
    <http://example.org/my-event> schema:organizer ?organizer . 
} WHERE {
    ?organizer schema:name "Bob" .
}

Does validate, because the validator is given the type of Bob to construct with:

CONSTRUCT { 
    <http://example.org/my-event> rdf:type schema:Event . 
    <http://example.org/my-event> schema:organizer ?organizer . 
    ?organizer rdf:type ?organizerType .
} WHERE {
    ?organizer schema:name "Bob" .
    ?organizer rdf:type ?organizerType .
}

costas80 commented 2 years ago

You are right in your earlier comment, @Roensby. Having additional information in your triple store does not make that information available to the validator. The validator simply executes the SPARQL CONSTRUCT query and validates the graph it returns, so that graph must contain all the context needed to validate correctly against its shapes. As you correctly point out, in this example the query should also return the type of "Bob"; otherwise the validator has no way of knowing about it.
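To make this concrete, here is a minimal pure-Python sketch of what an sh:class check gets to see. It is not the ITB validator's code and not a real SHACL engine: the helper name `violates_sh_class`, the tuple-based graph and the `ex:` prefix are assumptions made purely for illustration, and rdfs:subClassOf is ignored for brevity. The point is only that the check looks at the triples in the graph the CONSTRUCT query returned and nothing else.

```python
# Toy model of a SHACL sh:class check over a graph of (subject, predicate, object) tuples.
# Hypothetical helper for illustration only; not the ITB validator's implementation.

RDF_TYPE = "rdf:type"

def violates_sh_class(graph, focus, path, required_class):
    """Return the values of `path` on `focus` that are not typed as `required_class` in `graph`."""
    values = [o for (s, p, o) in graph if s == focus and p == path]
    return [v for v in values if (v, RDF_TYPE, required_class) not in graph]

# Graph as returned by the first CONSTRUCT query: Bob's type is not part of it.
without_type = {
    ("ex:my-event", RDF_TYPE, "schema:Event"),
    ("ex:my-event", "schema:organizer", "ex:bob"),
}

# Graph as returned by the second CONSTRUCT query: Bob's type is returned as well.
with_type = without_type | {("ex:bob", RDF_TYPE, "schema:Person")}

print(violates_sh_class(without_type, "ex:my-event", "schema:organizer", "schema:Person"))
print(violates_sh_class(with_type, "ex:my-event", "schema:organizer", "schema:Person"))
```

The first call reports `ex:bob` as a violating value and the second reports none, which mirrors the two queries above: the triple store may know Bob's type, but the check only passes once that triple is in the validated graph.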

This is in fact a general point about validation with SHACL shapes, not one specific to the ITB's SHACL validator: you always need to ensure that all the context needed for validation, whether it relates to the input data or to the shapes themselves, is either configured in or provided to the validator. Taking foaf as a common example, if you write a shape that verifies something is a foaf:Agent and use it to validate a foaf:Person (a subclass of foaf:Agent), the validator also needs to be configured with the foaf vocabulary to understand that something which is a foaf:Person is also a foaf:Agent. In practice, you would provide the foaf vocabulary via a validator.shaclFile.XYZ config entry, either locally or remotely.
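The foaf example can be sketched the same way. The snippet below is a toy model, not the validator's implementation: the helpers `superclasses` and `is_instance_of` and the node `ex:alice` are hypothetical, and graphs are plain sets of tuples. It shows that a foaf:Person only counts as a foaf:Agent once the rdfs:subClassOf triple from the vocabulary is available alongside the data.

```python
# Toy model of class membership with rdfs:subClassOf, as an sh:class check would need it.
# Hypothetical helpers for illustration only; not the ITB validator's implementation.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

def superclasses(graph, cls):
    """Return `cls` plus every class reachable from it via rdfs:subClassOf in `graph`."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for (s, p, o) in graph:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen

def is_instance_of(graph, node, required_class):
    """True if `node` has a type in `graph` that is `required_class` or a subclass of it."""
    types = [o for (s, p, o) in graph if s == node and p == RDF_TYPE]
    return any(required_class in superclasses(graph, t) for t in types)

# Input data only: the node is typed as foaf:Person.
data = {("ex:alice", RDF_TYPE, "foaf:Person")}

# The one vocabulary triple that matters here.
foaf_vocabulary = {("foaf:Person", SUBCLASS_OF, "foaf:Agent")}

print(is_instance_of(data, "ex:alice", "foaf:Agent"))                    # False
print(is_instance_of(data | foaf_vocabulary, "ex:alice", "foaf:Agent"))  # True
```

Without the vocabulary the check fails even though the data is correct, which is why the vocabulary has to be supplied to the validator in addition to the shapes and the input.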