ruby-rdf / shacl

Implementation of Shapes Constraint Language (SHACL) for RDF.rb
The Unlicense

SHACL-SPARQL #3

Closed by saumier 1 month ago

saumier commented 2 years ago

It would be very useful to extend the capabilities of SHACL to include SPARQL-based constraints.

My specific use case is to detect duplicate values of a property across all entities in a data graph, which is something SHACL Core cannot do AFAIK.

This feature request is referring to the draft SHACL Advanced Features 1.1 Specification https://w3c.github.io/shacl/shacl-af/
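
To make the use case concrete, here is a minimal sketch in plain Ruby of the check being asked for: find values of one property that are shared by more than one entity. It does not use the RDF.rb or shacl gems; the triples, the `ex:identifier` property, and the `duplicate_values` helper are hypothetical and only illustrate why the check needs to look across all entities rather than at one focus node at a time.

```ruby
# Hypothetical, illustrative data: [subject, predicate, value] triples.
TRIPLES = [
  ['ex:Event1', 'ex:identifier', 'A-100'],
  ['ex:Event2', 'ex:identifier', 'A-200'],
  ['ex:Event3', 'ex:identifier', 'A-100'],
  ['ex:Event3', 'ex:name',       'A-100'] # different property, must be ignored
].freeze

# Returns { value => [subjects] } for every value of `predicate`
# that is used by more than one distinct subject.
def duplicate_values(triples, predicate)
  triples
    .select { |_subj, pred, _value| pred == predicate }
    .group_by { |_subj, _pred, value| value }
    .transform_values { |rows| rows.map(&:first).uniq }
    .select { |_value, subjects| subjects.size > 1 }
end

p duplicate_values(TRIPLES, 'ex:identifier')
```

With the sample data, only `A-100` is reported, shared by `ex:Event1` and `ex:Event3`. A SPARQL-based constraint would express the same grouping as a query over the data graph.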

gkellogg commented 2 years ago

I’ll look into it. I just finished a major update to the SPARQL gem, so it’s a good time to look back into shacl.

gkellogg commented 2 years ago

Much of what you want should now be on the develop branch, but it depends on unreleased features added to the rdf and sparql gems. Try running with a Gemfile based on the one in this repository.

More work remains for SPARQL-based Constraint Components and more Advanced Features. You should be able to use all the features described in SPARQL-based Constraints.

saumier commented 2 years ago

Amazing. I’ll test out some SPARQL-based constraints that I need in my projects and let you know how it goes :-)

saumier commented 2 years ago

Hi. I have a simple SHACL-SPARQL example in this repo that is not working as expected. Running bundle exec ruby test-country.rb does not produce a shacl:Violation with the sh:SPARQLConstraint. The example is textbook. I am loading 'develop' branch gems. So for now I am blocked.

gkellogg commented 2 years ago

Thanks for the example. That essentially duplicates a passing test case from the shacl test suite, so there may be some other setup issue. I should have it straightened out later today.

gkellogg commented 2 years ago

I notice that the property shape includes sh:minCount 2, and none of the tested objects has more than one property value for ex:germanLabel. If you remove that constraint, your test passes, but it turns out that's a false positive.

Additionally, the query uses the prefix ex, but it is not defined. In the original test, it would be defined where referenced using something like:

<http://datashapes.org/sh/tests/sparql/node/sparql-003.test>
  sh:declare [
      rdf:type sh:PrefixDeclaration ;
      sh:namespace "http://datashapes.org/sh/tests/sparql/node/sparql-003.test#"^^xsd:anyURI ;
      sh:prefix "ex" ;
    ] ;
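
As a quick way to spot this kind of problem before running validation, the following is a rough sketch in plain Ruby that lists prefixes a query string uses but never declares. The `undeclared_prefixes` helper is hypothetical and not part of the shacl or sparql gems; it relies on simple regular expressions rather than a real SPARQL parser, so treat it as a sanity check only. The `declared` argument stands in for prefixes supplied elsewhere, for example through sh:declare.

```ruby
# Hypothetical helper (not part of any gem): returns the sorted list of
# prefixes that appear in prefixed names in `query` but are neither
# declared with PREFIX in the query nor listed in `declared`.
def undeclared_prefixes(query, declared = [])
  # Remove IRIs and double-quoted strings so that "http:" or text inside
  # literals is not mistaken for a prefixed name.
  stripped = query.gsub(/<[^>\s]*>/, ' ').gsub(/"(?:[^"\\]|\\.)*"/, ' ')

  # Prefixes declared inside the query itself, e.g. "PREFIX ex: <...>".
  in_query = stripped.scan(/\bPREFIX\s+([A-Za-z][\w-]*):/i).flatten

  # Prefixes used in prefixed names such as ex:germanLabel.
  used = stripped.scan(/(?<![\w:?$])([A-Za-z][\w-]*):[A-Za-z_]/).flatten

  (used - in_query - declared).uniq.sort
end

QUERY = <<~SPARQL
  SELECT $this ?value
  WHERE {
    $this ex:germanLabel ?value .
    FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
  }
SPARQL

p undeclared_prefixes(QUERY)          # ex is used but not declared
p undeclared_prefixes(QUERY, ['ex'])  # ex supplied externally
```

Either adding a `PREFIX ex: <...>` line to the query or passing the prefix in `declared` makes the reported list empty.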

In the test suite, the data graph and the shapes graph are combined, and this was masking a problem creating an aggregated repo containing them both (which is required for some use cases). It turns out you're missing a develop gem:

gem 'rdf-aggregate-repo', git: 'https://github.com/ruby-rdf/rdf-aggregate-repo.git', branch: 'develop'

Try again using the develop version of rdf-aggregate-repo, and either add property values to satisfy minCount or remove that restriction. Also either make the PREFIX definition explicit in the query or use a properly scoped sh:declare. You can also add a logger: option to SHACL.open (you should/will be able to with shacl.execute as well), which can emit some runtime detail, and/or add SXP::Generator.print(shacl.to_sxp_bin) to get some insight into the parsed SHACL constraints.

(shapes
 (
  (NodeShape
   (id <http://example.com/ns#LanguageExampleShape>)
   (type shacl:NodeShape)
   (targetClass <http://example.com/ns#Country>)
   (PropertyShape (path <http://example.com/ns#germanLabel>))
   (sparql
    (type shacl:SPARQLConstraint)
    (message "Values are literals with German language tag.")
    (prefix
     ((ex: <http://example.com/ns#>))
     (project
      (?this ?value ?path)
      (extend
       ((?path ex:germanLabel))
       (filter
        (|| (! (isLiteral ?value)) (! (langMatches (lang ?value) "de")))
        (bgp (triple ?this ex:germanLabel ?value))) )) )) )) )
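
The filter in the parsed constraint above selects values that are not literals or whose language tag does not match "de". Here is a minimal plain-Ruby sketch of that logic. The `Literal` struct and both helper methods are hypothetical stand-ins rather than the RDF.rb or SPARQL gem API, and `lang_matches?` only approximates SPARQL's langMatches using basic language-range filtering.

```ruby
# Hypothetical stand-in for an RDF literal: a lexical value plus an
# optional language tag (nil when the literal has no tag).
Literal = Struct.new(:value, :language)

# Approximation of SPARQL langMatches: the range "*" matches any
# non-empty tag; otherwise the tag must equal the range or start with
# the range followed by "-", compared case-insensitively.
def lang_matches?(tag, range)
  tag = tag.to_s.downcase
  range = range.to_s.downcase
  return false if tag.empty?

  range == '*' || tag == range || tag.start_with?("#{range}-")
end

# Mirrors the FILTER: a value is a violation when it is not a literal
# or when its language tag does not match "de".
def violates_german_label?(value)
  !value.is_a?(Literal) || !lang_matches?(value.language, 'de')
end

p violates_german_label?(Literal.new('Spain', 'en'))
p violates_german_label?(Literal.new('Spanien', 'de'))
```

Under this sketch, a value such as "Spain"@en is selected by the filter, while "Spanien"@de is not.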

Result I get after removing the minCount constraint is:

false
Result for: "Spain"@en
  focus: <http://example.com/ns#InvalidCountry>
  path: <http://example.com/ns#germanLabel>
  shape: <http://example.com/ns#LanguageExampleShape>
  resultSeverity: shacl:Violation
  component: shacl:SPARQLConstraintComponent
  message: "Values are literals with German language tag."

saumier commented 2 years ago

Thanks for your detailed response. I was able to get it to work as expected. My problem was the undefined PREFIX ex: that you pointed out. I had the PREFIX ex: defined in the shape Turtle, but it was not scoped or defined explicitly in the SPARQL query. Thanks also for the logger: option. Very helpful.