RDFLib / pySHACL

A Python validator for SHACL
Apache License 2.0

pySHACL just a 'driver'? #174

Closed by majidaldo 1 year ago

majidaldo commented 1 year ago

Given some discussion, would it make sense to have this project just drive some graph implementation via SPARQL?

On a related note, the core functionality exhibited here is that it's an entailment engine based on SHACL, but it could also be based on SPARQL. (SHACL constraints/rules can be SPARQL ASK/CONSTRUCT queries, right?)

ashleysommer commented 1 year ago

Hi @majidaldo, you are right that it is possible to implement all SHACL constraints using standardized SPARQL implementations. Indeed, the W3C SHACL specification shows example SPARQL implementations of each constraint, so it would be simple to use those as given.

PySHACL operates in exactly the opposite manner, and does so deliberately. PySHACL uses the Python RDFLib library under the hood to perform RDF functionality. The SPARQL parser and SPARQL execution engine in RDFLib are famously slow. It has been a design goal since the very early stages of PySHACL to prioritise speed of execution, so it is a deliberate decision to avoid using the SPARQL engine as much as possible. Each constraint is hand-written and hand-tweaked in Python to perform raw graph lookups, to extract maximum performance from the validation engine.

I agree that developing a new Python SHACL validation engine that operates simply as a driver over a set of canonical SPARQL constraints from a managed constraint registry sounds like a great project to undertake, and it would be useful to have for comparing engines. However, PySHACL is not the place to do that.