https://mybinder.org/v2/gh/hsolbrig/pyshex/master
This package is a reasonably literal implementation of the Shape Expressions Language 2.0. It can parse and "execute" ShExC and ShExJ source.
See test_manifest.py for details.

pip install PyShEx
Note: If you need to escape single quotes in RDF literals, you will need to install the bleeding edge of rdflib:
pip uninstall rdflib
pip install git+https://github.com/rdflib/rdflib
Unfortunately, rdflib-jsonld is NOT compatible with the bleeding edge rdflib, so you can't use a JSON-LD parser in this situation.
> shexeval -h
usage: shexeval [-h] [-f FORMAT] [-s START] [-ut] [-sp STARTPREDICATE]
                [-fn FOCUS] [-A] [-d] [-ss] [-cf] [-sq SPARQL] [-se]
                [--stopafter STOPAFTER] [-ps] [-pr] [-gn GRAPHNAME] [-pb]
                rdf shex

positional arguments:
  rdf                   Input RDF file or SPARQL endpoint if slurper or sparql
                        options
  shex                  ShEx specification

optional arguments:
  -h, --help            show this help message and exit
  -f FORMAT, --format FORMAT
                        Input RDF Format
  -s START, --start START
                        Start shape. If absent use ShEx start node.
  -ut, --usetype        Start shape is rdf:type of focus
  -sp STARTPREDICATE, --startpredicate STARTPREDICATE
                        Start shape is object of this predicate
  -fn FOCUS, --focus FOCUS
                        RDF focus node
  -A, --allsubjects     Evaluate all non-bnode subjects in the graph
  -d, --debug           Add debug output
  -ss, --slurper        Use SPARQL slurper graph
  -cf, --flattener      Use RDF Collections flattener graph
  -sq SPARQL, --sparql SPARQL
                        SPARQL query to generate focus nodes
  -se, --stoponerror    Stop on an error
  --stopafter STOPAFTER
                        Stop after N nodes
  -ps, --printsparql    Print SPARQL queries as they are executed
  -pr, --printsparqlresults
                        Print SPARQL query and results
  -gn GRAPHNAME, --graphname GRAPHNAME
                        Specific SPARQL graph to query - use '' for any named
                        graph
  -pb, --persistbnodes  Treat BNodes as persistent in SPARQL endpoint
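As a rough illustration of how the options above combine, the sketch below assembles a shexeval argument list from Python. The helper `build_shexeval_command`, the file names, the example IRIs, and the `turtle` format value are all assumptions made for this example; only the flags themselves come from the help text.

```python
import shlex
from typing import List, Optional


def build_shexeval_command(rdf: str, shex: str,
                           rdf_format: Optional[str] = None,
                           start: Optional[str] = None,
                           focus: Optional[str] = None,
                           debug: bool = False) -> List[str]:
    """Build an argv list for shexeval (hypothetical helper, not part of PyShEx)."""
    args = ["shexeval"]
    if rdf_format is not None:
        args += ["-f", rdf_format]   # input RDF format
    if start is not None:
        args += ["-s", start]        # start shape
    if focus is not None:
        args += ["-fn", focus]       # RDF focus node
    if debug:
        args.append("-d")            # add debug output
    # The two positional arguments come last: the RDF input, then the ShEx schema.
    args += [rdf, shex]
    return args


cmd = build_shexeval_command(
    "data.ttl", "schema.shex",                 # placeholder file names
    rdf_format="turtle",                       # assumed format name
    start="http://example.org/PersonShape",    # placeholder shape IRI
    focus="http://example.org/alice",          # placeholder focus node
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when an argument contains spaces, as a SPARQL query passed with -sq would. The list can be handed to `subprocess.run` if shexeval is installed and on the PATH.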
See the examples Jupyter notebooks for sample uses.
The root pyshex package is subdivided into:
The ShEx schema definitions for this package come from ShExJSG
We are trying to keep the Python as close as possible to the (semi-)formal specification. As an example, the statement:
Se is a ShapeAnd and for every shape expression se2 in shapeExprs, satisfies(n, se2, G, m)
is implemented in Python as:
    ...
    if isinstance(se, ShExJ.ShapeAnd):
        return satisfiesShapeAnd(cntxt, n, se)
    ...

def satisfiesShapeAnd(cntxt: Context, n: nodeSelector, se: ShExJ.ShapeAnd) -> bool:
    return all(satisfies(cntxt, n, se2) for se2 in se.shapeExprs)
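To make the correspondence concrete, here is a minimal, self-contained sketch of the same dispatch pattern. The classes `ShapeAnd`, `ShapeOr`, and `ValueIn` and the simplified `satisfies` signature are hypothetical stand-ins for illustration. They are not the ShExJ classes or the PyShEx API, and the context and graph arguments are omitted.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class ValueIn:
    """Toy node constraint: the node must be one of a fixed set of values."""
    values: Set[str]


@dataclass
class ShapeAnd:
    shapeExprs: List["ShapeExpr"]


@dataclass
class ShapeOr:
    shapeExprs: List["ShapeExpr"]


ShapeExpr = Union[ValueIn, ShapeAnd, ShapeOr]


def satisfies(n: str, se: ShapeExpr) -> bool:
    """Dispatch on the expression type, one branch per rule in the specification."""
    if isinstance(se, ShapeAnd):
        # "for every shape expression se2 in shapeExprs, satisfies(n, se2, ...)"
        return all(satisfies(n, se2) for se2 in se.shapeExprs)
    if isinstance(se, ShapeOr):
        # "there is some shape expression se2 in shapeExprs such that ..."
        return any(satisfies(n, se2) for se2 in se.shapeExprs)
    if isinstance(se, ValueIn):
        return n in se.values
    raise TypeError(f"Unsupported shape expression: {se!r}")


both = ShapeAnd([ValueIn({"a", "b"}), ValueIn({"b", "c"})])
print(satisfies("b", both))  # True: "b" is in both value sets
print(satisfies("a", both))  # False: "a" is missing from the second set
```

Each branch reads almost word for word like the rule it implements, which is the property the package aims for.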
This package is built using:
This implementation passes all of the tests in the master branch of validation/manifest.ttl with the following exceptions:
At the moment, there are 1088 tests, of which:
rdflib does not preserve bnode "identity")
2) (18) sht:Import Uses ShEx 2.1 IMPORT feature -- not yet implemented (three aren't tagged)
3) (3) Uses manifest shapemap feature -- not yet implemented
4) (2) sht:relativeIRI -- this isn't a real problem, but we haven't taken time to deal with this in the test harness
5) (6) rdflib has a parsing error when escaping single quotes. (Issue submitted, awaiting release)

As mentioned above, at the moment this is as literal an implementation of the specification as was sensible. This means, in particular, that we are less than clever when it comes to partition management.
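To illustrate why a literal approach to partitioning can be expensive, the sketch below enumerates every way of assigning a node's triples to the sub-expressions of a triple expression, which is what a direct reading of "there is some partition of T" suggests. The function `all_partitions` and the string placeholders for triples are hypothetical and are not taken from the PyShEx source. The point is only that the number of candidate partitions grows as k^n for n triples and k sub-expressions.

```python
from itertools import product
from typing import Iterator, List, Sequence, Tuple


def all_partitions(triples: Sequence[str], k: int) -> Iterator[Tuple[List[str], ...]]:
    """Yield every assignment of the triples to k ordered, possibly empty, groups."""
    for assignment in product(range(k), repeat=len(triples)):
        groups: Tuple[List[str], ...] = tuple([] for _ in range(k))
        for triple, group_index in zip(triples, assignment):
            groups[group_index].append(triple)
        yield groups


triples = ["t1", "t2", "t3"]  # placeholder triples
count = sum(1 for _ in all_partitions(triples, 2))
print(count)  # 8, i.e. 2 ** 3
```

With 20 triples and 3 sub-expressions this naive enumeration already produces 3^20 (about 3.5 billion) candidates, so an evaluator that tests partitions one by one slows down quickly on nodes with many triples.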
docker build -t pyshex docker
docker run --rm -it pyshex -gn '' -ss -ut -pr -sq 'select distinct ?item where{?item a <http://w3id.org/biolink/vocab/Gene>} LIMIT 1' http://graphdb.dumontierlab.com/repositories/ncats-red-kg https://github.com/biolink/biolink-model/raw/master/shex/biolink-modelnc.shex