Given (i) an RdfStore (see rdf-stores) of triples, (ii) an RdfStore with a SHACL shape’s triples, and (iii) a target entity URI,
this library will extract all triples that belong to the entity.
If more triples of the entity are needed, extra triples are retrieved by dereferencing the relevant entity.
The algorithm is a proposal to be standardized as part of W3C’s TREE hypermedia Community Group as the member extraction algorithm. This algorithm needs to be efficient and unambiguously defined, so that various implementations of the member extraction algorithm will result in the same set of triples. As a trade-off, the resulting set of triples is not guaranteed to be validated by the SHACL shape.
The algorithm is inspired by, and sits in between, CBD and Shape Fragments, with thanks to Thomas Bergwinkl and his blog post on a SHACL engine.
npm install extract-cbd-shape
import {CBDShapeExtractor} from "extract-cbd-shape";
// ...
let extractor = new CBDShapeExtractor(shapesGraph);
let entityquads = await extractor.extract(store, entityId, shapeId, graphsToIgnore);
Tests and examples are provided in the tests library. Run them with mocha, which can be invoked using npm test.
This is an extension of CBD. It extracts:
To be discussed:
The first focus node is set by the user.
1a. If a shape is set, create a shape template and execute the shape template extraction algorithm.
1b. If no shape was set, extract all quads with the focus node as subject, and recursively include its blank nodes (see also CBD).
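Step 1b can be sketched as follows. This is a minimal illustration, not the library's implementation: the `Term` and `Quad` types, the helper functions and the example IRIs are all assumptions made for the example, and the quads live in a plain array rather than an RdfStore.

```typescript
// Simplified term and quad types (assumptions; the library works with RDF/JS quads).
type Term = { termType: "NamedNode" | "BlankNode" | "Literal"; value: string };
type Quad = { subject: Term; predicate: Term; object: Term };

const iri = (value: string): Term => ({ termType: "NamedNode", value });
const bnode = (value: string): Term => ({ termType: "BlankNode", value });
const lit = (value: string): Term => ({ termType: "Literal", value });

// Collect all quads whose subject is the focus node,
// then follow blank-node objects recursively.
function extractCbd(quads: Quad[], focus: Term): Quad[] {
  const result: Quad[] = [];
  const visited = new Set<string>();
  const queue: Term[] = [focus];
  while (queue.length > 0) {
    const node = queue.shift()!;
    const key = `${node.termType}:${node.value}`;
    if (visited.has(key)) continue; // guard against blank-node cycles
    visited.add(key);
    for (const q of quads) {
      const sameSubject =
        q.subject.termType === node.termType && q.subject.value === node.value;
      if (!sameSubject) continue;
      result.push(q);
      if (q.object.termType === "BlankNode") queue.push(q.object);
    }
  }
  return result;
}

// Hypothetical example data.
const data: Quad[] = [
  { subject: iri("http://example.org/A"), predicate: iri("http://example.org/name"), object: lit("A") },
  { subject: iri("http://example.org/A"), predicate: iri("http://example.org/address"), object: bnode("b0") },
  { subject: bnode("b0"), predicate: iri("http://example.org/city"), object: lit("Ghent") },
  { subject: iri("http://example.org/B"), predicate: iri("http://example.org/name"), object: lit("B") },
];

const cbd = extractCbd(data, iri("http://example.org/A"));
```

Here `cbd` holds the two quads about `http://example.org/A` plus the quad of the blank node it points to; the quad about `http://example.org/B` is left out.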
The Shape Template is a structure that looks as follows:
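class ShapeTemplate {
  closed: boolean; // true when the shape is sh:closed
  requiredPaths: Path[]; // paths of properties with sh:minCount > 0
  optionalPaths: Path[]; // paths of all other properties
  nodelinks: NodeLink[]; // properties that carry an sh:node link
  atLeastOneLists: ShapeTemplate[][]; // alternatives from sh:or and sh:xone
}
class NodeLink {
  shape: ShapeTemplate;
  path: Path;
}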
Paths in the shape templates are SHACL Property Paths.
A Shape Template has
Note: Certain quads are going to be matched by the algorithm multiple times. Each quad will of course be part of the member only once.
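One way to make sure a quad that is matched several times ends up in the member only once is to key each quad on its four components. The sketch below is only an illustration under simplifying assumptions: `SimpleQuad` and `dedupeQuads` are hypothetical names, and terms are plain strings, so an IRI and a literal with the same text would be treated as equal.

```typescript
// Minimal quad representation for illustration (assumption; real quads are RDF/JS objects).
type SimpleQuad = { subject: string; predicate: string; object: string; graph: string };

// Keep the first occurrence of each quad, identified by its four components.
function dedupeQuads(quads: SimpleQuad[]): SimpleQuad[] {
  const seen = new Set<string>();
  const unique: SimpleQuad[] = [];
  for (const q of quads) {
    const key = JSON.stringify([q.subject, q.predicate, q.object, q.graph]);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(q);
    }
  }
  return unique;
}

// The same quad matched twice (for example via a required path and via CBD).
const matched: SimpleQuad[] = [
  { subject: "http://example.org/A", predicate: "http://example.org/name", object: "A", graph: "" },
  { subject: "http://example.org/A", predicate: "http://example.org/age", object: "42", graph: "" },
  { subject: "http://example.org/A", predicate: "http://example.org/name", object: "A", graph: "" },
];

const member = dedupeQuads(matched);
```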
This results in this algorithm:
If there’s a shape set, the SHACL shape MUST be processed towards a Shape Template as follows:
1. Check whether the shape is deactivated (:S sh:deactivated true); if it is, don’t continue.
2. Check whether the shape is closed (:S sh:closed true); if it is, set the closed boolean to true.
3. All sh:property elements with an sh:node link are added to the shape’s NodeLinks array.
4. Add all properties with sh:minCount > 0 to the Required Paths array, and all others to the optional paths.
5. Process the conditionals sh:xone, sh:or and sh:and (but not sh:not):
5a. sh:and: all properties on that shape template MUST be merged with the current shape template.
5b. sh:xone and sh:or: in both cases, at least one item must match at least one quad for all required paths. If not, it will do an HTTP request to the current named node.

Note: The way we process SHACL shapes into a Shape Template is important to understand in order to know when an HTTP request will be triggered when designing SHACL shapes. A cardinality constraint not being exactly matched or a sh:pattern not being respected will not trigger an HTTP request, and will instead just add the invalid quads to the Member. This is a design choice: we only define triggers for HTTP requests from the SHACL shape in order to come to a complete set of quads describing the member the data publisher pointed at using tree:member.
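The processing steps and the HTTP trigger described above can be sketched roughly as follows. This is a simplified illustration, not the library's code: `ShapeDesc`, `PropertyDesc`, `Template`, `toTemplate` and `isSatisfied` are hypothetical names, the shape is given as a plain object instead of being read from a shapes graph, and paths are reduced to single predicate IRIs rather than full SHACL property paths. The sketch also assumes that a property with an sh:node link is still classified as required or optional by its sh:minCount.

```typescript
// Hypothetical, simplified description of a SHACL node shape.
interface PropertyDesc { path: string; minCount?: number; node?: ShapeDesc }
interface ShapeDesc {
  deactivated?: boolean;
  closed?: boolean;
  properties?: PropertyDesc[];
  and?: ShapeDesc[];
  or?: ShapeDesc[];
  xone?: ShapeDesc[];
}
interface Template {
  closed: boolean;
  requiredPaths: string[];
  optionalPaths: string[];
  nodeLinks: { path: string; shape: Template }[];
  atLeastOneLists: Template[][];
}

// Build a template from a shape description; a deactivated shape yields undefined.
function toTemplate(shape: ShapeDesc): Template | undefined {
  if (shape.deactivated) return undefined;
  const t: Template = {
    closed: shape.closed === true,
    requiredPaths: [], optionalPaths: [], nodeLinks: [], atLeastOneLists: [],
  };
  for (const p of shape.properties ?? []) {
    const linked = p.node ? toTemplate(p.node) : undefined;
    if (linked) t.nodeLinks.push({ path: p.path, shape: linked });
    // Assumption: classification by sh:minCount applies to every property.
    if ((p.minCount ?? 0) > 0) t.requiredPaths.push(p.path);
    else t.optionalPaths.push(p.path);
  }
  // sh:and: merge every conjunct into the current template.
  for (const part of shape.and ?? []) {
    const sub = toTemplate(part);
    if (!sub) continue;
    t.requiredPaths.push(...sub.requiredPaths);
    t.optionalPaths.push(...sub.optionalPaths);
    t.nodeLinks.push(...sub.nodeLinks);
    t.atLeastOneLists.push(...sub.atLeastOneLists);
  }
  // sh:or and sh:xone: keep the alternatives; at least one must be satisfied.
  for (const alternatives of [shape.or, shape.xone]) {
    const options = (alternatives ?? [])
      .map((s) => toTemplate(s))
      .filter((x): x is Template => x !== undefined);
    if (options.length > 0) t.atLeastOneLists.push(options);
  }
  return t;
}

// True when every required path is present and each or/xone list has one
// alternative whose required paths are all present. When this returns false,
// the algorithm would dereference the current named node.
function isSatisfied(t: Template, presentPaths: Set<string>): boolean {
  return (
    t.requiredPaths.every((p) => presentPaths.has(p)) &&
    t.atLeastOneLists.every((list) => list.some((alt) => isSatisfied(alt, presentPaths)))
  );
}

// Hypothetical shape: a required name, an optional nickname, and email or phone.
const personShape: ShapeDesc = {
  closed: true,
  properties: [
    { path: "http://example.org/name", minCount: 1 },
    { path: "http://example.org/nickname" },
  ],
  or: [
    { properties: [{ path: "http://example.org/email", minCount: 1 }] },
    { properties: [{ path: "http://example.org/phone", minCount: 1 }] },
  ],
};

const template = toTemplate(personShape)!;
const complete = isSatisfied(template, new Set(["http://example.org/name", "http://example.org/phone"]));
const incomplete = isSatisfied(template, new Set(["http://example.org/name"]));
```

In this sketch `complete` is true, so no request would be needed, while `incomplete` is false because neither alternative of the sh:or list is matched. Note that nothing here looks at sh:maxCount or sh:pattern, in line with the design choice described above.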
Note: The algorithm only takes hints from an optional SHACL shapes graph; it does not guarantee a result that validates. It only uses the parts of the SHACL Core Constraint Components that are relevant for discovery. It does not support SPARQL or JavaScript.
It won’t process constraints such as sh:class, inLanguage, pattern, value, qualified value shapes, etc. It is the data publisher’s responsibility to provide valid data, or it is the responsibility of the user of the library to validate the quads afterwards.
TODO
Logging can be enabled using the DEBUG environment variable: DEBUG=extract-cbd-shape:*