zazuko / rdf-validate-shacl

Validate RDF data purely in JavaScript. An implementation of the W3C SHACL specification on top of the RDFJS stack.
MIT License

[Feature request] Extracting shapes from triples #48

Open pietercolpaert opened 3 years ago

pietercolpaert commented 3 years ago

I’m looking for a library that can help me extract the triples that adhere to a SHACL shape from an array of triples.

Could look like this:

const data = await loadDataset('my-data.ttl')
const shapeData = await loadDataset('personShape.ttl')
const objectIterator = extractObjects(data, shapeData, "ex:PersonShape1")
for await (const object of objectIterator) {
    console.log(object); // one occurrence: the array of triples matching this specific shape
}

Use case issue: https://github.com/TREEcg/event-stream-client/issues/2

bergos commented 3 years ago

I think there are two different ways to approach this:

a) A separate method for the SHACLValidator. I would propose using a more "graph style" vocabulary, maybe coverage, because the result would be the coverage of all triples reached when traversing the graph according to all SHACL nodes.

b) Add a coverage property to the ValidationResult. The validate method of the SHACLValidator returns a ValidationReport, and the report contains multiple ValidationResult objects in its results property. That would even allow us to distinguish between different results for the same shape. I'm not sure if there are any drawbacks, but if there are, a second argument for options could be added to the validate method to enable this feature.

I would prefer option b.

@martinmaillard you know the library the best. What's your opinion on that topic?

tpluscode commented 3 years ago

From the perspective of SHACL, I think the key step is determining the Focus Nodes that conform to the shapes. Once you know that, it is simple to extract the subgraph by walking the Property Shapes.

For starters, I would propose adding only the first step to the library. In my mind that would be a TermMap<Term, Term[]> of shapes and conforming nodes (or the other way round, don't know which makes most sense).

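// Type-only imports; assumes the @rdfjs/types and @rdfjs/term-map packages
import type { Term } from '@rdfjs/types'
import type TermMap from '@rdfjs/term-map'

type FocusNode = Term
type Shape = Term

interface ValidationResult {
  // for every conforming FocusNode return the matched Shapes
  conformingFocusNodes: TermMap<FocusNode, Shape[]>
}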
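
A minimal sketch of the second step mentioned above (walking the Property Shapes from a conforming focus node to collect its subgraph). Everything here is an assumption for illustration: triples are plain objects with string terms rather than RDF/JS quads, the `shapes` object is a simplified stand-in for parsed SHACL (only plain predicate paths and nested `sh:node` references), and `extractSubgraph` is a hypothetical helper, not part of rdf-validate-shacl.

```javascript
// Simplified, hypothetical shape description: each property shape has a
// predicate path and, optionally, the name of a nested node shape.
const shapes = {
  'ex:PersonShape1': {
    properties: [
      { path: 'foaf:name' },
      { path: 'ex:address', node: 'ex:AddressShape' },
    ],
  },
  'ex:AddressShape': {
    properties: [{ path: 'ex:city' }],
  },
}

// Example data as plain { subject, predicate, object } objects (an assumption;
// a real implementation would work on an RDF/JS dataset).
const data = [
  { subject: 'ex:alice', predicate: 'foaf:name', object: '"Alice"' },
  { subject: 'ex:alice', predicate: 'ex:address', object: 'ex:addr1' },
  { subject: 'ex:alice', predicate: 'ex:shoeSize', object: '"38"' },
  { subject: 'ex:addr1', predicate: 'ex:city', object: '"Ghent"' },
  { subject: 'ex:addr1', predicate: 'ex:zip', object: '"9000"' },
]

// Collect the triples reachable from focusNode by following the property
// shapes of shapeName. The visited set guards against cyclic shape references.
function extractSubgraph(triples, shapes, focusNode, shapeName, visited = new Set()) {
  const key = `${focusNode} ${shapeName}`
  const shape = shapes[shapeName]
  if (!shape || visited.has(key)) return []
  visited.add(key)

  const result = []
  for (const property of shape.properties) {
    for (const triple of triples) {
      if (triple.subject !== focusNode || triple.predicate !== property.path) continue
      result.push(triple)
      if (property.node) {
        // Descend into the nested shape, using the object as the new focus node
        result.push(...extractSubgraph(triples, shapes, triple.object, property.node, visited))
      }
    }
  }
  return result
}

const personTriples = extractSubgraph(data, shapes, 'ex:alice', 'ex:PersonShape1')
console.log(personTriples)
```

Triples whose predicates are not named by any property shape (here `ex:shoeSize` and `ex:zip`) are left out. The sketch does not check conformance itself; it assumes the focus node was already found to conform in the first step.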

pietercolpaert commented 3 years ago

I like the ideas. Some thoughts:

martinmaillard commented 3 years ago

This seems like a very useful feature: I actually implemented something somewhat similar in a project. But I wonder if it really belongs in this library.

ktk commented 3 years ago

To give some more context: this discussion started on Twitter, and @bergos suggested opening an issue here. But I was also asking myself whether this library is the right place for it. We should reflect on that before it becomes a "stable" feature here. Maybe putting it somewhere separate is a good idea.

tpluscode commented 3 years ago

I'm re-reading the comments. Is it actually something different from what I proposed above?

pietercolpaert commented 3 years ago

Don’t think so! I’m just a bit puzzled about who should do what now!