w3c / shacl

SHACL Community Group (Post-REC activities)
Add a Constraint Component for uniqueness among values reached by a SHACL Path #38

Open mgberg opened 4 months ago

mgberg commented 4 months ago

I propose adding a (Boolean-valued?) constraint component that would report a violation for any value that occurs more than once among the values obtained by evaluating a property path. This would presumably only be meaningful when the path is something other than a single predicate or the inverse of a single predicate. It could be used to help detect irregularities in tree-like structures.

For example, consider a genealogy graph. Suppose one wanted to create a constraint that verified that a person's parents did not share a parent (i.e., that a person does not have the same grandparent via more than one route). It would also make sense to add this constraint component to the property shape for the path ex:father|ex:mother, given as an example in the current SHACL spec, to verify that the same individual is not both the mother and the father of a person.
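To make the intended semantics concrete, here is a minimal sketch in plain Python, not SHACL. It assumes the path is evaluated as a bag (duplicates kept), which is exactly the behavior the proposal depends on. The helper names `path_values` and `repeated_values`, the list-of-steps path encoding, and the sample data are all hypothetical and only serve the illustration.

```python
from collections import Counter

def path_values(triples, focus, path):
    """Return every value reached from `focus`, keeping duplicates.

    `path` is a list of steps; each step is a set of alternative
    predicates, so [{a, b}, {a, b}] models the path (a|b)/(a|b).
    """
    frontier = [focus]
    for step in path:
        frontier = [o
                    for node in frontier
                    for (s, p, o) in triples
                    if s == node and p in step]
    return frontier

def repeated_values(values):
    """Return the values that occur more than once, sorted."""
    return sorted(v for v, n in Counter(values).items() if n > 1)

# Hypothetical genealogy data: both of alice's parents have dave as father.
TRIPLES = [
    ("ex:alice", "ex:father", "ex:bob"),
    ("ex:alice", "ex:mother", "ex:carol"),
    ("ex:bob", "ex:father", "ex:dave"),
    ("ex:carol", "ex:father", "ex:dave"),
    ("ex:bob", "ex:mother", "ex:erin"),
]
PARENT = {"ex:father", "ex:mother"}

grandparents = path_values(TRIPLES, "ex:alice", [PARENT, PARENT])
print(grandparents)                   # ['ex:dave', 'ex:erin', 'ex:dave']
print(repeated_values(grandparents))  # ['ex:dave']
```

Under this reading, the proposed component would report ex:dave for the focus node ex:alice, and would report nothing for ex:bob when the path is just ex:father|ex:mother.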

I realize that this behavior can be easily accomplished via a SPARQL Constraint or Constraint Component, but it would presumably be more convenient and efficient if it were a built-in capability. Also, I suppose this would only work if the SHACL validator returned the full list of values reached when evaluating a property path instead of the set of unique values. I'm not sure how specific implementations work in this respect.

HolgerKnublauch commented 4 months ago

This may be difficult to implement. SHACL engines will usually rely on existing SPARQL property path engines, which will already eliminate duplicates before they can be processed.
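A small sketch of why this matters, in plain Python rather than any real engine's API: once the reached values have been collapsed to a set, a repeat check has nothing left to find. The value list and the `has_repeats` helper below are hypothetical.

```python
from collections import Counter

def has_repeats(values):
    """True if any value occurs more than once in `values`."""
    return any(n > 1 for n in Counter(values).values())

# Hypothetical result of evaluating a path with duplicates kept.
reached = ["ex:dave", "ex:erin", "ex:dave"]

# What a component would see if the path engine had already
# eliminated duplicates before handing the values over.
deduplicated = set(reached)

print(has_repeats(reached))       # True
print(has_repeats(deduplicated))  # False
```

So whether the proposed component can work at all depends on whether the underlying path evaluation exposes multiplicities to the validator.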

mgberg commented 4 months ago

Yeah, I figured that was likely.

TallTed commented 4 months ago

[@mgberg] For example, consider a genealogy graph. Suppose one wanted to create a constraint that verified a person's parents did not share a parent (i.e., that a person does not have a grandparent via more than one way).

It seems worth cautioning you (and later readers) against making assumptions which have demonstrable contradictions in real world data. Analysis of several sub-sects of the Mormons, for instance, will reveal a great many persons who have one grandparent (or great-grandparent, or further) via multiple paths.

mgberg commented 4 months ago

Well sure, I'm just using this as an example to communicate the idea.