shexSpec / shex

ShEx language issues, including new features for e.g. ShEx2.1
25 stars 8 forks source link

Definition of NodeConstraint in ShExR.shex #34

Closed labra closed 7 years ago

labra commented 7 years ago

I am comparing the definition of NodeConstraint in ShExR.jsg

NodeConstraint { nodeKind:("iri"|"bnode"|"nonliteral"|"literal")? datatype:IRI? xsFacet* values:[valueSetValue+]? }

with the definition of NodeConstraint in ShExR.shex:

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ;
  (  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]
   | sx:datatype IRI
#   | &<xsFacet>
   | &<stringFacet>
   | &<numericFacet>
   | sx:values @<valueSetValueList1Plus>)+
}

and I wonder if the plus wrapping the different possibilities is right. I would model it in ShEx as:

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ; 
  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]?;
  sx:datatype IRI ? ;
  &<xsFacet>*  ;
  sx:values @<valueSetValueList1Plus>? 
}

Is there any reason for using the first model instead of the second?

ericprud commented 7 years ago

IMO, ShExR.shex gives a tighter definition. This is analogous to other relatively conservative constraints like that ShapeOr and ShapeAnd are composed of two or more shapeExpressions or that OneOf and AllOf are composed of two or more tripleExpressions. I think these are useful because they:

We can always loosen restrictions like this in future versions while maintaining backward compatibility.

labra commented 7 years ago

yes, but what intrigues me is the "plus" meaning one or more node constraints. I had understood the line:

NodeConstraint { nodeKind:("iri"|"bnode"|"nonliteral"|"literal")? datatype:IRI? xsFacet* values:[valueSetValue+]? }

to be internally represented as an object with 4 possible components (nodeKind, datatype, xsFacet, valueSetValue) but with the "plus", the following would also be a node constraint:

:n a sx:NodeConstraint ;
  sx:nodeKind sx:iri, sx:bnode .

Is that allowed? Do you internally represent node constraints as an array of node constraint objects?

ericprud commented 7 years ago

fair point. +1 to your proposal (modulo adding a ? to sx:values to reflect values:[valueSetValue+]?). We could use a more restrictive

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ;
  ( sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral];
    sx:datatype IRI ? ;
    &<xsFacet>* ;
    sx:values @<valueSetValueList1Plus>? )
| ( sx:datatype IRI ;
    &<xsFacet>*  ;
    sx:values @<valueSetValueList1Plus>? )
| ( &<xsFacet>+  ;
    sx:values @<valueSetValueList1Plus>? )
| ( sx:values @<valueSetValueList1Plus> )
}

but it probably isn't worth the noise.

ericprud commented 7 years ago

Just for completeness (obsessiveness), if we did want to be more restrictive, we could also write it:

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ; 
  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]?;
  sx:datatype IRI ? ;
  &<xsFacet>*  ;
  sx:values @<valueSetValueList1Plus>?
} AND CLOSED {
  a [sx:NodeConstraint] ;
  (  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]
   | sx:datatype IRI
   | &<stringFacet>
   | &<numericFacet>
   | sx:values @<valueSetValueList1Plus>)+
}
labra commented 7 years ago

Could be although it breaks the DRY principle.

It looks like there is some hidden pattern trying to emerge. Maybe for a future version of ShEx where we could have macros or templates...by now, I would prefer to the less obsessive but more readable definition.

I updated my definition adding the ? to sx:values

labra commented 7 years ago

I created this pull request which has already been merged.

labra commented 7 years ago

I reopen the issue because I noticed that my definition above:

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ; 
  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]?;
  sx:datatype IRI ? ;
  &<xsFacet>*  ;
  sx:values @<valueSetValueList1Plus>? 
}

doesn't work because the line &<xsFacet>* contains a cardinality over an inclusion and they are not allowed in the grammar.

[40]    unaryTripleExpr       ::= productionLabel? (tripleConstraint | bracketedTripleExpr) | include
[41] bracketedTripleExpr ::= '(' innerTripleExpr ')' cardinality? annotation* semanticActions

...
[50] include ::= '&' shapeLabel

Maybe, moving the include from rule 40 to rule 41 would allow this possibility:

[41] bracketedTripleExpr ::= '(' innerTripleExpr | include ')' cardinality? annotation* semanticActions

Or just adding an optional cardinality to rule 50 could solve this.

Is this feasible or is there some reason to forbid cardinalities over includes?

labra commented 7 years ago

I have created issue #35 to discuss what to do about cardinalities and includes...

gkellogg commented 7 years ago

Until this is resolved, ShExR.shex is broken, as it does include a cardinality on Inclusion:

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ; 
  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]?;
  sx:datatype IRI ? ;
  &<xsFacet>*  ;
  sx:values @<valueSetValueList1Plus>? 
}

Personally, I don't think it is consistent with the RDF interpretation of ShEx to have cardinality here, as it references an expression which may have cardinality.

labra commented 7 years ago

I have created this pull request to repair ShExR.shex

The new definition of NodeConstraint is:

<NodeConstraint> CLOSED {
  a [sx:NodeConstraint] ; 
  sx:nodeKind [sx:iri sx:bnode sx:literal sx:nonliteral]?;
  sx:datatype IRI ? ;
  &<xsFacets>  ;
  sx:values @<valueSetValueList1Plus>? 
}

where <xsFacets> is defined as:

 $<xsFacets> ( &<stringFacet> | &<numericFacet> )* ;
labra commented 7 years ago

I think I can close this issue safely as it was editorial.