shexSpec / shex

ShEx language issues, including new features for e.g. ShEx2.1
24 stars 8 forks source link

semantics of recursive shapes ill-founded #84

Open pfps opened 6 years ago

pfps commented 6 years ago

There is a fundamental problem in the semantics of ShEx 2.0.

Consider

S = :a { :p @:a } G = ( :x :p :x . } m = { ( :x :a ) }

When trying to determine whether m is valid for S and G the semantics gets into an infinite loop trying to determine the value of

satisfies(:x, @:a, G, m)

ericprud commented 6 years ago

The semantics document is missing the assertion that direct self-referential shape expressions are prohibited. It needs text like:

No shape expression with id=S may be a shapeExpr reference to S or a ShapeAnd, ShapeOr, or ShapeNot which transitively references S.

to bring it into line with the test suite and implementations. (Note that this text should be simpler when we use JSON-LD 1.1 Id Maps).

Given that, it makes sense to change pfps's example above to be:

S: <S1> { <p> @<S2> }. <S2> { <p> @<S1> } 
G: <n1> <p> <n2> . <n2> <p> <n1> .
m: <n1>@<S1>

I believe Peter's assertion to be that the spec doesn't say we should choose the maximal typing (the test suite assumes it).

These tests recursively validate some node/shape association (should I add a recursive-data trait?):

pfps commented 6 years ago

I don't see why my example is a case of a shape referring to itself without going through a triple. Your example appears to me to have the same kind of self-reference, just with two steps instead of one.

pfps commented 6 years ago

I looked at https://github.com/shexSpec/shexTest/blob/master/validation/manifest.ttl#L4061 and it appears that this test does have the feature that the validation of a shape gets into an infinite loop in the semantics as written. So this test is not correct with respect to the semantics as written.

ericprud commented 6 years ago

I stand corrected, your example is sufficient. It might, however, lead to more rigorous understanding to have a less direct recursion.

ericprud commented 5 years ago

It is our belief that the semantics in ShEx 2.1 address this. Please close this issue if you agree.

pfps commented 5 years ago

This is a very different semantics from the previous semantics for ShEx. It is not easy to tease out all the aspects of the semantics, as information about semantics is spread throughout the document. The following is my interpretation of what is going on, but I have been unable to get my head around the semantics in its entirety. My view is that it would have been better to have a separate section on semantics with an introductory bit saying how the semantics unfolds.

The first mention of isValid is in Section 2.5 where it is stated that the function takes a shapes schema, an RDF graph, and a fixed ShapeMap and returns a result ShapeMap.

Section 3 of ShapeMap Structure and Language Draft Community Group Report 13 July 2017 at http://shex.io/shape-map/#dfn-fixed-shapemap defines ShapeMaps. Here the status is one of two strings "conformant" and "nonconformant". However, in Section 2.5 of Shape Expressions Language 2.1 Draft Community Group Report 17 November 2018 at http://shex.io/shex-semantics/#process the result ShapeMap does not have a status but instead a result with values pass and fail. This disconnect needs to be fixed.

ShapeMaps cannot have two shape associations with the same nodeSelector and shapeLabel. But the notion of identity for shape associations is not explicitly defined. However, the second half of Example 1 indicates that shape associations have an identity independent of their membership. This means that the process of converting a query ShapeMap to a fixed ShapeMap defined in Section 4 of ShapeMap Structure and Language Draft Community Group Report 13 July 2017 at http://shex.io/shape-map/#dfn-fixed-shapemap might not produce a valid ShapeMap. This needs to be fixed.

Moving on, the definition of isValid is in Section 5.2. There are several problems in the definition. First, here the third argument is an "input shape map". Is this different from "fixed ShapeMap". Second, although strongly connected is a notion generally known in graphy theory circles there should be a definition of it in this document, or at least a pointer to a definition. Third, there is the unsupported claim that different stratifications lead to the same unique complete typing. This is a fundamental part of the semantics here and needs to be demonstrated.

The second note in Section 5.2 is puzzling. isValid takes a input ShapeMap. So then why is there a note that the semantics of ShEx are indpendent of the construction of the input ShapeMap?

There is no requirement that I can see that ShapeMaps be finite. What happens to stratification in a ShapeMap with an infinite number of strata?

Without clarification or correction of these points I cannot say for certain whether this ShEx semantics is well-founded.