w3c / data-shapes

RDF Data Shapes WG repo

disjoint-001.ttl in the SHACL Test Suite seems to violate the spec #149

Closed cem-okulmus closed 8 months ago

cem-okulmus commented 8 months ago

The test /core/node/disjoint-001.ttl uses the constraint "sh:disjoint" at a NodeShape. This is a property pair constraint, and according to the SHACL standard "[these] constraint components can only be used by property shapes". Thus we can assume that there is a "sh:path" to test the constraint with.

Indeed, given the definition of sh:disjoint:

TEXTUAL DEFINITION For each value node that also exists as a value of the property $disjoint at the focus node, there is a validation result with the value node as sh:value.

For node shapes, the (singular) value node is also the focus node, so I have no idea how to make any sense of sh:disjoint for node shapes (which by def. have no sh:path). It is not at all clear to me how we get to the validation report given in this test. Am I missing something?

EDIT: I have similar questions about the test /core/node/equals-001.ttl, which again uses a property pair constraint at a node shape. Were the tests written with an older version of the spec in mind?

simonstey commented 8 months ago

as you pointed out:

For node shapes, the (singular) value node is also the focus node

hence, ex:InvalidResource1 ex:property ex:InvalidResource1 . is invalid.

one could e.g. use a shape like

ex:TestShape
  rdf:type sh:NodeShape ;
  rdfs:label "Test shape" ;
  sh:disjoint ex:property ;
  sh:targetNode ex:InvalidResource1 ;
  sh:targetNode ex:ValidResource1 ;
.

to check that none of the focus nodes has itself as the value of property ex:property.
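The reading above can be sketched in a few lines of Python. This is only an illustration of the textual definition quoted earlier, not code from the spec or from any validator: the function name `disjoint_violations`, the tuple-based graph, and the sample triples are all assumptions made for the example.

```python
# Sketch of sh:disjoint for a single focus node.
# Assumption: the data graph is a set of (subject, predicate, object) tuples of strings.

def disjoint_violations(graph, focus_node, disjoint_property, path=None):
    """Return the value nodes that also occur as values of disjoint_property."""
    if path is None:
        # Node shape: the only value node is the focus node itself.
        value_nodes = {focus_node}
    else:
        # Property shape with a simple predicate path.
        value_nodes = {o for s, p, o in graph if s == focus_node and p == path}
    # Values of the $disjoint property at the focus node.
    other_values = {
        o for s, p, o in graph if s == focus_node and p == disjoint_property
    }
    # One validation result per value node found in both sets.
    return value_nodes & other_values


# Hypothetical data graph, written to match the reading described above.
graph = {
    ("ex:InvalidResource1", "ex:property", "ex:InvalidResource1"),
    ("ex:InvalidResource1", "ex:property", "ex:ValidResource1"),
    ("ex:ValidResource1", "ex:property", "Some value"),
}

invalid = disjoint_violations(graph, "ex:InvalidResource1", "ex:property")
valid = disjoint_violations(graph, "ex:ValidResource1", "ex:property")
```

With no `path`, the check reduces to "does the focus node have itself as a value of the property", which is the irreflexivity test described above.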

cem-okulmus commented 8 months ago

So the use of Property Pair constraints at NodeShape is something that SHACL validators should anticipate? The standard seems pretty clear that this should not happen, so that's my main source of confusion. If that sentence (the one I quoted) wasn't there, I wouldn't have opened this issue.

simonstey commented 8 months ago

I see your point now..

4.5 Property Pair Constraint Components: The constraint components in this section specify conditions on the sets of value nodes in relation to other properties. These constraint components can only be used by property shapes.

is indeed inconsistent with the test case.. 🤔

So the use of Property Pair constraints at NodeShape is something that shacl validators should anticipate?

https://s.zazuko.com/2xessMP for example does

maybe @HolgerKnublauch could chime in here

HolgerKnublauch commented 8 months ago

We did discuss this topic in the WG and agreed to disagree. Some folks wanted each constraint component to work in both node and property shapes modes. My personal opinion was that not all combinations make sense. I guess this particular case is reasonably sensible as a way to express irreflexive properties. But overall this editorial mismatch is a result of an inconclusive discussion. I am not sure what to do at this stage, as this may be rather a topic of a next-gen SHACL 1.1 WG than us making random choices here.

cem-okulmus commented 8 months ago

I think that answers my question. I would close this as "resolved" then.

TallTed commented 7 months ago

@HolgerKnublauch — It may be worth creating some labels in this repo and/or transferring issues like this "topic of a next-gen SHACL 1.1 WG" to the current SHACL CG's repo, such that threads like this can be tagged, and when we recognize a sufficiently substantial number of "topics of a next-gen SHACL 1.1 WG", we can start drafting a charter for that WG....

HolgerKnublauch commented 7 months ago

Hi @TallTed - nice to hear from you. As someone who is in the RDF-star working group, what is your impression about the potential for a new SHACL WG and the chances of getting through such a WG in under two years of intensive work? With RDF-star it seems like there was a well-worked-out proposal on the table, yet now the group is back to discussing various fundamentally different alternatives.

TallTed commented 7 months ago

Hello, @HolgerKnublauch —

WGs are not meant to rubber-stamp input from any source, CG or otherwise. CG reports are just like any other input to the work of a WG, and the eventual output of any WG might be completely different from any or all such inputs.

The RDF-star focus group of the RDF-DEV CG did indeed do a lot of work, but the RDF-star WG has different membership and has reached different conclusions about how to address some aspects of the needs we see, as well as the relative priorities of these needs. All of this is normal and to be expected.

It's also worth noting that the RDF-star WG's charter includes fixing errata and updating nearly every spec that RDF or SPARQL touches, bringing them all in line with each other — so nothing we touch remains based on RDF 1.0, RDF 1.1, SPARQL 1.0, nor SPARQL 1.1. Rather, everything we publish is intended to be aligned as RDF 1.2 and SPARQL 1.2 in the twenty-ish documents we'll put out.

For a prospective Data-Shapes-Next WG, on the other hand, we'll only have the SHACL 1.0 (and maybe the ShEx 1.0) document(s) to update, which should be substantially less work, and with substantially fewer external impacts or inputs to consider. Delivering a SHACL and/or ShEx 1.1 or 2.0 (depending on the net changes made, in the end) within two years seems quite possible.

HolgerKnublauch commented 7 months ago

Thank you, Ted. Overall it feels like W3C's idea of Community Groups doesn't work at all then, if WGs again start with basically a clean slate and the CG drafts have no precedence over any other input. I guess the "art" lies in setting up the charter in a way such that no other alternative can be considered, and try to get the charter approved. Anything else sounds like a time sink.

afs commented 7 months ago

I think @TallTed's point is that the CG input isn't prescriptive. Named inputs must be considered by the WG. Other work has had a CG-WG-REC flow.

It used to be "W3C Submissions" would form the basis for WG work. They did have a limited representation and all the consequences of that.

It shows that getting parties involved in the CG is beneficial both to get a wider set of opinions (it used to be WG UCR) and for continuity of involvement into the WG.

Time sinks are a big worry (XQuery!?!?).

TallTed commented 7 months ago

[@HolgerKnublauch] Thank you, Ted. Overall it feels like W3C's idea of Community Groups doesn't work at all then, if WGs again start with basically a clean slate and the CG drafts have no precedence over any other input.

CG Reports can be extremely helpful to WGs that follow, especially if the bases for significant recommendations in such reports are clearly discoverable in the CG's repo, mailing list, etc. Much like Member Submissions, CG Reports are not binding on following WGs, but they are not ignored either.

I guess the "art" lies in setting up the charter in a way such that no other alternative can be considered, and try to get the charter approved. Anything else sounds like a time sink.

What you describe would be considered "gaming the system" and rather extremely frowned upon, because the WG participants should be free to take all inputs into consideration, regardless of whether some of those inputs came to the CG, through the CG, or directly to the WG. It is entirely possible that some CG work escaped notice by individuals or organizations that bring valuable knowledge and/or experience to a WG, especially if those organizations were not yet W3C Members. Indeed, WG inception can be a driving force in motivating organizations to become Members, and/or individuals to request Invited Expert status.

It seems obvious, but perhaps it's worth saying that WGs cannot have predetermined results. If the results were that obvious, there would be no need for a standards development organization at all. Similarly, if a participant's only acceptable conclusion was that their input was taken as given — that no other input could be cause to change the output — they would not be conforming with the principles of consensus which govern all W3C activities.

I do not think that WGs or CGs that produce no TR or similar are wastes of time. Cross-pollination is a necessary piece of standards development, because it is exceedingly rare that a single person or organization fully understands some aspect of web tech. There is virtually always something that a different person or organization experiences differently, which experience results in a necessary modification of the final TR — which then works for a significantly greater proportion of the web citizenry.

HolgerKnublauch commented 7 months ago

Hi Ted, I fully agree that diverse inputs will likely create better outcomes. In the case of the Data Shapes WG that I participated in myself, I believe the outcome was a much more complete and better language than any of the three input languages. There were many good ideas that I certainly didn't have. I am glad we allowed complex sh:paths in, for example, even though I initially believed this would make the language too complex.

While the notion of consensus-driven work is a good idea in general, the W3C processes do not implement that ideal. Instead of group consensus, individual members have vetoing rights (I know, I used that myself sometimes). And even when everything is ready to be signed off, individuals can file formal objections that may further derail or delay a standard (yes, this happened with SHACL too). Given that such WGs often only have 10 active members, each individual has enormous power. I remember at least three very critical votes that would have fundamentally altered the outcome of the Data Shapes WG if a single person had voted differently. In the end such discussions can end up being more a game of political alliances than based on technical evidence. And it is a complete coincidence who is personally willing to invest the time. Quite often the big vendors are not able to spend these resources. And that is even though these vendors may have many years of valuable practical experience with real-world customers.

The process is already prone to "gaming the system". Before a charter is approved, anyone already has plenty of opportunities to comment or raise concerns. By setting a reasonably predictable charter, at least there is a chance for organizations to judge whether their time is well spent and actually encourage the commercial vendors to participate too.

TallTed commented 7 months ago

Instead of group consensus, individual members have vetoing rights

This is part of the process of reaching consensus. It's rare that any participant gets everything they want in exactly the way they want it, but every participant must be able to live with the end result. The "vetoing rights" are one of the ways any given participant can make known that they cannot live with this aspect of the current discussion ... which should and generally does lead to further conversation, and a modification with which everyone can live.

Which is not to say that the process is perfect. It continues to evolve, as does the organization itself, and all its members.