Closed gkellogg closed 5 years ago
I do not really follow this. The two TriG snippets are different through the reuse of a bnode.
The whole area of RDF Datasets semantics is a bit murky, and there was never a full consensus on the details. But that section clearly says, for example, that:
The graphs in a single dataset may share blank nodes.
i.e., a single graph does not constitute some sort of separate "namespace" (for lack of a better word).
This isn't really about dataset semantics, but about a practical solution to a narrow use case. Through framing, JSON-LD encourages graph names to be the object of some triple in another (usually the default) graph. That is not RDF semantics, but a JSON-LD bias. Verifiable Claims makes use of such a representation. When looking for a shape that matches the content of a named graph where the subject is a blank node, it helps if that blank node is the same one used to name the graph. This is really something coming from @ericprud, and maybe he can comment further.
I do not really understand the issue, maybe indeed @ericprud can tell further but, regardless, I do not think it is a good idea if the expansion algorithm generates a semantically different RDF dataset than what the original syntax defines. My feeling is that this is exactly what would happen here.
I think we're defining the syntax now so we can't really deviate from "what the original syntax defines." I am, however, keenly interested in making sure the RDF graph is as closely aligned with the JSON tree as possible. Consider JS accessing the data in (a superset of) Gregg's example:
V = {
"input": {
"bar": "a",
"value": "x",
"baz": { "value": "y" }
}
}
The JSON tree provides direct access to "x" via the path V.input.value.
Finding the same info in the RDF graph implied by the @graph requires both the ability to search the graph and prescient knowledge of which bnodes were generated for the "input":{} object:
_:b0 foo:input _:b1 .
_:b1 {
_:b2 foo:bar "a" .
_:b2 foo:value "x" .
_:b2 foo:baz _:b3 .
_:b3 foo:value "y" .
}
There's no way to know that input's value is "x" and not "y" without presuming that JSON-LD only does forward arcs (now and in perpetuity) and doing an exhaustive search for the triple with a foo:value predicate and no incoming arcs.
Doing so would not only make the RDF graph much less attractive to work with, it would paint future versions of JSON-LD into a corner.
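The heuristic described above can be sketched in a few lines of Python. This is only an illustration, assuming quads are held as plain (subject, predicate, object, graph) tuples with None standing for the default graph; the helper names root_candidates and input_values are hypothetical and not part of any JSON-LD or RDF library.

```python
# Quads as (subject, predicate, object, graph); graph None = default graph.
# Data copied from the TriG example above (separate bnodes for graph and node).
QUADS = [
    ("_:b0", "foo:input", "_:b1", None),
    ("_:b2", "foo:bar", "a", "_:b1"),
    ("_:b2", "foo:value", "x", "_:b1"),
    ("_:b2", "foo:baz", "_:b3", "_:b1"),
    ("_:b3", "foo:value", "y", "_:b1"),
]


def root_candidates(quads, graph):
    """Subjects in `graph` that are never the object of a triple in `graph`."""
    in_graph = [q for q in quads if q[3] == graph]
    subjects = {q[0] for q in in_graph}
    objects = {q[2] for q in in_graph}
    return sorted(subjects - objects)


def input_values(quads):
    """Follow foo:input to a graph name, guess the root node, read foo:value."""
    results = []
    for _, pred, graph_name, graph in quads:
        if graph is None and pred == "foo:input":
            for root in root_candidates(quads, graph_name):
                results += [o for s, p, o, g in quads
                            if g == graph_name and s == root and p == "foo:value"]
    return results


print(root_candidates(QUADS, "_:b1"))
print(input_values(QUADS))
```

The lookup only works because every arc in this graph points forward from the root; it needs a full scan of the named graph for every foo:input it follows.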
If, however, the graph node were used as the subject:
_:b0 foo:input _:b1 .
_:b1 {
_:b1 foo:bar "a" .
_:b1 foo:value "x" .
_:b1 foo:baz _:b2 .
_:b2 foo:value "y" .
}
the connection from foo:input to "x" would be easy to navigate in e.g. SPARQL:
SELECT ?value WHERE {
[] foo:input ?g
GRAPH ?g { ?g foo:value ?value }
}
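For readers without a SPARQL engine at hand, the same join can be sketched in Python. This assumes quads stored as (subject, predicate, object, graph) tuples with None for the default graph; select_values is a hypothetical helper that mirrors the query, not an API of any library.

```python
# Quads for the variant where the graph name _:b1 is reused as the subject.
QUADS = [
    ("_:b0", "foo:input", "_:b1", None),
    ("_:b1", "foo:bar", "a", "_:b1"),
    ("_:b1", "foo:value", "x", "_:b1"),
    ("_:b1", "foo:baz", "_:b2", "_:b1"),
    ("_:b2", "foo:value", "y", "_:b1"),
]


def select_values(quads):
    """Mirror of: [] foo:input ?g  GRAPH ?g { ?g foo:value ?value }"""
    # ?g: objects of foo:input in the default graph
    graph_names = {o for _, p, o, g in quads if g is None and p == "foo:input"}
    # ?value: foo:value of the node that shares its identifier with the graph
    return [o for s, p, o, g in quads
            if g in graph_names and s == g and p == "foo:value"]


print(select_values(QUADS))
```

No search for a root node is needed here: the graph name itself is the join key.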
I'm not particularly in love with the idea of _:b1 having dual use as a node and a graph name but it seems less controversial than the alternatives, all of which involve creating extra triples:
_:b0 foo:input _:b1 .
_:b0 rdf:rootNode _:b2 . # <-- root connector
_:b1 {
_:b2 foo:bar "a" .
_:b2 foo:value "x" .
_:b2 foo:baz _:b3 .
_:b3 foo:value "y" .
}
_:b0 foo:input _:b1 .
_:b1 {
_:b2 rdf:nodeRole rdf:TreeRoot . # <-- root annotation
_:b2 foo:bar "a" .
_:b2 foo:value "x" .
_:b2 foo:baz _:b3 .
_:b3 foo:value "y" .
}
When we discuss this, I think it would be important to have @dlongley (and/or @msporny) and @ericprud on the call, as they are most informed about the use case.
I do not think that @ericprud's comment answered my concern:
I do not think it is a good idea if the expansion algorithm generates a semantically different RDF dataset than what the original syntax defines. My feeling is that this is exactly what would happen here.
I think you're arguing for the proposal then. Let's examine this without the , "@container": "@graph":
{
"@context": {
"@version": 1.1,
"input": {"@id": "foo:input"},
"value": "foo:value"
},
"input": {
"value": "x"
}
}
yields
_:b0 foo:input _:b1 .
_:b1 foo:value "x" .
I.e. the object of foo:input is the subject of foo:value. Without this proposal, they are completely disconnected in the graph implied by "@container": "@graph":
_:b0 foo:input _:b1 .
_:b2 foo:value "x" _:b1 .
To me, that seems like "semantically different RDF" -- what was once navigable by a path now requires a heuristic search (something involving finding a node with no incoming arcs, unless you have inverse properties in the @context, in which case, good luck). With this proposal, they are once again connected, even though the 2nd triple is in another graph:
_:b0 foo:input _:b1 .
_:b1 foo:value "x" _:b1 .
@iherman There is no conflict with 1.0, as there was no notion of implicitly defined named graphs. The "@container": "@graph" creates this possibility in 1.1 to make connected JSON work and have meaning as RDF.
The fundamental issue with shapes is that, even if you can identify the graph name because it is the object of a triple in the default graph, that says nothing about any shape contained within that graph.
By re-purposing the graph name as the default subject of that named graph we run the risk of conflating the meaning of that identifier: does it name a graph, or does it name a subject within the graph? But, for practical purposes, it makes sense to do this to be able to follow a chain from the default graph, through an identified graph name, and to a subject within that graph.
The alternative for ShEx would be to use some other properties of that graph to hook up the shape, for example, find the subject within that graph which is not also an object in that graph, but this could be convoluted and expensive for large graphs, and does not take into consideration the possibility of reverse properties used within the JSON-LD serialization.
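A small sketch of why reverse properties defeat the "subject that is not also an object" heuristic. It assumes quads as (subject, predicate, object, graph) tuples, a hypothetical predicate foo:parent that the JSON-LD author mapped through a reverse term, and that _:b2 is the node the nested JSON object actually stands for; root_candidates is an illustrative helper, not library code.

```python
# The nested JSON object is _:b2 (assumed). A reverse term makes the nested
# child _:b3 the *subject* of foo:parent, so the arc points back at the root.
QUADS = [
    ("_:b0", "foo:input", "_:b1", None),
    ("_:b2", "foo:value", "x", "_:b1"),
    ("_:b3", "foo:value", "y", "_:b1"),
    ("_:b3", "foo:parent", "_:b2", "_:b1"),
]


def root_candidates(quads, graph):
    """Subjects in `graph` that are never the object of a triple in `graph`."""
    in_graph = [q for q in quads if q[3] == graph]
    subjects = {q[0] for q in in_graph}
    objects = {q[2] for q in in_graph}
    return sorted(subjects - objects)


# The heuristic now selects the child node instead of the intended root.
print(root_candidates(QUADS, "_:b1"))
```

With this data the heuristic returns _:b3, so a shape hooked up this way would be tested against the wrong node.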
In summary, the proposal solves a problem that exists in the real world at the expense of some blank node identifier semantics.
I am sorry, maybe I am rusty with my RDF. In my reading,
{
"@context": {
"@version": 1.1,
"input": {"@id": "foo:input", "@container": "@graph"},
"value": "foo:value"
},
"input": {
"value": "x"
}
}
Must yield:
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
[ <foo:input> _:b1] .
_:b1 {
[ <foo:value> "x"] .
}
or, to make it into N-Quads:
_:b0 <foo:input> _:b1 .
_:b2 <foo:value> "x" _:b1 .
This is what happens today. The object of foo:input is a graph (that is what "@container": "@graph" means after all, right?) and this does not mean it is (also) the subject of a triple within that graph. These are two different things.
In other words,
_:b0 <foo:input> _:b1 .
_:b1 <foo:value> "x" _:b1 .
is, imho, plainly wrong.
We seem to be at an impasse; maybe we should try to ask the opinion of another RDF expert...
You're correct that that's what the spec says now; the proposal is to change this meaning. As this is entirely new behavior, there is no real compatibility issue. It is most useful if the default subject of the graph is the same as the graph name, for the purposes of shape matching in particular, but also for traversing between graphs in general.
If I change this meaning I create a backward incompatible version of JSON-LD. Didn't we say that is a big no-no?
We can introduce a new type of @container (I am not sure I would like this) that has additional semantics. I am not sure how one would define that formally, and I think it would add to the general confusion...
JSON-LD 1.0 had no graph containers, so there is no backwards compatibility issue. In 1.0, all named graphs were explicit. The "@container": "@graph" is something introduced in the CG work, basically to handle the Verifiable Claims issue.
Ah, my bad. Sorry about that.
I still feel a bit uncomfortable, but less so. It is somehow more than a “simple” container: it is not just that the property refers to a graph, but that it refers to a graph with special additional behavior. This is different from, say, a list container, which is just that: it refers to a list, without any further strings attached. These mini, baroque additional thingies may bite us later because they will contribute to the overall impression of a very complex language.
This proposal is specifically intended to address the increased complexity that arises when creating disconnected nodes in a graph (see the first TriG example in my write-up above).
My guess is that some folks may want to motivate more controls in the @context; what we're working on here is the default behavior in the absence of extra directives for synthesizing URLs when there's no @id in the nested graph.
In Linked Data, there's a sharp divide between the purists who want to separate graph names from the nodes for which those graphs were created and pragmatists who don't see the need.
<P04637> rdf:type up:Protein .
In the pragmatist approach, <P04637> identifies both the page and the protein. In the purist approach, a trick of HTTP (that the fragment identifier is not passed in an HTTP request) allows us to map from the node <http://…MyFoafPage#me> to the page <http://…MyFoafPage> (though not the other way around).
Because there's no analogous trick for bnodes (implied by objects with no @id), we are stuck with adding extra triples. Above, I described two approaches which synthesize extra URLs, one in the referring graph (root connector) and one in the referenced graph (root annotation). The root connector approach could have lots of forms. I mocked one up where the referring predicate got a sibling triple pointing to the root node:
_:b0 foo:input _:b1 .
_:b0 rdf:rootNode _:b2 . # <-- root connector
_:b1 {
_:b2 foo:bar "a" …
}
A perhaps more forward-thinking approach would be to create a structure for pairing roots with graphs:
_:b0 foo:input _:b1 .
_:b1 rdf:graphReference _:b2 . # <-- identify the created graph
_:b1 rdf:rootNode _:b3 . # <-- identify the node implied by the nested JSON object
_:b2 {
_:b3 foo:bar "a" …
}
A pairing of a graph name and a root or focus node would be helpful in other contexts e.g. identifying the pair of created web resource and RDF node in the return from a POST to an ldp:Container. Addressing this would also stanch the "I can't believe you guys haven't already solved this" comments I hear in HL7 when trying to use RDF to represent clinical resources.
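As a rough sketch of how an application might consume such a pairing, assuming quads as (subject, predicate, object, graph) tuples with None for the default graph. The properties rdf:graphReference and rdf:rootNode are only the mock-ups from this comment (they are not defined in the RDF namespace), the foo:value triple is added purely for illustration, and the helpers one, resolve and value_of are hypothetical.

```python
QUADS = [
    ("_:b0", "foo:input", "_:b1", None),
    ("_:b1", "rdf:graphReference", "_:b2", None),  # mocked-up property
    ("_:b1", "rdf:rootNode", "_:b3", None),        # mocked-up property
    ("_:b3", "foo:bar", "a", "_:b2"),
    ("_:b3", "foo:value", "x", "_:b2"),            # illustrative extra triple
]


def one(quads, subject, predicate):
    """First object of (subject, predicate) in the default graph, or None."""
    matches = [o for s, p, o, g in quads
               if g is None and s == subject and p == predicate]
    return matches[0] if matches else None


def resolve(quads, pairing):
    """Return the (graph name, root node) pair described by `pairing`."""
    return (one(quads, pairing, "rdf:graphReference"),
            one(quads, pairing, "rdf:rootNode"))


def value_of(quads, pairing, prop):
    """Read `prop` of the root node inside the paired graph."""
    graph, root = resolve(quads, pairing)
    return [o for s, p, o, g in quads if g == graph and s == root and p == prop]


print(resolve(QUADS, "_:b1"))
print(value_of(QUADS, "_:b1", "foo:value"))
```

The graph name and the root node keep distinct identifiers here, at the cost of two extra triples per nested object.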
@ericprud Will you be at TPAC on the Thursday/Friday? We could dive into the details in person?
I believe I can be there Thu. And yeah, I guess this could benefit from some whiteboard time.
EXAMPLE 85: Implicitly named graph :
{
"@context": {
"@version": 1.1,
"generatedAt": {
"@id": "http://www.w3.org/ns/prov#generatedAtTime",
"@type": "http://www.w3.org/2001/XMLSchema#date"
},
"Person": "http://xmlns.com/foaf/0.1/Person",
"name": "http://xmlns.com/foaf/0.1/name",
"knows": {"@id": "http://xmlns.com/foaf/0.1/knows", "@type": "@id"},
"claim": {
"@id": "https://w3id.org/credentials#claim",
"@container": "@graph"
}
},
"@id": "http://example.org/foaf-graph",
"generatedAt": "2012-04-09",
"claim": [
{
"@id": "http://manu.sporny.org/about#manu",
"@type": "Person",
"name": "Manu Sporny",
"knows": "http://greggkellogg.net/foaf#me"
}, {
"@id": "http://greggkellogg.net/foaf#me",
"@type": "Person",
"name": "Gregg Kellogg",
"knows": "http://manu.sporny.org/about#manu"
}
]
}
provides the following expanded version of itself:
[{
"@id": "http://example.org/foaf-graph",
"http://www.w3.org/ns/prov#generatedAtTime": [{
"@value": "2012-04-09",
"@type": "http://www.w3.org/2001/XMLSchema#date"
}],
"https://w3id.org/credentials#claim": [{
"@graph": [{
"@id": "http://manu.sporny.org/about#manu",
"@type": ["http://xmlns.com/foaf/0.1/Person"],
"http://xmlns.com/foaf/0.1/name": [{"@value": "Manu Sporny"}],
"http://xmlns.com/foaf/0.1/knows": [
{"@id": "http://greggkellogg.net/foaf#me"}
]}
]
}, {
"@graph": [{
"@id": "http://greggkellogg.net/foaf#me",
"@type": ["http://xmlns.com/foaf/0.1/Person"],
"http://xmlns.com/foaf/0.1/name": [{"@value": "Gregg Kellogg"}],
"http://xmlns.com/foaf/0.1/knows": [
{"@id": "http://manu.sporny.org/about#manu"}
]
}]
}]
}]
and consequently shows two implicitly named graphs _:b0 and _:b1
| Graph | Subject | Property | Value | Value Type |
|---|---|---|---|---|
| | http://example.org/foaf-graph | prov:generatedAtTime | 2012-04-09 | xsd:date |
| | http://example.org/foaf-graph | https://w3id.org/credentials#claim | _:b0 | |
| | http://example.org/foaf-graph | https://w3id.org/credentials#claim | _:b1 | |
| _:b0 | http://manu.sporny.org/about#manu | rdf:type | foaf:Person | |
| _:b0 | http://manu.sporny.org/about#manu | foaf:name | Manu Sporny | |
| _:b0 | http://manu.sporny.org/about#manu | foaf:knows | http://greggkellogg.net/foaf#me | |
| _:b1 | http://greggkellogg.net/foaf#me | rdf:type | foaf:Person | |
| _:b1 | http://greggkellogg.net/foaf#me | foaf:name | Gregg Kellogg | |
| _:b1 | http://greggkellogg.net/foaf#me | foaf:knows | http://manu.sporny.org/about#manu | |
However, in the playground the expanded version of Example 85 is given as:
[
{
"@id": "http://example.org/foaf-graph",
"https://w3id.org/credentials#claim": [
{
"@graph": [
{
"@id": "http://manu.sporny.org/about#manu",
"@type": [
"http://xmlns.com/foaf/0.1/Person"
],
"http://xmlns.com/foaf/0.1/knows": [
{
"@id": "http://greggkellogg.net/foaf#me"
}
],
"http://xmlns.com/foaf/0.1/name": [
{
"@value": "Manu Sporny"
}
]
},
{
"@id": "http://greggkellogg.net/foaf#me",
"@type": [
"http://xmlns.com/foaf/0.1/Person"
],
"http://xmlns.com/foaf/0.1/knows": [
{
"@id": "http://manu.sporny.org/about#manu"
}
],
"http://xmlns.com/foaf/0.1/name": [
{
"@value": "Gregg Kellogg"
}
]
}
]
}
],
"http://www.w3.org/ns/prov#generatedAtTime": [
{
"@type": "http://www.w3.org/2001/XMLSchema#date",
"@value": "2012-04-09"
}
]
}
]
and subsequently the following N-Quads, with only one implicit graph, _:b0:
<http://example.org/foaf-graph> <http://www.w3.org/ns/prov#generatedAtTime> "2012-04-09"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://example.org/foaf-graph> <https://w3id.org/credentials#claim> _:b0 .
<http://greggkellogg.net/foaf#me> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> _:b0 .
<http://greggkellogg.net/foaf#me> <http://xmlns.com/foaf/0.1/knows> <http://manu.sporny.org/about#manu> _:b0 .
<http://greggkellogg.net/foaf#me> <http://xmlns.com/foaf/0.1/name> "Gregg Kellogg" _:b0 .
<http://manu.sporny.org/about#manu> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> _:b0 .
<http://manu.sporny.org/about#manu> <http://xmlns.com/foaf/0.1/knows> <http://greggkellogg.net/foaf#me> _:b0 .
<http://manu.sporny.org/about#manu> <http://xmlns.com/foaf/0.1/name> "Manu Sporny" _:b0 .
Why is the playground not producing the same result as given in the spec? Or am I missing something here and they are actually equivalent?
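One way to see the structural difference between the two expansions is to count the nodes inside each graph object under the claim property. The sketch below uses abridged copies of the two expanded documents quoted above (only the @id of each person is kept); graph_sizes is a hypothetical helper written for this comparison.

```python
CLAIM = "https://w3id.org/credentials#claim"

# Abridged from the expansion shown in the spec: one graph object per claim.
spec_expanded = [{
    "@id": "http://example.org/foaf-graph",
    CLAIM: [
        {"@graph": [{"@id": "http://manu.sporny.org/about#manu"}]},
        {"@graph": [{"@id": "http://greggkellogg.net/foaf#me"}]},
    ],
}]

# Abridged from the playground output: a single graph object holding both.
playground_expanded = [{
    "@id": "http://example.org/foaf-graph",
    CLAIM: [
        {"@graph": [
            {"@id": "http://manu.sporny.org/about#manu"},
            {"@id": "http://greggkellogg.net/foaf#me"},
        ]},
    ],
}]


def graph_sizes(expanded, prop):
    """Number of nodes in each graph object found under `prop`."""
    return [len(value["@graph"])
            for node in expanded
            for value in node.get(prop, [])
            if "@graph" in value]


print(graph_sizes(spec_expanded, CLAIM))
print(graph_sizes(playground_expanded, CLAIM))
```

The first form has two graph objects with one node each, the second a single graph object with two nodes, which matches the two-graphs versus one-graph difference seen in the quads.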
This issue was discussed in a meeting.
RESOLVED: add a feature at risk that the implicitly identified graphs will share the bnode with the unidentified member of the graph, on the grounds that the user community most in need of this would expect it, and the community that would be horrified by it better understands the solution of explicit naming
I'm a little lost on the conclusion here...
Currently, this (http://tinyurl.com/y7zoyjk2):
{
"@context": {
"@version": 1.1,
"claim": {"@id": "ex:claim", "@container": "@graph"},
"name": "ex:name"
},
"claim": {
"@id": "ex:subject",
"name": "A subject"
}
}
Yields these quads:
<ex:subject> <ex:name> "A subject" _:b1 .
_:b0 <ex:claim> _:b1 .
Would this change with this proposal? If so, how?
I'm concerned that there may be a serious issue that breaks the encapsulation properties we need for Verifiable Credentials.
No, it wouldn't change the quads in this case. If the claim didn't have an @id, it would reuse that of the graph. The reasoning is that this will make it easier to follow through the graph to the default subject for shape matching purposes. If the document were the following:
{
"@context": {
"@version": 1.1,
"claim": {"@id": "ex:claim", "@container": "@graph"},
"name": "ex:name"
},
"claim": {
"name": "A subject"
}
}
Then you'd see something like:
_:b0 <ex:claim> _:b1 .
_:b1 <ex:name> "A subject" _:b1 .
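A toy illustration of the behavior described in this comment, not the JSON-LD algorithm itself. It assumes each node is a flat dict of property to literal value, that the graph name has already been allocated, and (as a simplifying assumption of this sketch) that the graph name is reused only when the graph holds a single node without @id; to_quads and the _:n labels are hypothetical.

```python
import itertools


def to_quads(graph_name, nodes, reuse_graph_name=True):
    """Emit (subject, predicate, object, graph) tuples for one implicit graph."""
    fresh = (f"_:n{i}" for i in itertools.count())  # hypothetical fresh labels
    quads = []
    for node in nodes:
        if "@id" in node:
            subject = node["@id"]        # explicit @id always wins
        elif reuse_graph_name and len(nodes) == 1:
            subject = graph_name         # proposed: share the graph's bnode
        else:
            subject = next(fresh)        # otherwise allocate a separate bnode
        for prop, value in node.items():
            if prop != "@id":
                quads.append((subject, prop, value, graph_name))
    return quads


# Claim with an @id: the output is the same with or without the proposal.
print(to_quads("_:b1", [{"@id": "ex:subject", "ex:name": "A subject"}]))
# Claim without an @id: the subject shares the graph name under the proposal.
print(to_quads("_:b1", [{"ex:name": "A subject"}]))
# Same input without reuse: the subject gets its own blank node.
print(to_quads("_:b1", [{"ex:name": "A subject"}], reuse_graph_name=False))
```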
@gkellogg,
Oh! That's much less scary than I thought. I think that's ok, but would love for others who have any experience with the VC work to give their opinions. Once we've modeled more of the ZKP style approach to VCs (where the main "subject" of a VC may not have an @id) we may have more input.
The discussion on Framing blank node unnamed graphs was actually about w3c/json-ld-syntax#26. w3c/json-ld-framing#27 is really about framing anonymous named graphs, which we didn’t discuss.
Since this was the body of the discussion, I’d just suggest changing the title for 5.11 to “Ensure that blank node identifiers for anonymous graphs are reused”, and referencing w3c/json-ld-syntax#26 instead, but we probably need to agree to this on next Friday’s call.
This issue was discussed in a meeting.
RESOLVED: close syntax#27 wontfix, as there’s no justification for the required RDF layer requirement that the blank node identity of the named graph is the default subject of the triples in the graph
"no justification"‽ Do I have to go back over the arguments that convinced everyone in the room except Ivan during the F2F?
Really, it just came down to the stink test. Overloading the use of a blank node name as the graph name and the default subject was generally regarded as being semantically incorrect, even if useful. Authors will need to find another way, such as a well-known property value, or use fragment identifiers.
Hi @ericprud,
To clarify the resolution, it was not that the use case was considered to be invalid, and was even generally agreed to be useful as @gkellogg says! However it was considered that the JSON-LD group did not have the justification to make a significant assertion about the use of named graphs and blank nodes, such that it became a de facto semantic model requirement that isn't in RDF 1.1. Given our charter (that says we will kick RDF problems up to a larger group), we couldn't justify making some normative requirement in this space, especially as the issue goes away if a URI is used, or if a property is added for an application to find the top node of the named graph.
I spoke to the director about my concerns regarding the disconnectedness of the graph. He urged us to pursue a solution along the lines of having an extra triple which indicates the node in the unnamed graph which corresponds to the nested JSON tree, i.e. the root connector proposal above. He said we could use a JSON-LD namespace for the connector property and consider moving it to the RDF namespace once the technical issues were resolved.
@ericprud, presuming your comment refers to the conversation you and I had last week then clarification is needed. You raised a concern with @plehegar on how the Working Group had handled this issue. PLH asked me to take a look and see if, in our delegation from TimBL to handle Transition Requests, we might find a path that would save TimBL's time later.
You and I spoke about the evolution of this thread following the Lyon f2f; including whether it was clear to you what actions could result in the removal of the "At Risk" qualification.
I spoke to the director about my concerns regarding the disconnectedness of the graph. He urged us to pursue a solution
As you acknowledged that you had provided little (I heard you say "no") activity on addressing the At Risk concerns since Lyon, I suggested (ok; "urged") that you consider a solution that applications in your use case(s) could employ now without changing the spec, or that the WG might be more comfortable accepting as an interim for the next Recommendation. I noted that the Working Group has a schedule that it is expected to meet and it has the responsibility to triage its issues accordingly.
along the lines of having an extra triple which indicates the node in the unnamed graph which corresponds to the nested JSON tree, i.e. root connector proposal above. He said we could use a JSON-LD namespace for the connector property and consider moving it to the RDF namespace once the technical issues were resolved.
I said that if the Working Group decides to consider that alternative approach further and reaches consensus on including an experimental/interim approach in the spec with such a triple and their remaining concern was which namespace to use, that IMHO there could be flexibility on choice of namespace.
From https://github.com/w3c/json-ld-syntax/issues/30#issuecomment-409994489, @ericprud notes the problem with using ShEx, or anything else, to match the content of a named graph with only blank node subjects. Consider the following JSON-LD (from expansion test 0079):
Currently, this will generate TriG similar to the following:
and expanded JSON-LD:
Following the link from _:b1 as an object to the graph using that name is feasible, but finding an unnamed subject within that graph can't really be done, for any reasonably complex named graph.
This proposal would cause the expansion algorithm to re-use the blank-node identifier naming the graph for the implicitly named subject contained within the graph, generating the following TriG:
This makes it possible to follow the chain from the object identifying the graph to the primary subject of that graph. Provisions must be made for forms in which there are multiple unnamed subjects within the named graph.