Open miguel76 opened 5 years ago
In CONSTRUCT queries, there is often the need to generate new resources that are referenced across multiple solution mappings. There is no way to generate blank nodes having [this role].
Is that so? In the majority of implementations I've tried, it seems that this is possible by placing a no-argument bNode() call at the right point in the query. For example, a BIND (bNode() AS ?root) clause right at the start of the query pattern will produce a single unique blank node that is shared across all solutions.
That being said, from reading the spec I don't understand what the bNode(xxx) form with an argument is actually supposed to return. The spec says:
If the form with a simple literal is used, every call results in distinct blank nodes for different simple literals, and the same blank node for calls with the same simple literal within expressions for one solution mapping.
But that doesn't seem to be what's generally implemented.
SELECT ?a ?b {
{ BIND (bNode("x") AS ?a) }
{ BIND (bNode("x") AS ?b) }
}
The solution mapping for both calls is the same: the empty solution. Yet ?a and ?b are different in the majority of processors I tried.
Then we have:
SELECT ?a ?b {
BIND (bNode("x") AS ?a)
BIND (bNode("x") AS ?b)
}
The solution mapping is different for both calls: the empty solution for the first, and a solution that maps ?a to a blank node for the second. Yet ?a and ?b are the same in the majority of processors I tried.
So at the very least, some clarification of the spec text might be needed.
after "... within expressions for one solution mapping," the next sentence says
This functionality is compatible with the treatment of blank nodes in SPARQL CONSTRUCT templates.
according to which, a better test case would be
SELECT ?a {
VALUES ?z { 'abc' 'def' }
{ BIND (bNode('x') AS ?a) }
}
that said, while a CONSTRUCT form provides a context within which to distinguish solution mappings, it is not clear how a processor is to distinguish them in general.
SELECT ?a ?b {
{ BIND (bNode("x") AS ?a) }
{ BIND (bNode("x") AS ?b) }
}
There are two solution mappings, one inside each {}, which get joined (cross product) to form the third solution mapping that is the overall result. So there will be different blank nodes.
New solution mappings get made when join and other operations happen. In join, the "merge(μ1, μ2)" is a new solution mapping.
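As a rough illustration of the join step described above, here is a minimal Python sketch. It is a model for discussion only, not code from any SPARQL processor: solution mappings are assumed to be plain dicts from variable names to terms, the blank node labels `_:b0` and `_:b1` are hypothetical, and `compatible`, `merge` and `join` are names chosen to mirror the algebra in the spec.

```python
from typing import Dict, List

# Assumption: a solution mapping is modeled as a dict {variable: term}.
Mapping = Dict[str, str]


def compatible(mu1: Mapping, mu2: Mapping) -> bool:
    """Two mappings are compatible if they agree on every shared variable."""
    return all(mu2[v] == t for v, t in mu1.items() if v in mu2)


def merge(mu1: Mapping, mu2: Mapping) -> Mapping:
    """Return a NEW mapping holding the bindings of both inputs."""
    if not compatible(mu1, mu2):
        raise ValueError("incompatible solution mappings")
    return {**mu1, **mu2}


def join(omega1: List[Mapping], omega2: List[Mapping]) -> List[Mapping]:
    """Merge every compatible pair drawn from the two inputs."""
    return [merge(m1, m2) for m1 in omega1 for m2 in omega2 if compatible(m1, m2)]


# Each group pattern yields one mapping with its own (hypothetical) blank node.
left = [{"a": "_:b0"}]
right = [{"b": "_:b1"}]
result = join(left, right)
```

In this model the joined mapping is a third object, distinct from the two mappings produced inside the braces, which is the sense in which a join "makes" a new solution mapping.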
does this mean that the intent is that one (or more) of these is true?
in other words, what is intended by
SELECT ?a ?c
WHERE {
{ SELECT ?a
WHERE {
BIND (bNode('x') AS ?a)
}
}
BIND (bNode('x') AS ?c)
}
or by
SELECT ?x ?c
WHERE {
{ SELECT ?x
WHERE {
BIND (bNode('x') AS ?a)
BIND (bNode('x') AS ?b)
BIND (isBlank(?b) AS ?x)
}
}
BIND (bNode('x') AS ?c)
}
@afs That is not what the spec says though. A solution mapping is defined as a function from variables to RDF terms. So it's a set of bindings, which are pairs of a variable and an RDF term. The empty solution mapping is the empty set. You cannot say “it's a different empty set in that other graph pattern.” The identity of sets is defined by their members.
I understand the intent that you are describing, but either the formalism in the spec doesn't reflect this intent, or else the formalism is not based on standard maths.
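To make the set-identity argument concrete, here is a small Python sketch. It is only a model of one literal reading of the spec sentence quoted earlier, not a description of how any processor works: a solution mapping is assumed to be a frozenset of (variable, term) pairs, and the `bnode` helper and `bnode_table` are hypothetical names that key blank nodes on the pair (solution mapping, label).

```python
# Assumption: a solution mapping is a set of (variable, term) pairs.
empty_in_first_group = frozenset()
empty_in_second_group = frozenset()

# Set equality depends only on membership, so the two empty mappings
# cannot be told apart.
same_mapping = empty_in_first_group == empty_in_second_group

# Hypothetical table: (solution mapping, label) -> blank node.
bnode_table = {}


def bnode(mapping, label):
    """Return the blank node for this label within this solution mapping."""
    key = (mapping, label)
    if key not in bnode_table:
        bnode_table[key] = f"_:b{len(bnode_table)}"
    return bnode_table[key]


# First query: both BINDs see the empty mapping.
a = bnode(empty_in_first_group, "x")
b = bnode(empty_in_second_group, "x")

# Second query: the second BIND sees a mapping that already binds ?a.
c = bnode(frozenset({("a", a)}), "x")
```

Under this reading, `a` and `b` come out as the same blank node and `c` as a different one, which is the opposite of the behavior reported above for most processors.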
Current status
In SPARQL 1.1, the function BNODE(...), when used with an argument (a simple literal or an xsd:string), creates or reuses a blank node associated with that literal in the scope of a single solution mapping. Given this choice of scope, the version with an argument does not add expressiveness to the language, because the same behavior can be replicated by binding the expression BNODE() to a variable and then reusing it.
Missing expressiveness
In CONSTRUCT queries, there is often the need to generate new resources that are referenced across multiple solution mappings. This can currently be done by generating appropriate URIs and using the function IRI(...). There is no way to generate blank nodes having the same role in the output graph.
Proposal
I propose to:
BNODE(...) expected argument to be any RDF term;
BNODE(...) with the same argument inside the same query call will return the same blank node).
Implementation cost
For the implementations I know of, the semantics in SPARQL 1.1 require more work than the ones proposed here: to check if an existing blank node has to be reused, for each query call and each solution binding a different blank node map has to be maintained; in this proposal a single map for each query call is enough.
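The bookkeeping difference can be sketched in Python as follows. This is a simplified model under stated assumptions, not the code of any actual engine: solutions are identified by an opaque `solution_id`, and the class names `PerSolutionBnodes` and `PerQueryBnodes` as well as the generated labels are invented for illustration.

```python
import itertools


class PerSolutionBnodes:
    """SPARQL 1.1 scope as described above: one label map per solution."""

    def __init__(self):
        self._counter = itertools.count()
        self._maps = {}  # solution_id -> {label: blank node}

    def bnode(self, solution_id, label):
        # A separate map has to be found or created for every solution.
        label_map = self._maps.setdefault(solution_id, {})
        if label not in label_map:
            label_map[label] = f"_:s{next(self._counter)}"
        return label_map[label]


class PerQueryBnodes:
    """Proposed scope: a single label map for the whole query call."""

    def __init__(self):
        self._counter = itertools.count()
        self._map = {}  # label -> blank node

    def bnode(self, solution_id, label):
        # solution_id is accepted but ignored: the scope is the query call.
        if label not in self._map:
            self._map[label] = f"_:q{next(self._counter)}"
        return self._map[label]


per_solution = PerSolutionBnodes()
per_query = PerQueryBnodes()
```

With the per-solution scope, the same label yields different nodes in different solutions; with the per-query scope, it yields one node that can be referenced across all solutions.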
Backward compatibility
This proposal, as described so far, would not be backwards compatible (it changes the semantics of an existing function), but:
BNODE(...) with argument is not currently much used;
BNODE_UNIQUE(...)) while the function BNODE(...) could keep its previous semantics.