Correlation and Substitution in SPARQL

zacharywhitley commented 5 years ago

This isn't really a new feature but I came across this paper and thought that it should at least be brought up.

"In the current sparql specification the notion of correlation and substitution are not well defined. This problem triggers several ambiguities in the semantics. In fact, implementations as Fuseki, Blazegraph, Virtuoso and rdf4j assume different semantics."

https://arxiv.org/pdf/1606.01441.pdf

lisp commented 5 years ago

please describe the issue.

zacharywhitley commented 5 years ago

Please read the description

lisp commented 5 years ago

i did. i read also the article. the description is not sufficient.

zacharywhitley commented 5 years ago

Your first response said, “please describe the issue” and now you’re saying that it’s not sufficient. I don’t really have a problem with expanding on the description but I do have a problem with the way you’re asking so the answer is no.

I added this because I thought that some people might appreciate it or like to discuss it but you are free to ignore it or delete it.

lisp commented 5 years ago

the initial text is not a description. it quotes the first sentence from a paper's abstract. that quote introduces terms which that paper defines, uses the terms to allude to a deficiency of the current sparql recommendation and suggests this deficiency relates somehow to interoperability among four named implementations.

it does not describe the issue.

kasei commented 5 years ago

> I don’t really have a problem with expanding on the description but I do have a problem with the way you’re asking so the answer is no.

This is not a productive way to continue this conversation. While @lisp's comment might have been brusque, I agree with him that the issue description doesn't give enough information. For those not in a position to read an entire technical report to understand the issue being raised here, can you provide a summary? Is this identical to the issue that the SPARQL Exists Community Group is meant to address? Does it overlap with it?

zacharywhitley commented 5 years ago

I'm kind of busy but @lisp says he read it so I think he is more than adequately prepared to write a summary that is precisely to his liking. I'm looking forward to reading it.

lisp commented 5 years ago

as a start, if i may transcribe from seaborne's note on this topic in issue #1:

Peter wrote a summary email: https://lists.w3.org/Archives/Public/public-sparql-exists/2016Jul/0014.html

the cited email is also not an issue description, but it does introduce the topic in more detail. if @pfps does not want to author this issue, i could provide an initial draft, but i would prefer to defer to someone with more authority on the matter.

pfpschneider commented 5 years ago

Yes, this isn't a new feature. Instead it points out problems with EXISTS in the SPARQL 1.1 specification. Thus this appears to be out of scope for the CG.

I also hesitate to recommend the referenced document as it does not have adequate pointers to related work (some of it mine).

klinovp commented 5 years ago

This is not just about EXISTS. It also has implications for parameterisation of SPARQL queries, which is a pretty common use case. Different systems do it in different ways, and there are queries which produce different results under different interpretations.

The problem statement is: given a query Q and a variable replacement map M (a partial mapping from variables to constants), evaluate Q under M.

Possible interpretations:

- via textual substitution: for every variable ?v in M, syntactically replace every occurrence of ?v in Q by the corresponding constant M(v), and statically bind ?v to M(v) at the top
- via join: add VALUES mapping variables to constants according to M to the end of the query

Usually the results are the same, but not always. Here's a simple example:

Q: select * { bind(?x as ?y) }
M: ?x -> "abc"

Under the substitution semantics the query returns {?x = "abc", ?y = "abc"}, but under the join semantics the query returns {?x = "abc"} (?y is not bound because the BIND node is evaluated before the join).
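
To make the difference concrete, here is a minimal Python sketch of the two readings for this one query. It is a toy model, not how any of the named engines is implemented: solutions are plain dicts, `bind` and `join` are hypothetical helpers that only approximate the SPARQL algebra operators, and constants are written without their SPARQL quotes.

```python
# Toy model (assumption): a solution is a dict from variable name to value,
# and a term is a variable if it starts with "?", otherwise a constant.

EMPTY_GROUP = [{}]  # the single empty solution produced by an empty group


def bind(solutions, expr, target):
    """Approximate BIND(expr AS target): an unbound expr leaves target unbound."""
    result = []
    for mu in solutions:
        value = mu.get(expr) if expr.startswith("?") else expr
        extended = dict(mu)
        if value is not None:
            extended[target] = value
        result.append(extended)
    return result


def join(left, right):
    """Join two solution sequences, keeping only compatible pairs."""
    return [
        {**a, **b}
        for a in left
        for b in right
        if all(a[k] == b[k] for k in a.keys() & b.keys())
    ]


M = {"?x": "abc"}


def substitution_semantics(m):
    # Statically bind ?x at the top, then evaluate BIND with ?x replaced by M(?x).
    top = bind(EMPTY_GROUP, m["?x"], "?x")
    return bind(top, m["?x"], "?y")


def join_semantics(m):
    # Evaluate BIND first (?x is unbound there), then join with VALUES built from M.
    body = bind(EMPTY_GROUP, "?x", "?y")
    return join(body, [dict(m)])


print(substitution_semantics(M))
print(join_semantics(M))
```

In this model the first call yields one solution with both ?x and ?y bound, and the second yields one solution with only ?x bound, which matches the two results described above.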

Differences also occur in the presence of subqueries which use the same variable names as in the outer query but don't project them (i.e. they are in fact different variables).
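
A sketch of the subquery case, again as a toy Python model under stated assumptions. The query shape `select * { { select ?s { ?s :q ?x } } }`, the mapping `?x -> "1"`, the two triples, and the helpers `match_q`, `project` and `join` are all made up for illustration; they are not taken from any engine or from the paper.

```python
# Hypothetical data: two triples with predicate q.
TRIPLES = [("a", "q", "1"), ("b", "q", "2")]

M = {"?x": "1"}


def match_q(obj):
    """Evaluate the pattern `?s :q obj`, where obj is a variable or a constant."""
    out = []
    for s, p, o in TRIPLES:
        if p != "q":
            continue
        if obj.startswith("?"):
            out.append({"?s": s, obj: o})  # obj is a variable: bind it
        elif obj == o:
            out.append({"?s": s})  # obj is a constant: filter on it
    return out


def project(solutions, variables):
    """Keep only the projected variables, as the inner SELECT does."""
    return [{v: mu[v] for v in variables if v in mu} for mu in solutions]


def join(left, right):
    """Join two solution sequences, keeping only compatible pairs."""
    return [
        {**a, **b}
        for a in left
        for b in right
        if all(a[k] == b[k] for k in a.keys() & b.keys())
    ]


def by_substitution():
    # The inner ?x is textually replaced by "1", so it restricts the subquery;
    # ?x is then statically bound at the top.
    inner = project(match_q(M["?x"]), ["?s"])
    return join(inner, [dict(M)])


def by_join():
    # The inner ?x is not projected, so the VALUES row added at the end of the
    # outer query cannot restrict it.
    inner = project(match_q("?x"), ["?s"])
    return join(inner, [dict(M)])


print(by_substitution())
print(by_join())
```

Under these assumptions the substitution reading returns only the row for `a`, while the join reading returns rows for both `a` and `b`, each with ?x set to "1", because the inner ?x is out of scope by the time the join happens.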

This causes very real issues in practice so it'd be good to define the notion of substitution in SPARQL.

namedgraph commented 5 years ago

@klinovp I think @lisp has something to add about this :) Dydra is using the join substitution AFAIK.

lisp commented 5 years ago

> Possible interpretations:
>
> - via textual substitution: for every variable ?v in M, syntactically replace every occurrence of ?v in Q by the corresponding constant M(v), and statically bind ?v to M(v) at the top
> - via join: add VALUES mapping variables to constants according to M to the end of the query

an additional possible interpretation is to define sparql query execution in terms of environments and introduce "dynamic" environments in addition to solution sets.

that is what dydra does.

VladimirAlexiev commented 3 years ago

@klinovp

> Under the substitution semantics the query returns {?x = "abc", ?y = "abc"}

I think the query returns {?y = "abc"} because textual substitution obliterates ?x as a variable?

klinovp commented 3 years ago

That's what would happen without the "statically bind ?v to M(v) at the top" part! Sorry, I didn't make it precise enough, but it would be BIND("abc" as ?x) (that's how it works in Stardog). So the variable name would be preserved.

Actually since that discussion @afs pointed me to his work https://afs.github.io/substitute which looks like a good technical proposal to me.

lisp commented 3 years ago

yes, @afs describes a mechanism to implement dynamic bindings. it would make sense to generalize it - that also provides a consistent approach to external request arguments.

w3c / sparql-dev

Correlation and Substitution in SPARQL #89