apache / jena

Apache Jena
https://jena.apache.org/
Apache License 2.0

`UNDEF` in `VALUES` doesn't work with `SERVICE` #2556

Closed. mhoangvslev closed this 5 days ago

mhoangvslev commented 5 days ago

Version

5.0.0

What happened?

Given the query below:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?product ?label ?propertyTextual WHERE {
    VALUES ?bgp1 { <https://query.wikidata.org/sparql> UNDEF }

    SERVICE ?bgp1 { 
        ?item wdt:P31 wd:Q146. # Must be a cat
    } 
}

Executing this query causes the error: Service URI not bound: ?bgp1

According to the spec, using UNDEF should not raise an error:

Data can be directly written in a graph pattern or added to a query using VALUES. VALUES provides inline data as a solution sequence which are combined with the results of query evaluation by a join operation. It can be used by an application to provide specific requirements on query results and also by SPARQL query engine implementations that provide federated query through the SERVICE keyword to send a more constrained query to a remote query service.

Relevant output and stacktrace

No response

Are you interested in making a pull request?

None

afs commented 5 days ago

See https://www.w3.org/TR/sparql11-federated-query/#variableService

Variables in the SERVICE clause are "best effort". <https://query.wikidata.org/sparql> makes sense, but an unbound variable does not indicate where to execute the SERVICE.

It has to be constrained somehow. What are you expecting? If you want the unbound case to be ignored, then SERVICE SILENT may do what you want: it will generate a warning but otherwise skip the execution of that step.
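
For illustration, a minimal sketch of the SERVICE SILENT suggestion applied to the query from the report. The only change is the added SILENT keyword. Under SPARQL 1.1 Federated Query, SILENT means errors from the SERVICE call are ignored and the failed call contributes a single solution with no bindings. That the unbound ?bgp1 row is skipped with a warning is taken from the comment above and has not been verified here against Jena 5.0.0.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT DISTINCT ?product ?label ?propertyTextual WHERE {
    VALUES ?bgp1 { <https://query.wikidata.org/sparql> UNDEF }

    # SILENT: an error in this SERVICE call (here, an unbound ?bgp1)
    # is ignored instead of aborting the whole query.
    SERVICE SILENT ?bgp1 {
        ?item wdt:P31 wd:Q146. # Must be a cat
    }
}
```

Note that SILENT applies to every failure of the SERVICE call, not only to the unbound-endpoint case.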

mhoangvslev commented 5 days ago

In my use case, the data in VALUES is retrieved from an external process, so there is no guarantee that the variables will be bound, i.e. some values may be UNDEF. In that case, yes, I expect the BGP under SERVICE to simply be ignored. That said, SERVICE SILENT is satisfactory enough, although using it can cause important errors to be ignored, e.g. HTTP 300-500.

afs commented 5 days ago

All SERVICE execution goes through an extension point ServiceExec.exec.

One collection of extensions, focused on federated query performance, is: https://jena.apache.org/documentation/query/service_enhancer.html

You can add your own policy via ServiceExec.exec.

mhoangvslev commented 5 days ago

Alright, many thanks for your prompt assistance ☺️