eclipse-rdf4j / rdf4j

Eclipse RDF4J: scalable RDF for Java
https://rdf4j.org/
BSD 3-Clause "New" or "Revised" License
361 stars 163 forks source link

Deprecate SPARQLQueryRenderer for removal #4249

Open hmottestad opened 1 year ago

hmottestad commented 1 year ago

The SPARQLQueryRenderer is incomplete and outputs incorrect results for common queries. There does not seem to be any documentation of what subset of the query language that is supported and there are no errors to indicate that the output is incomplete.

Examples

GROUP BY

Input

select (count(*) as ?cnt) (sample(?s) as ?sample)
where {
  ?s ?p ?o.
} group by ?o

Output

select ?cnt ?sample
where {
  ?s ?p ?o.
  ?s  bind( as ?cnt).
  bind(?s as ?sample).
}

Nested SELECT

Input

select * 
where {
  { 
    select ?s where {
      ?s ?p ?o
    } LIMIT 1
  }

  ?s a ?o. 
}

Output

select ?s ?o ?s
where {
  ?s ?p ?o.
  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o.
}
limit 1

VALUES

Input

select * where {
  VALUES ?c {1 2 3}
  ?a <http://example.org/number> ?c
}

Output

select ?c ?a
where {
  ?a <http://example.org/number> ?c.
}
hmottestad commented 1 year ago

@jeenbroekstra @barthanssens @aschwarte10

What do you guys think about deprecating the SPARQLQueryRenderer and removing it in 5.0.0? I've looked at the code and I think that the code needs a lot of work in order to support the full SPARQL 1.1 specification together with the RDF-star extensions.

aschwarte10 commented 1 year ago

We are currently using the SPARQL Query Renderer actively in our product and for our purpose it does a fairly good job.

Note that we do have a high number of unit tests (incl. the test query set from the SPARQL.JS project) and for those the SPARQL renderer produces an equivalent query (except for whitespaces).

I personally think that this is a valuable feature in RDF4J to allow modifications on the AST level and then produce a compatible query string.

abrokenjester commented 1 year ago

I agree with @aschwarte10 that this is a valuable feature, even as incomplete as it currently is. However perhaps we should de-blur what we're talking about here.

We currently have two query renderer implementations for SPARQL in RDF4J:

@aschwarte10 it's been a while since I've been in the Metaphacts code base of course, can you verify which of these two implementations you're currently working with?

@hmottestad instead of just completely removing the renderer feature in 5.0, perhaps we should just drop the old implementation, and replace with the newer one. I'm a little hazy on the details but I seem to remember that although the new implementation covers more of the spec, it also has some edges cases where it doesn't cover some bits that the old implementation does (which is why we couldn't just do this as a patch or minor fix). But a major release might be a good point to do that swap.

aschwarte10 commented 1 year ago

We are using org.eclipse.rdf4j.queryrender.sparql.experimental.SparqlQueryRenderer actively and this is in fact working quite OK. As mentionend, we are do have a high number of unit tests (incl. a large query test set from the SPARQL.js project). The renderer produces (except for whitespaces) equivalent queries.

I am strongly in favor of keeping this implementation

hmottestad commented 1 year ago

Ok. We can deprecate org.eclipse.rdf4j.queryrender.sparql.SPARQLQueryRender and remove it in 5.0.0. We can leave org.eclipse.rdf4j.queryrender.sparq.experimental.SparqlQueryRenderer as is until it is complete (and also so we don't break anything).