SolidLabResearch / generic-data-viewer-react-admin

MIT License

Find a better way to calculate the total number of items in a query #120

Closed mvanbrab closed 3 months ago

mvanbrab commented 4 months ago

Currently, before the actual input SPARQL query is executed, the total number of items is computed by running a modified version of that query, rewritten into a query of the form SELECT COUNT (*) ... This only works for a subset of all possible input SPARQL queries. For example, the following input SPARQL query, when executed over two sources that contain triples with the same ?s and ?o but a different ?p, yields a count that is too high:

SELECT DISTINCT ?s ?o WHERE {
  ?s ?p ?o
}

while this one will count correctly:

SELECT DISTINCT ?s ?p ?o WHERE {
  ?s ?p ?o
}

This is easy to test in the test query "A test on DISTINCT LIMIT OFFSET" by replacing its input SPARQL query with each of the alternatives above.

According to @rubensworks, streaming all results of the input SPARQL query and counting them while the stream is consumed is no less efficient than running a separate count query.
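
A minimal sketch of that idea, not the viewer's actual code: the total is obtained by counting rows as the result stream is consumed, so no separate count query is needed. Everything named here (`Row`, `Triple`, `sourceA`, `sourceB`, `selectDistinctSO`, `countWhileStreaming`) is hypothetical, and the in-memory generator only stands in for whatever result stream the query engine returns. The sample data mirrors the case described above: two sources holding triples with the same ?s ?o but a different ?p.

```typescript
// Hypothetical row shape for the results of `SELECT DISTINCT ?s ?o`.
type Row = { s: string; o: string };
type Triple = Row & { p: string };

// Two in-memory "sources" holding triples with the same ?s ?o but a different ?p.
const sourceA: Triple[] = [{ s: "ex:a", p: "ex:p1", o: "ex:b" }];
const sourceB: Triple[] = [{ s: "ex:a", p: "ex:p2", o: "ex:b" }];

// Stand-in for the engine's result stream (an assumption for illustration):
// yields each distinct (?s, ?o) pair once across all sources.
async function* selectDistinctSO(...sources: Triple[][]): AsyncGenerator<Row> {
  const seen = new Set<string>();
  for (const source of sources) {
    for (const { s, o } of source) {
      const key = JSON.stringify([s, o]);
      if (!seen.has(key)) {
        seen.add(key);
        yield { s, o };
      }
    }
  }
}

// Consumes the stream, hands every row to the caller, and resolves to the total.
async function countWhileStreaming<T>(
  stream: AsyncIterable<T>,
  onRow: (row: T) => void = () => {},
): Promise<number> {
  let total = 0;
  for await (const row of stream) {
    onRow(row);
    total += 1;
  }
  return total;
}
```

With this sample data, `countWhileStreaming(selectDistinctSO(sourceA, sourceB))` resolves to 1, the number of rows the query really returns. The trade-off is that the total is only known once the stream has ended, so a UI built this way would have to show a running count until then.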