ad-freiburg / qlever

Very fast SPARQL Engine, which can handle very large knowledge graphs like the complete Wikidata, offers context-sensitive autocompletion for SPARQL queries, and allows combination with text search. It's faster than engines like Blazegraph or Virtuoso, especially for queries involving large result sets.
Apache License 2.0

Assertion `!dpTab[k - 1].empty()` failed in `QueryPlanner.cpp` #1487

Open dpriskorn opened 1 week ago

dpriskorn commented 1 week ago

https://qlever.cs.uni-freiburg.de/wikidata/9IJGud

> Assertion `!dpTab[k - 1].empty()` failed. Please report this to the developers. In file "/local/data-ssd/qlever/qlever-code/src/engine/QueryPlanner.cpp" at line 1228

hannahbast commented 1 week ago

@dpriskorn Thanks for pointing that out to us. The query is impossibly hard (the inner query is a cross-product of the whole index with itself, with a result size on the order of 10^21 rows, that is, 1000 billion billion). But, of course, it should not produce an assertion failure.
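As a rough sanity check on that estimate (assuming an index of about 30 billion triples, the figure mentioned later in this thread; the exact Wikidata index size differs):

```python
# Rough size of a cross-product of the full index with itself,
# assuming ~30 billion triples (an illustrative figure, not the exact count).
triples = 30_000_000_000
rows = triples * triples  # every triple paired with every triple
print(f"{rows:.0e}")      # 9e+20, i.e. on the order of 10**21
```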

May I ask what you intended to do with this query? You gave the variables names that suggest a certain semantics, but without associating these variables with any predicates or entities, they are just free variables that match everything.

tuukka commented 1 week ago

Here's a much easier query that should give the desired result (based on the variable names), but it runs out of memory when I try to run it: https://qlever.cs.uni-freiburg.de/wikidata/c0KH0F

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT (COUNT(DISTINCT ?item) AS ?itemsWithReferencedStatements)
WHERE {
    ?item ?property ?statement .
    ?statement prov:wasDerivedFrom ?reference .
}
```
hannahbast commented 1 week ago

@tuukka I agree it's easier than computing a join between two tables of size 30 billion, but it's still a join between a table of size 30 billion (all triples matching ?item ?property ?statement) and a table of size 1 billion (all triples matching ?statement prov:wasDerivedFrom ?reference). With the "lazy group by", "lazy join", etc. that we are currently working on, this will also work with limited RAM, but it's still a very costly query, which will take a long time to process.
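The "lazy join" idea can be sketched as a streaming merge join over inputs sorted by the join column: both sides are consumed as iterators, so neither table has to be fully materialized in RAM. A minimal Python sketch of the general technique, not QLever's actual implementation:

```python
def lazy_merge_join(left, right):
    """Stream the join of two iterables of (key, value) pairs, each sorted
    by key, yielding (key, left_value, right_value) without materializing
    either input. Matching groups per key are assumed to fit in memory."""
    it_l, it_r = iter(left), iter(right)
    l, r = next(it_l, None), next(it_r, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(it_l, None)
        elif l[0] > r[0]:
            r = next(it_r, None)
        else:
            key = l[0]
            # collect the group of matching rows on each side
            group_l = [l[1]]
            l = next(it_l, None)
            while l is not None and l[0] == key:
                group_l.append(l[1]); l = next(it_l, None)
            group_r = [r[1]]
            r = next(it_r, None)
            while r is not None and r[0] == key:
                group_r.append(r[1]); r = next(it_r, None)
            for vl in group_l:
                for vr in group_r:
                    yield (key, vl, vr)

# Tiny example with hypothetical data, keyed by ?statement:
statements = [("s1", "itemA"), ("s2", "itemB"), ("s3", "itemC")]
references = [("s1", "ref1"), ("s3", "ref2"), ("s3", "ref3")]
print(list(lazy_merge_join(statements, references)))
# -> [('s1', 'itemA', 'ref1'), ('s3', 'itemC', 'ref2'), ('s3', 'itemC', 'ref3')]
```

Streaming lowers the memory bound, but as noted above, both inputs still have to be scanned in full, so the query remains costly in time.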

tuukka commented 1 week ago

Here's a query that avoids the join and the 30 billion triple table, but it still runs out of memory: https://qlever.cs.uni-freiburg.de/wikidata/MpDkiF

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wds: <http://www.wikidata.org/entity/statement/>

SELECT (COUNT(DISTINCT ?item) AS ?itemsWithReferencedStatements)
WHERE {
    ?statement prov:wasDerivedFrom ?reference .
    BIND(STRAFTER(STRBEFORE(STR(?statement), "-"), STR(wds:)) AS ?item)
}
```
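The BIND relies on the shape of Wikidata statement IRIs, which look like wds:&lt;item id&gt;-&lt;UUID&gt;: STRBEFORE cuts at the first "-", then STRAFTER strips the wds: prefix, leaving just the item id. A small Python sketch of that string manipulation, using a made-up UUID for illustration:

```python
# Mirrors STRAFTER(STRBEFORE(STR(?statement), "-"), STR(wds:)) from the
# SPARQL query above. The example UUID is hypothetical.
WDS = "http://www.wikidata.org/entity/statement/"

def item_of(statement_iri: str) -> str:
    before_dash = statement_iri.split("-", 1)[0]         # STRBEFORE(..., "-")
    if before_dash.startswith(WDS):                      # STRAFTER(..., STR(wds:))
        return before_dash[len(WDS):]
    return ""  # SPARQL's STRAFTER yields "" when the prefix does not occur

print(item_of(WDS + "Q42-D8404CDA-25E4-4334-AF13-A3290BCD9C0F"))  # -> Q42
```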