ad-freiburg / qlever

Very fast SPARQL Engine, which can handle very large knowledge graphs like the complete Wikidata, offers context-sensitive autocompletion for SPARQL queries, and allows combination with text search. It's faster than engines like Blazegraph or Virtuoso, especially for queries involving large result sets.
Apache License 2.0

query causing QLever to crash #1606

Open pfps opened 2 weeks ago

pfps commented 2 weeks ago

This query appears to cause the QLever Wikidata service to crash:

SELECT DISTINCT ?first ?firstLabel WHERE {
  { SELECT DISTINCT ?c WHERE {
      ?c wdt:P31/wdt:P279/wdt:P31/wdt:P279/wdt:P31 wd:Q24017465 .
  } }
  ?first wdt:P279* ?c .
  OPTIONAL {
    ?first rdfs:label ?firstLabel .
    FILTER ( lang(?firstLabel) = 'en' )
  }
}

On my local QLever server, evaluating the query causes unbounded memory usage.

I think that this query used to work a few months ago.

pfps commented 2 weeks ago

SELECT DISTINCT ?first ?firstLabel WHERE {
  ?c wdt:P31/wdt:P279/wdt:P31/wdt:P279/wdt:P31 wd:Q24017465 .
  ?first wdt:P279* ?c .
  OPTIONAL {
    ?first rdfs:label ?firstLabel .
    FILTER ( lang(?firstLabel) = 'en' )
  }
}

may cause the same problem.
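
For anyone who wants to reproduce this against their own instance, here is a minimal sketch that sends the second query over the standard SPARQL protocol (HTTP GET with a `query` parameter) with a client-side timeout. The endpoint URL, the port, and the helper names `build_request` and `run_query` are assumptions for illustration, not part of QLever. The PREFIX lines are included because a raw endpoint may not predefine them the way the web UI does.

```python
import urllib.parse
import urllib.request

# Assumption: a local QLever server listening here; the port is hypothetical.
ENDPOINT = "http://localhost:7001"

PREFIXES = """\
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
"""

# The second query from this thread, unchanged apart from line breaks.
QUERY = PREFIXES + """\
SELECT DISTINCT ?first ?firstLabel WHERE {
  ?c wdt:P31/wdt:P279/wdt:P31/wdt:P279/wdt:P31 wd:Q24017465 .
  ?first wdt:P279* ?c .
  OPTIONAL { ?first rdfs:label ?firstLabel . FILTER ( lang(?firstLabel) = 'en' ) }
}
"""


def build_request(endpoint, query):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint, query, timeout_s=60):
    """Send the query; the timeout only stops the client from waiting,
    it does not cancel evaluation on the server."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")
```

Calling `run_query(ENDPOINT, QUERY)` while watching the server's memory in another terminal should show whether a given build is affected.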

pfps commented 2 weeks ago

I'm also seeing queries that do not cause a memory panic but have a resident memory footprint of 180GB on a 196GB machine when the max memory size is set at 96GB and the max cache size is set at 32GB. Is this expected?
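
To put a number on the gap, here is a small sketch that reads the resident set size from the text of `/proc/<pid>/status` on Linux and compares it with the configured limits. It assumes the memory and cache limits add up to a single budget, which may not match how QLever accounts for memory. The function names are hypothetical, and GB and GiB are treated loosely.

```python
def vm_rss_gib(status_text):
    """Return the VmRSS value from /proc/<pid>/status text in GiB, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            kib = int(line.split()[1])  # the kernel reports this field in kB
            return kib / (1024 ** 2)
    return None


def over_budget(resident, max_memory, max_cache):
    """Amount by which resident memory exceeds the two limits combined.

    Assumption: the limits are additive; a positive result means over budget.
    """
    return resident - (max_memory + max_cache)


# Figures reported above: 180 resident, limits of 96 (memory) and 32 (cache).
excess = over_budget(180, 96, 32)
```

With the figures reported here, `excess` is 52, so the process sits well above the combined limits even under this generous reading.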

hannahbast commented 4 days ago

@pfps Thanks for pointing this out to us and for the reminder. That is indeed a recent regression and should not happen. We will investigate asap.

Notifying @RobinTF and @joka921

hannahbast commented 4 days ago

We have identified https://github.com/ad-freiburg/qlever/pull/1595 as the cause of the problem. If it's reverted (which I have now done for https://qlever.cs.uni-freiburg.de/wikidata), a query like https://qlever.cs.uni-freiburg.de/wikidata/Gs948g runs through without problems (even with a rather small amount of RAM).

We will investigate further.