nyurik opened this issue 5 years ago (status: Open)
Can you paste in the EXPLAIN of this query? There is a pipelined (streamed) version of COUNT(*). You will see whether or not it is in use in the query plan.
Bryan
On Mon, Dec 3, 2018 at 12:23 PM Yuri Astrakhan notifications@github.com wrote:
I am trying to count the number of items per group. There are about 45,000 groups, and the total number of items is in the billions. Ideally, this query should internally build a hash map of counts, allocating under 100,000 integer counters, yet it seems to generate a full list of items for each bucket and obviously runs out of memory. Is there a way to optimize this, or should this be a feature request for Blazegraph?
SELECT ?osmt (count(*) as ?count) where { ?s1 osmm:key ?osmt. ?s2 ?osmt ?v. } group by ?osmt
(Issue: https://github.com/blazegraph/database/issues/108)
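The aggregation strategy described above — keeping one integer counter per group rather than materializing the list of items in each bucket — can be sketched as follows. This is an illustrative Python sketch of the expected memory profile, not Blazegraph's actual implementation:

```python
from collections import Counter

def streamed_group_count(bindings):
    """Consume (osmt, item) solutions one at a time, keeping only one
    integer counter per group. Memory is O(number of groups), not
    O(number of items), so ~45,000 groups over billions of items would
    still hold only ~45,000 counters."""
    counts = Counter()
    for osmt, _item in bindings:
        counts[osmt] += 1
    return dict(counts)

print(streamed_group_count([("highway", "n1"), ("highway", "n2"), ("name", "n3")]))
# {'highway': 2, 'name': 1}
```

This is what a pipelined/streamed COUNT should do; the question in the thread is why the engine nonetheless exhausts memory.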
Thanks @thompsonbry! The eval stats (last table) do suggest it is using PipelinedAggregationOp, yet it still fails with OOM on a beefy 128 GB RAM / 12-core / 2 TB SSD machine. The Java max heap is set to 16 GB, just like on Wikidata. All of the info:
QueryContainer
SelectQuery
Select
ProjectionElem
Var (osmt)
ProjectionElem
Count
Var (count)
WhereClause
GraphPatternGroup
BasicGraphPattern
TriplesSameSubjectPath
Var (s1)
PropertyListPath
PathAlternative
PathSequence
PathElt
IRI (https://www.openstreetmap.org/meta/key)
ObjectList
Var (osmt)
TriplesSameSubjectPath
Var (s2)
PropertyListPath
Var (osmt)
ObjectList
Var (v)
GroupClause
GroupCondition
Var (osmt)
AST:
QueryType: SELECT
includeInferred=true
timeout=600000
SELECT VarNode(osmt) ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) )
JoinGroupNode {
StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS]
StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS]
}
group by VarNode(osmt)
Optimized AST:
QueryType: SELECT
includeInferred=true
timeout=600000
SELECT ( VarNode(osmt) AS VarNode(osmt) ) ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) )
JoinGroupNode {
StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS]
AST2BOpBase.estimatedCardinality=4661
AST2BOpBase.originalIndex=POS
StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS]
AST2BOpBase.estimatedCardinality=6650329986
AST2BOpBase.originalIndex=SPO
}
group by ( VarNode(osmt) AS VarNode(osmt) )
Query plan
com.bigdata.bop.rdf.join.ChunkedMaterializationOp[7](ProjectionOp[6])[ ChunkedMaterializationOp.vars=[osmt, count], IPredicate.relationName=[wdq.lex], IPredicate.timestamp=1544121261754, ChunkedMaterializationOp.materializeAll=true, PipelineOp.sharedState=true, BOp.bopId=7, BOp.timeout=600000, BOp.namespace=wdq, QueryEngine.queryId=865d39fb-f890-4baa-a50f-3ef7f7a67b76, QueryEngine.chunkHandler=com.bigdata.bop.engine.NativeHeapStandloneChunkHandler@44c109cc]
com.bigdata.bop.solutions.ProjectionOp[6](PipelinedAggregationOp[5])[ BOp.bopId=6, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[osmt, count]]
com.bigdata.bop.solutions.PipelinedAggregationOp[5](PipelineJoin[4])[ BOp.bopId=5, BOp.evaluationContext=CONTROLLER, PipelineOp.pipelined=true, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, GroupByOp.groupByState=GroupByState{select=[com.bigdata.bop.Bind(osmt,osmt), com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=[com.bigdata.bop.Bind(osmt,osmt)],having=null}, GroupByOp.groupByRewrite=GroupByRewriter{aggExpr={com.bigdata.bop.rdf.aggregate.COUNT(*)=d87d5f2c-7b32-4b51-b24d-75a76dd0d25f},select2=[com.bigdata.bop.Bind(osmt,osmt), com.bigdata.bop.Bind(count,d87d5f2c-7b32-4b51-b24d-75a76dd0d25f)],having2=null}, PipelineOp.lastPass=true]
com.bigdata.bop.join.PipelineJoin[4](PipelineJoin[2])[ BOp.bopId=4, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[3](s2=null, osmt=null, v=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544121261754, BOp.bopId=3, AST2BOpBase.estimatedCardinality=6650329986, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s1=null, TermId(119468018U)[https://www.openstreetmap.org/meta/key], osmt=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544121261754, BOp.bopId=1, AST2BOpBase.estimatedCardinality=4661, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
Eval stats:
evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio |
---|---|---|---|---|---|---|---|---|---|
total | total | | | | 49853 | 9383526 | 4871826 | 0 | 0.519189268511645 |
0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4661 | 0 | 0 | 0 | 0 | N/A |
1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6650329986 | 36600 | 300 | 4871826 | 0 | 16239.42 |
2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(osmt,osmt), com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=[com.bigdata.bop.Bind(osmt,osmt)],having=null} | | | 13253 | 9383526 | 0 | 0 | |
3 | ProjectionOp[6] | [osmt, count] | | | 0 | 0 | 0 | 0 | N/A |
4 | ChunkedMaterializationOp[7] | vars=[osmt, count],materializeInlineIVs=true | | | 0 | 0 | 0 | 0 | N/A |
Exception
SPARQL-QUERY: queryStr=SELECT ?osmt (count(*) as ?count) where {
?s1 osmm:key ?osmt.
?s2 ?osmt ?v.
} group by ?osmt
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:293)
at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:679)
at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:290)
at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:271)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:337)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:503)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:890)
at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:696)
at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188)
at info.aduna.iteration.IterationWrapper.hasNext(IterationWrapper.java:68)
at org.openrdf.query.QueryResults.report(QueryResults.java:155)
at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:76)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1713)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1569)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1534)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:747)
... 4 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:59)
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.close(RunningQueryCloseableIterator.java:73)
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.hasNext(RunningQueryCloseableIterator.java:82)
at com.bigdata.striterator.ChunkedWrappedIterator.hasNext(ChunkedWrappedIterator.java:197)
at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134)
... 11 more
Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.util.concurrent.Haltable.get(Haltable.java:273)
at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:1516)
at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:104)
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:46)
... 15 more
Caused by: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1367)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:926)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:821)
... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1347)
... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:682)
at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:382)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1346)
... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1027)
at com.bigdata.bop.join.PipelineJoin$JoinTask.consumeSource(PipelineJoin.java:739)
at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:623)
... 12 more
Caused by: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1961)
at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.call(PipelineJoin.java:1684)
at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.executeTasks(PipelineJoin.java:1392)
at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1016)
... 14 more
Caused by: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.rwstore.sector.MemoryManager.getSectorFromFreeList(MemoryManager.java:646)
at com.bigdata.rwstore.sector.MemoryManager.allocate(MemoryManager.java:675)
at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:195)
at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:169)
at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:159)
at com.bigdata.rwstore.sector.AllocationContext.alloc(AllocationContext.java:359)
at com.bigdata.rwstore.PSOutputStream.save(PSOutputStream.java:335)
at com.bigdata.rwstore.PSOutputStream.getAddr(PSOutputStream.java:416)
at com.bigdata.bop.solutions.SolutionSetStream.put(SolutionSetStream.java:297)
at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:213)
at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:147)
at com.bigdata.bop.engine.StandaloneChunkHandler.handleChunk(StandaloneChunkHandler.java:92)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.outputChunk(ChunkedRunningQuery.java:1699)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.addReorderAllowed(ChunkedRunningQuery.java:1628)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1569)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1453)
at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:59)
at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:14)
at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.overflow(AbstractUnsynchronizedArrayBuffer.java:287)
at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add2(AbstractUnsynchronizedArrayBuffer.java:215)
at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add(AbstractUnsynchronizedArrayBuffer.java:173)
at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1868)
... 17 more
Update: I realized that even a non-grouping count also runs out of memory:
SELECT (count(*) as ?count) where {
?s1 osmm:key ?osmt.
?s2 ?osmt ?v.
}
Parse Tree
QueryContainer
SelectQuery
Select
ProjectionElem
Count
Var (count)
WhereClause
GraphPatternGroup
BasicGraphPattern
TriplesSameSubjectPath
Var (s1)
PropertyListPath
PathAlternative
PathSequence
PathElt
IRI (https://www.openstreetmap.org/meta/key)
ObjectList
Var (osmt)
TriplesSameSubjectPath
Var (s2)
PropertyListPath
Var (osmt)
ObjectList
Var (v)
Original AST
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX sesame: <http://www.openrdf.org/schema/sesame#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX fn: <http://www.w3.org/2005/xpath-functions#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX hint: <http://www.bigdata.com/queryHints#> PREFIX bd: <http://www.bigdata.com/rdf#> PREFIX bds: <http://www.bigdata.com/rdf/search#> PREFIX osmroot: <https://www.openstreetmap.org> PREFIX osmnode: <https://www.openstreetmap.org/node/> PREFIX osmway: <https://www.openstreetmap.org/way/> PREFIX osmrel: <https://www.openstreetmap.org/relation/> PREFIX osmm: <https://www.openstreetmap.org/meta/> PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:> PREFIX pageviews: <https://dumps.wikimedia.org/other/pageviews/> PREFIX osmd: <http://wiki.openstreetmap.org/entity/> PREFIX osmdt: <http://wiki.openstreetmap.org/prop/direct/> PREFIX osmds: <http://wiki.openstreetmap.org/entity/statement/> PREFIX osmp: <http://wiki.openstreetmap.org/prop/> PREFIX osmdref: <http://wiki.openstreetmap.org/reference/> PREFIX osmdv: <http://wiki.openstreetmap.org/value/> PREFIX osmps: <http://wiki.openstreetmap.org/prop/statement/> PREFIX osmpsv: <http://wiki.openstreetmap.org/prop/statement/value/> PREFIX osmpsn: <http://wiki.openstreetmap.org/prop/statement/value-normalized/> PREFIX osmpq: <http://wiki.openstreetmap.org/prop/qualifier/> PREFIX osmpqv: <http://wiki.openstreetmap.org/prop/qualifier/value/> PREFIX osmpqn: <http://wiki.openstreetmap.org/prop/qualifier/value-normalized/> PREFIX osmpr: <http://wiki.openstreetmap.org/prop/reference/> PREFIX osmprv: <http://wiki.openstreetmap.org/prop/reference/value/> PREFIX osmprn: <http://wiki.openstreetmap.org/prop/reference/value-normalized/> PREFIX osmdno: <http://wiki.openstreetmap.org/prop/novalue/> PREFIX osmdata: 
<http://wiki.openstreetmap.org/wiki/Special:EntityData/> PREFIX wd: <http://www.wikidata.org/entity/> PREFIX wdt: <http://www.wikidata.org/prop/direct/> PREFIX wds: <http://www.wikidata.org/entity/statement/> PREFIX p: <http://www.wikidata.org/prop/> PREFIX wdref: <http://www.wikidata.org/reference/> PREFIX wdv: <http://www.wikidata.org/value/> PREFIX ps: <http://www.wikidata.org/prop/statement/> PREFIX psv: <http://www.wikidata.org/prop/statement/value/> PREFIX psn: <http://www.wikidata.org/prop/statement/value-normalized/> PREFIX pq: <http://www.wikidata.org/prop/qualifier/> PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/> PREFIX pqn: <http://www.wikidata.org/prop/qualifier/value-normalized/> PREFIX pr: <http://www.wikidata.org/prop/reference/> PREFIX prv: <http://www.wikidata.org/prop/reference/value/> PREFIX prn: <http://www.wikidata.org/prop/reference/value-normalized/> PREFIX wdno: <http://www.wikidata.org/prop/novalue/> PREFIX wdata: <http://www.wikidata.org/wiki/Special:EntityData/> PREFIX wdtn: <http://wiki.openstreetmap.org/prop/direct-normalized/> PREFIX wikibase: <http://wikiba.se/ontology#> PREFIX schema: <http://schema.org/> PREFIX prov: <http://www.w3.org/ns/prov#> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX geo: <http://www.opengis.net/ont/geosparql#> PREFIX geof: <http://www.opengis.net/def/geosparql/function/> PREFIX mediawiki: <https://www.mediawiki.org/ontology#> PREFIX mwapi: <https://www.mediawiki.org/ontology#API/> PREFIX gas: <http://www.bigdata.com/rdf/gas#> PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#> PREFIX dct: <http://purl.org/dc/terms/> QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), 
ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] }
Optimized AST
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX sesame: <http://www.openrdf.org/schema/sesame#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX fn: <http://www.w3.org/2005/xpath-functions#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX hint: <http://www.bigdata.com/queryHints#> PREFIX bd: <http://www.bigdata.com/rdf#> PREFIX bds: <http://www.bigdata.com/rdf/search#> PREFIX osmroot: <https://www.openstreetmap.org> PREFIX osmnode: <https://www.openstreetmap.org/node/> PREFIX osmway: <https://www.openstreetmap.org/way/> PREFIX osmrel: <https://www.openstreetmap.org/relation/> PREFIX osmm: <https://www.openstreetmap.org/meta/> PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:> PREFIX pageviews: <https://dumps.wikimedia.org/other/pageviews/> PREFIX osmd: <http://wiki.openstreetmap.org/entity/> PREFIX osmdt: <http://wiki.openstreetmap.org/prop/direct/> PREFIX osmds: <http://wiki.openstreetmap.org/entity/statement/> PREFIX osmp: <http://wiki.openstreetmap.org/prop/> PREFIX osmdref: <http://wiki.openstreetmap.org/reference/> PREFIX osmdv: <http://wiki.openstreetmap.org/value/> PREFIX osmps: <http://wiki.openstreetmap.org/prop/statement/> PREFIX osmpsv: <http://wiki.openstreetmap.org/prop/statement/value/> PREFIX osmpsn: <http://wiki.openstreetmap.org/prop/statement/value-normalized/> PREFIX osmpq: <http://wiki.openstreetmap.org/prop/qualifier/> PREFIX osmpqv: <http://wiki.openstreetmap.org/prop/qualifier/value/> PREFIX osmpqn: <http://wiki.openstreetmap.org/prop/qualifier/value-normalized/> PREFIX osmpr: <http://wiki.openstreetmap.org/prop/reference/> PREFIX osmprv: <http://wiki.openstreetmap.org/prop/reference/value/> PREFIX osmprn: <http://wiki.openstreetmap.org/prop/reference/value-normalized/> PREFIX osmdno: <http://wiki.openstreetmap.org/prop/novalue/> PREFIX osmdata: 
<http://wiki.openstreetmap.org/wiki/Special:EntityData/> PREFIX wd: <http://www.wikidata.org/entity/> PREFIX wdt: <http://www.wikidata.org/prop/direct/> PREFIX wds: <http://www.wikidata.org/entity/statement/> PREFIX p: <http://www.wikidata.org/prop/> PREFIX wdref: <http://www.wikidata.org/reference/> PREFIX wdv: <http://www.wikidata.org/value/> PREFIX ps: <http://www.wikidata.org/prop/statement/> PREFIX psv: <http://www.wikidata.org/prop/statement/value/> PREFIX psn: <http://www.wikidata.org/prop/statement/value-normalized/> PREFIX pq: <http://www.wikidata.org/prop/qualifier/> PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/> PREFIX pqn: <http://www.wikidata.org/prop/qualifier/value-normalized/> PREFIX pr: <http://www.wikidata.org/prop/reference/> PREFIX prv: <http://www.wikidata.org/prop/reference/value/> PREFIX prn: <http://www.wikidata.org/prop/reference/value-normalized/> PREFIX wdno: <http://www.wikidata.org/prop/novalue/> PREFIX wdata: <http://www.wikidata.org/wiki/Special:EntityData/> PREFIX wdtn: <http://wiki.openstreetmap.org/prop/direct-normalized/> PREFIX wikibase: <http://wikiba.se/ontology#> PREFIX schema: <http://schema.org/> PREFIX prov: <http://www.w3.org/ns/prov#> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX geo: <http://www.opengis.net/ont/geosparql#> PREFIX geof: <http://www.opengis.net/def/geosparql/function/> PREFIX mediawiki: <https://www.mediawiki.org/ontology#> PREFIX mwapi: <https://www.mediawiki.org/ontology#API/> PREFIX gas: <http://www.bigdata.com/rdf/gas#> PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#> PREFIX dct: <http://purl.org/dc/terms/> QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), 
ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=4661 AST2BOpBase.originalIndex=POS StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=6652258130 AST2BOpBase.originalIndex=SPO }
Query Plan
com.bigdata.bop.rdf.join.ChunkedMaterializationOp[7](ProjectionOp[6])[ ChunkedMaterializationOp.vars=[count], IPredicate.relationName=[wdq.lex], IPredicate.timestamp=1544201488993, ChunkedMaterializationOp.materializeAll=true, PipelineOp.sharedState=true, BOp.bopId=7, BOp.timeout=600000, BOp.namespace=wdq, QueryEngine.queryId=3d016634-f3d7-4320-a3d4-7dea1f1f660f, QueryEngine.chunkHandler=com.bigdata.bop.engine.NativeHeapStandloneChunkHandler@44c109cc]
com.bigdata.bop.solutions.ProjectionOp[6](PipelinedAggregationOp[5])[ BOp.bopId=6, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[count]]
com.bigdata.bop.solutions.PipelinedAggregationOp[5](PipelineJoin[4])[ BOp.bopId=5, BOp.evaluationContext=CONTROLLER, PipelineOp.pipelined=true, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, GroupByOp.groupByState=GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null}, GroupByOp.groupByRewrite=GroupByRewriter{aggExpr={com.bigdata.bop.rdf.aggregate.COUNT(*)=42c39291-b6dd-4410-9c5e-4067e89c2cab},select2=[com.bigdata.bop.Bind(count,42c39291-b6dd-4410-9c5e-4067e89c2cab)],having2=null}, PipelineOp.lastPass=true]
com.bigdata.bop.join.PipelineJoin[4](PipelineJoin[2])[ BOp.bopId=4, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[3](s2=null, osmt=null, v=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=3, AST2BOpBase.estimatedCardinality=6652258130, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s1=null, TermId(119468018U)[https://www.openstreetmap.org/meta/key], osmt=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=1, AST2BOpBase.estimatedCardinality=4661, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
Query Evaluation Statistics
evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio |
---|---|---|---|---|---|---|---|---|---|
total | total | | | | 338682 | 75290133 | 39846393 | 0 | 0.52923791488056 |
0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4661 | 13710 | 1 | 4661 | 0 | 4661 |
1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6652258130 | 237526 | 3500 | 39841732 | 0 | 11383.352 |
2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null} | | | 87446 | 75286732 | 0 | 0 | |
3 | ProjectionOp[6] | [count] | | | 0 | 0 | 0 | 0 | N/A |
4 | ChunkedMaterializationOp[7] | vars=[count],materializeInlineIVs=true | | | 0 | 0 | 0 | 0 | N/A |
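Not suggested in the thread itself, but one client-side way to bound memory while the single big aggregate fails is to split the work: fetch the ~45,000 key IRIs first (the cheap `osmm:key` pattern), then issue one small COUNT per predicate, so each query's intermediate solution set stays tiny. A minimal, hypothetical helper for building those per-key queries (the function name and example IRI are illustrative):

```python
def per_key_count_query(key_iri):
    """Build a COUNT query scoped to a single predicate IRI.

    Each query touches only the statements for one key, so the engine
    never has to buffer billions of solutions at once. The doubled
    braces produce literal SPARQL braces in the f-string."""
    return f"SELECT (COUNT(*) AS ?count) WHERE {{ ?s <{key_iri}> ?v . }}"

print(per_key_count_query("https://wiki.openstreetmap.org/wiki/Key:highway"))
```

Running one such query per key trades a single huge scan for many small, index-friendly range counts.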
What is the Java heap size? How much direct memory is assigned?
What is the error that you receive with these queries? Is it the same message? Please include the stack traces.
Blazegraph uses both the Java managed object heap and the native heap. However, the amount of each that is available to Blazegraph is determined by how you start the JVM, so you could be running out of either the native heap or the JVM managed heap, depending on the query. Blazegraph also has an analytic mode and a non-analytic mode; they differ in whether the underlying operators target the native heap (analytic) or the managed heap (non-analytic).
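Since both pools matter here, the distinction shows up directly in how the JVM is launched. The flags below are standard HotSpot options; the 16g/32g values and jar name are illustrative, not a recommendation:

```shell
# -Xmx bounds the managed object heap (used by non-analytic operators).
# -XX:MaxDirectMemorySize bounds native/direct memory, which the
# analytic-mode operators and the MemoryManager allocate from.
java -server \
     -Xmx16g \
     -XX:MaxDirectMemorySize=32g \
     -jar blazegraph.jar
```

With only -Xmx set, direct memory defaults are JVM-dependent, so an analytic-mode query can hit MemoryManagerOutOfMemory even when the object heap is mostly idle.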
This is the code which should be handling the query plan above.
Bryan
On Fri, Dec 7, 2018 at 9:09 AM Yuri Astrakhan notifications@github.com wrote:
Update: I realized that even a non-grouping count also runs out of memory:
SELECT (count(*) as ?count) where { ?s1 osmm:key ?osmt. ?s2 ?osmt ?v. }
Parse Tree QueryContainer SelectQuery Select ProjectionElem Count Var (count) WhereClause GraphPatternGroup BasicGraphPattern TriplesSameSubjectPath Var (s1) PropertyListPath PathAlternative PathSequence PathElt IRI (https://www.openstreetmap.org/meta/key) ObjectList Var (osmt) TriplesSameSubjectPath Var (s2) PropertyListPath Var (osmt) ObjectList Var (v)
Original AST PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema# PREFIX sesame: http://www.openrdf.org/schema/sesame# PREFIX owl: http://www.w3.org/2002/07/owl# PREFIX xsd: http://www.w3.org/2001/XMLSchema# PREFIX fn: http://www.w3.org/2005/xpath-functions# PREFIX foaf: http://xmlns.com/foaf/0.1/ PREFIX dc: http://purl.org/dc/elements/1.1/ PREFIX hint: http://www.bigdata.com/queryHints# PREFIX bd: http://www.bigdata.com/rdf# PREFIX bds: http://www.bigdata.com/rdf/search# PREFIX osmroot: https://www.openstreetmap.org PREFIX osmnode: https://www.openstreetmap.org/node/ PREFIX osmway: https://www.openstreetmap.org/way/ PREFIX osmrel: https://www.openstreetmap.org/relation/ PREFIX osmm: https://www.openstreetmap.org/meta/ PREFIX osmt: https://wiki.openstreetmap.org/wiki/Key: PREFIX pageviews: https://dumps.wikimedia.org/other/pageviews/ PREFIX osmd: http://wiki.openstreetmap.org/entity/ PREFIX osmdt: http://wiki.openstreetmap.org/prop/direct/ PREFIX osmds: http://wiki.openstreetmap.org/entity/statement/ PREFIX osmp: http://wiki.openstreetmap.org/prop/ PREFIX osmdref: http://wiki.openstreetmap.org/reference/ PREFIX osmdv: http://wiki.openstreetmap.org/value/ PREFIX osmps: http://wiki.openstreetmap.org/prop/statement/ PREFIX osmpsv: http://wiki.openstreetmap.org/prop/statement/value/ PREFIX osmpsn: http://wiki.openstreetmap.org/prop/statement/value-normalized/ PREFIX osmpq: http://wiki.openstreetmap.org/prop/qualifier/ PREFIX osmpqv: http://wiki.openstreetmap.org/prop/qualifier/value/ PREFIX osmpqn: http://wiki.openstreetmap.org/prop/qualifier/value-normalized/ PREFIX osmpr: http://wiki.openstreetmap.org/prop/reference/ PREFIX osmprv: http://wiki.openstreetmap.org/prop/reference/value/ PREFIX osmprn: http://wiki.openstreetmap.org/prop/reference/value-normalized/ PREFIX osmdno: http://wiki.openstreetmap.org/prop/novalue/ PREFIX osmdata: http://wiki.openstreetmap.org/wiki/Special:EntityData/ PREFIX wd: 
http://www.wikidata.org/entity/ PREFIX wdt: http://www.wikidata.org/prop/direct/ PREFIX wds: http://www.wikidata.org/entity/statement/ PREFIX p: http://www.wikidata.org/prop/ PREFIX wdref: http://www.wikidata.org/reference/ PREFIX wdv: http://www.wikidata.org/value/ PREFIX ps: http://www.wikidata.org/prop/statement/ PREFIX psv: http://www.wikidata.org/prop/statement/value/ PREFIX psn: http://www.wikidata.org/prop/statement/value-normalized/ PREFIX pq: http://www.wikidata.org/prop/qualifier/ PREFIX pqv: http://www.wikidata.org/prop/qualifier/value/ PREFIX pqn: http://www.wikidata.org/prop/qualifier/value-normalized/ PREFIX pr: http://www.wikidata.org/prop/reference/ PREFIX prv: http://www.wikidata.org/prop/reference/value/ PREFIX prn: http://www.wikidata.org/prop/reference/value-normalized/ PREFIX wdno: http://www.wikidata.org/prop/novalue/ PREFIX wdata: http://www.wikidata.org/wiki/Special:EntityData/ PREFIX wdtn: http://wiki.openstreetmap.org/prop/direct-normalized/ PREFIX wikibase: http://wikiba.se/ontology# PREFIX schema: http://schema.org/ PREFIX prov: http://www.w3.org/ns/prov# PREFIX skos: http://www.w3.org/2004/02/skos/core# PREFIX geo: http://www.opengis.net/ont/geosparql# PREFIX geof: http://www.opengis.net/def/geosparql/function/ PREFIX mediawiki: https://www.mediawiki.org/ontology# PREFIX mwapi: https://www.mediawiki.org/ontology#API/ PREFIX gas: http://www.bigdata.com/rdf/gas# PREFIX ontolex: http://www.w3.org/ns/lemon/ontolex# PREFIX dct: http://purl.org/dc/terms/ QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode())[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT()] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] StatementPatternNode(VarNode(s2), 
VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] }
Optimized AST PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema# PREFIX sesame: http://www.openrdf.org/schema/sesame# PREFIX owl: http://www.w3.org/2002/07/owl# PREFIX xsd: http://www.w3.org/2001/XMLSchema# PREFIX fn: http://www.w3.org/2005/xpath-functions# PREFIX foaf: http://xmlns.com/foaf/0.1/ PREFIX dc: http://purl.org/dc/elements/1.1/ PREFIX hint: http://www.bigdata.com/queryHints# PREFIX bd: http://www.bigdata.com/rdf# PREFIX bds: http://www.bigdata.com/rdf/search# PREFIX osmroot: https://www.openstreetmap.org PREFIX osmnode: https://www.openstreetmap.org/node/ PREFIX osmway: https://www.openstreetmap.org/way/ PREFIX osmrel: https://www.openstreetmap.org/relation/ PREFIX osmm: https://www.openstreetmap.org/meta/ PREFIX osmt: https://wiki.openstreetmap.org/wiki/Key: PREFIX pageviews: https://dumps.wikimedia.org/other/pageviews/ PREFIX osmd: http://wiki.openstreetmap.org/entity/ PREFIX osmdt: http://wiki.openstreetmap.org/prop/direct/ PREFIX osmds: http://wiki.openstreetmap.org/entity/statement/ PREFIX osmp: http://wiki.openstreetmap.org/prop/ PREFIX osmdref: http://wiki.openstreetmap.org/reference/ PREFIX osmdv: http://wiki.openstreetmap.org/value/ PREFIX osmps: http://wiki.openstreetmap.org/prop/statement/ PREFIX osmpsv: http://wiki.openstreetmap.org/prop/statement/value/ PREFIX osmpsn: http://wiki.openstreetmap.org/prop/statement/value-normalized/ PREFIX osmpq: http://wiki.openstreetmap.org/prop/qualifier/ PREFIX osmpqv: http://wiki.openstreetmap.org/prop/qualifier/value/ PREFIX osmpqn: http://wiki.openstreetmap.org/prop/qualifier/value-normalized/ PREFIX osmpr: http://wiki.openstreetmap.org/prop/reference/ PREFIX osmprv: http://wiki.openstreetmap.org/prop/reference/value/ PREFIX osmprn: http://wiki.openstreetmap.org/prop/reference/value-normalized/ PREFIX osmdno: http://wiki.openstreetmap.org/prop/novalue/ PREFIX osmdata: http://wiki.openstreetmap.org/wiki/Special:EntityData/ PREFIX wd: 
http://www.wikidata.org/entity/ PREFIX wdt: http://www.wikidata.org/prop/direct/ PREFIX wds: http://www.wikidata.org/entity/statement/ PREFIX p: http://www.wikidata.org/prop/ PREFIX wdref: http://www.wikidata.org/reference/ PREFIX wdv: http://www.wikidata.org/value/ PREFIX ps: http://www.wikidata.org/prop/statement/ PREFIX psv: http://www.wikidata.org/prop/statement/value/ PREFIX psn: http://www.wikidata.org/prop/statement/value-normalized/ PREFIX pq: http://www.wikidata.org/prop/qualifier/ PREFIX pqv: http://www.wikidata.org/prop/qualifier/value/ PREFIX pqn: http://www.wikidata.org/prop/qualifier/value-normalized/ PREFIX pr: http://www.wikidata.org/prop/reference/ PREFIX prv: http://www.wikidata.org/prop/reference/value/ PREFIX prn: http://www.wikidata.org/prop/reference/value-normalized/ PREFIX wdno: http://www.wikidata.org/prop/novalue/ PREFIX wdata: http://www.wikidata.org/wiki/Special:EntityData/ PREFIX wdtn: http://wiki.openstreetmap.org/prop/direct-normalized/ PREFIX wikibase: http://wikiba.se/ontology# PREFIX schema: http://schema.org/ PREFIX prov: http://www.w3.org/ns/prov# PREFIX skos: http://www.w3.org/2004/02/skos/core# PREFIX geo: http://www.opengis.net/ont/geosparql# PREFIX geof: http://www.opengis.net/def/geosparql/function/ PREFIX mediawiki: https://www.mediawiki.org/ontology# PREFIX mwapi: https://www.mediawiki.org/ontology#API/ PREFIX gas: http://www.bigdata.com/rdf/gas# PREFIX ontolex: http://www.w3.org/ns/lemon/ontolex# PREFIX dct: http://purl.org/dc/terms/ QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode())[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT()] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=4661 
AST2BOpBase.originalIndex=POS StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=6652258130 AST2BOpBase.originalIndex=SPO }
Query Plan
com.bigdata.bop.rdf.join.ChunkedMaterializationOp[7](ProjectionOp[6])[ ChunkedMaterializationOp.vars=[count], IPredicate.relationName=[wdq.lex], IPredicate.timestamp=1544201488993, ChunkedMaterializationOp.materializeAll=true, PipelineOp.sharedState=true, BOp.bopId=7, BOp.timeout=600000, BOp.namespace=wdq, QueryEngine.queryId=3d016634-f3d7-4320-a3d4-7dea1f1f660f, QueryEngine.chunkHandler=com.bigdata.bop.engine.NativeHeapStandloneChunkHandler@44c109cc]
com.bigdata.bop.solutions.ProjectionOp[6](PipelinedAggregationOp[5])[ BOp.bopId=6, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[count]]
com.bigdata.bop.solutions.PipelinedAggregationOp[5](PipelineJoin[4])[ BOp.bopId=5, BOp.evaluationContext=CONTROLLER, PipelineOp.pipelined=true, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, GroupByOp.groupByState=GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null}, GroupByOp.groupByRewrite=GroupByRewriter{aggExpr={com.bigdata.bop.rdf.aggregate.COUNT(*)=42c39291-b6dd-4410-9c5e-4067e89c2cab},select2=[com.bigdata.bop.Bind(count,42c39291-b6dd-4410-9c5e-4067e89c2cab)],having2=null}, PipelineOp.lastPass=true]
com.bigdata.bop.join.PipelineJoin[4](PipelineJoin[2])[ BOp.bopId=4, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[3](s2=null, osmt=null, v=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=3, AST2BOpBase.estimatedCardinality=6652258130, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s1=null, TermId(119468018U)[https://www.openstreetmap.org/meta/key], osmt=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=1, AST2BOpBase.estimatedCardinality=4661, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
Query Evaluation Statistics

evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio
---|---|---|---|---|---|---|---|---|---
total | total | | | | 338682 | 75290133 | 39846393 | 0 | 0.52923791488056
0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4661 | 13710 | 1 | 4661 | 0 | 4661
1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6652258130 | 237526 | 3500 | 39841732 | 0 | 11383.352
2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null} | | | 87446 | 75286732 | 0 | 0 | 0
3 | ProjectionOp[6] | [count] | | | 0 | 0 | 0 | 0 | N/A
4 | ChunkedMaterializationOp[7] | vars=[count],materializeInlineIVs=true | | | 0 | 0 | 0 | 0 | N/A
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/blazegraph/database/issues/108#issuecomment-445299771, or mute the thread https://github.com/notifications/unsubscribe-auth/ACdv4DQ9QDBePB1wnQV8oismVFI80-VYks5u2qDUgaJpZM4Y_WIs .
In terms of the explain, you can see that the joins are exploding by 4600x and then by 11000x. You have 39M solutions flowing into the pipelined aggregation operator at this point in time. The range estimate for your last triple pattern (?s2 ?osmt ?v) is 6 billion. That range estimate is taken with nothing bound (they are obtained before the query runs and that triple pattern is all variables). I can't tell from this how many distinct values for ?osmt were observed before this hit the wall. If it is the 45000 you are predicting, then I am pretty curious why it is hitting a memory wall unless the JVM configuration is way off.
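The joinRatio column in the stats table is consistent with being unitsOut / unitsIn for each operator; a quick sanity check against the numbers quoted above (1 → 4661 for PipelineJoin[2], 3500 → 39841732 for PipelineJoin[4]):

```python
def join_ratio(units_in, units_out):
    # joinRatio as it appears in the Query Evaluation Statistics table:
    # output solutions per input solution for one operator.
    return units_out / units_in

print(join_ratio(1, 4661))                    # 4661.0  (first triple pattern)
print(round(join_ratio(3500, 39841732), 3))   # 11383.352 (second triple pattern)
```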
Query Evaluation Statistics

evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio
---|---|---|---|---|---|---|---|---|---
total | total | | | | 338,682 | 75,290,133 | 39,846,393 | - | 0.53
0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4,661 | 13,710 | 1 | 4,661 | - | 4,661.00
1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6,652,258,130 | 237,526 | 3,500 | 39,841,732 | - | 11,383.35
2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null} | | | 87,446 | 75,286,732 | | |
3 | ProjectionOp[6] | [count] | | | | | | | -
openjdk version "1.8.0_181", with wikibase customizations, run with these parameters:
java
-server
-XX:+UseG1GC
-Xmx12288m
-Xloggc:/var/log/wdqs/wdqs-blazegraph_jvm_gc.%p-%t.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintAdaptiveSizePolicy
-XX:+PrintReferenceGC
-XX:+PrintGCCause
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintTenuringDistribution
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=20
-XX:+ParallelRefProcEnabled
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=20M
-Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=RWStore.properties
-Dorg.eclipse.jetty.server.Request.maxFormContentSize=200000000
-Dcom.bigdata.rdf.sparql.ast.QueryHints.analytic=true
-Dcom.bigdata.rdf.sparql.ast.QueryHints.analyticMaxMemoryPerQuery=1073741824
-DASTOptimizerClass=org.wikidata.query.rdf.blazegraph.WikibaseOptimizers
-Dorg.wikidata.query.rdf.blazegraph.inline.literal.WKTSerializer.noGlobe=2
-Dcom.bigdata.rdf.sail.webapp.client.RemoteRepository.maxRequestURLLength=7168
-Dcom.bigdata.rdf.sail.sparql.PrefixDeclProcessor.additionalDeclsFile=./prefixes.conf
-Dorg.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceFactory.config=./mwservices.json
-Dcom.bigdata.rdf.sail.webapp.client.HttpClientConfigurator=org.wikidata.query.rdf.blazegraph.ProxiedHttpConnectionFactory
-Dhttp.userAgent=Sophox - OSM Query Service; https://sophox.org/
-Dorg.eclipse.jetty.annotations.AnnotationParser.LEVEL=OFF
-DwikibaseConceptUri=http://wiki.openstreetmap.org
-DwikibaseServiceEnableWhitelist=false
-jar jetty-runner-9.4.12.v20180830.jar
--host 0.0.0.0
--port 9999
--path /bigdata blazegraph-service-0.3.1-SNAPSHOT.war
Exception info:
SPARQL-QUERY: queryStr=SELECT (count(*) as ?count) where {
?s1 osmm:key ?osmt.
?s2 ?osmt ?v.
}
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:293)
at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:679)
at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:290)
at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:271)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:337)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:503)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:890)
at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:696)
at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188)
at info.aduna.iteration.IterationWrapper.hasNext(IterationWrapper.java:68)
at org.openrdf.query.QueryResults.report(QueryResults.java:155)
at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:76)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1713)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1569)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1534)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:747)
... 4 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:59)
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.close(RunningQueryCloseableIterator.java:73)
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.hasNext(RunningQueryCloseableIterator.java:82)
at com.bigdata.striterator.ChunkedWrappedIterator.hasNext(ChunkedWrappedIterator.java:197)
at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134)
... 11 more
Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.util.concurrent.Haltable.get(Haltable.java:273)
at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:1516)
at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:104)
at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:46)
... 15 more
Caused by: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1367)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:926)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:821)
... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1347)
... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:682)
at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:382)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1346)
... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1027)
at com.bigdata.bop.join.PipelineJoin$JoinTask.consumeSource(PipelineJoin.java:739)
at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:623)
... 12 more
Caused by: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1961)
at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.call(PipelineJoin.java:1684)
at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.executeTasks(PipelineJoin.java:1392)
at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1016)
... 14 more
Caused by: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
at com.bigdata.rwstore.sector.MemoryManager.getSectorFromFreeList(MemoryManager.java:646)
at com.bigdata.rwstore.sector.MemoryManager.allocate(MemoryManager.java:675)
at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:195)
at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:169)
at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:159)
at com.bigdata.rwstore.sector.AllocationContext.alloc(AllocationContext.java:359)
at com.bigdata.rwstore.PSOutputStream.save(PSOutputStream.java:335)
at com.bigdata.rwstore.PSOutputStream.getAddr(PSOutputStream.java:416)
at com.bigdata.bop.solutions.SolutionSetStream.put(SolutionSetStream.java:297)
at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:213)
at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:147)
at com.bigdata.bop.engine.StandaloneChunkHandler.handleChunk(StandaloneChunkHandler.java:92)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.outputChunk(ChunkedRunningQuery.java:1699)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.addReorderAllowed(ChunkedRunningQuery.java:1628)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1569)
at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1453)
at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:59)
at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:14)
at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.overflow(AbstractUnsynchronizedArrayBuffer.java:287)
at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add2(AbstractUnsynchronizedArrayBuffer.java:215)
at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add(AbstractUnsynchronizedArrayBuffer.java:173)
at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1868)
... 17 more
Also, I was mistaken: distinct ?osmt is only 4,661, not the 45,000 I initially thought, which makes this issue even stranger.
SELECT (count(distinct ?osmt) as ?count) where {
?s1 osmm:key ?osmt.
}
It is out of native memory; the error is from the memory manager. I do not see an option granting the JVM the ability to use native memory, just an option limiting it to 1G of native memory per query.
@stas: could you make sure that the best practices are captured for this, and for diagnosing an OOM as Java heap, GC overhead limit exceeded, or native heap?
Thanks, Bryan
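Bryan's three-way distinction (Java heap vs. GC overhead vs. native heap) can be checked from inside the JVM with the standard management beans. This is a generic diagnostic sketch, not Blazegraph-specific; the pool names are whatever the running JVM reports:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.List;

public class MemoryDiagnostics {
    public static void main(String[] args) {
        // Java heap usage: exhaustion here surfaces as
        // "java.lang.OutOfMemoryError: Java heap space" (or "GC overhead limit exceeded").
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println("heap used     = " + mem.getHeapMemoryUsage().getUsed());
        System.out.println("non-heap used = " + mem.getNonHeapMemoryUsage().getUsed());

        // Buffer pools backing direct (off-heap) allocations; exhaustion here shows up
        // as native-memory errors rather than a heap OOM.
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.println(pool.getName() + " used = " + pool.getMemoryUsed());
        }
    }
}
```

Watching the "direct" pool while the query runs would show whether the native side is the one filling up.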
There might be very little native memory available by default. That could lead to an OOM on the native heap unless you explicitly allow direct memory to be used by the JVM.
Bryan, thank you for all your help! The Java runtime configuration is set in the runBlazegraph.sh script (https://github.com/wikimedia/wikidata-query-rdf/blob/master/dist/src/script/runBlazegraph.sh), part of the Wikibase RDF repo. I only set the heap memory (-Xmx). CC: @smalyshev
According to point 2 of this article (https://dzone.com/articles/troubleshooting-problems-with-native-off-heap-memo), this is about DirectByteBuffer native memory. Not sure if that's the same thing:
By default, it’s equal to -Xmx. Yes, the JVM heap and off-heap memory are two different memory areas, but by default, they have the same maximum size.
BTW, all this is for Sophox (https://wiki.openstreetmap.org/wiki/Sophox), a pro-bono service for OpenStreetMap to query all of the OSM data, metadata, Wikipedia pageview stats, etc. Accessible via https://sophox.org
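To make the heap vs. off-heap distinction from the quoted article concrete: ByteBuffer.allocateDirect is the allocation path that -XX:MaxDirectMemorySize caps, while ByteBuffer.allocate lives on the Java heap under -Xmx. (Whether Blazegraph's MemoryManager allocates through DirectByteBuffer is an assumption here, suggested by the article.)

```java
import java.nio.ByteBuffer;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // Off-heap: counted against the direct-memory limit
        // (-XX:MaxDirectMemorySize, which defaults to the -Xmx value on HotSpot).
        ByteBuffer direct = ByteBuffer.allocateDirect(1024 * 1024); // 1 MiB off-heap

        // On-heap: counted against the normal Java heap (-Xmx).
        ByteBuffer heap = ByteBuffer.allocate(1024 * 1024); // 1 MiB on-heap

        System.out.println(direct.isDirect()); // true
        System.out.println(heap.isDirect());   // false
    }
}
```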
Use this option to be sure. The limit can be changed using the -XX:MaxDirectMemorySize property, which accepts suffixes like "g" or "G" for gigabytes, etc.
I just re-ran the same query with -XX:MaxDirectMemorySize=40g and got exactly the same error (same stack trace) after about the same ~5 min wait. I even tried 70g, same result. BTW, not sure if this is related: every minute the server is updated with a few MBs of changes. Could that affect the issue?
Did you also raise the max direct memory per query? It was set at 1G as I recall.
With the higher per-query limit (5G), I was able to get a query timeout (5 minutes) instead of an OOM. Yet I was hoping this kind of query would not take this long or require so much memory, despite processing a large amount of data. In theory, for the query below, ?s1 osmm:key ?osmt. produces a list of about 4,500 items. In my non-expert understanding, Blazegraph would then perform 4,500 individual lookups into the predicate index, finding the first and last position of each predicate to compute the number of items per predicate, and add them all together. I would have thought this would be a relatively fast and non-memory-intensive operation... So either my thoughts are naive (likely!), or there is a way to optimize the query or the Blazegraph code...? Thank you for all the help and the very thorough explanations!
SELECT (count(*) as ?count) where {
?s1 osmm:key ?osmt.
?s2 ?osmt ?v.
}
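For what it's worth, the constant-memory counting sketched above (one counter per group, never materializing the rows) looks roughly like this. The keys are made up, and this is only an illustration of the expected algorithm, not Blazegraph's actual PipelinedAggregationOp:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class StreamingGroupCount {
    // Hypothetical stand-in for the ?osmt bindings the query would stream out:
    // each element is just the group key; the matched triples themselves are never kept.
    public static Map<String, Long> countPerGroup(Stream<String> groupKeys) {
        Map<String, Long> counts = new HashMap<>();
        // One counter per distinct key: memory is O(#groups), not O(#rows).
        groupKeys.forEach(key -> counts.merge(key, 1L, Long::sum));
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Long> counts =
                countPerGroup(Stream.of("highway", "name", "highway", "building"));
        System.out.println(counts); // {name=1, building=1, highway=2} (order may vary)
    }
}
```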
It is likely enqueuing a lot of data in memory during the query. You might try looking at the JVM heap and the number of native buffers used (/status and performance counters). Wikidata is set up to prefer native memory over Java heap memory in order to provide tighter controls on the memory used (DoS concerns). That can impose higher overhead during the query.
Michael, are you aware of anything which would hold onto the memory for this query? I seem to recall you have looked at a few possible query scope memory leaks.
Bryan
Thanks Bryan, I will run it a few more times tomorrow. Would there be any security/privacy/stability issues if I allow public access to the /bigdata endpoint (GET only)? That way you could take a look at the counters directly, and perhaps we could write up some docs on perf optimization.
There are docs on perf optimization on the wiki.
https://wiki.blazegraph.com/wiki/index.php/Main_Page
Bryan
Not really, this should run with more or less constant memory consumption -- it should all be pipelined through, with limited size queues in front of the operators. You could look at query explain to understand the dynamics, but it would probably not tell you much about where/whether things are blowing up. Agree with Bryan, performance counters are probably most promising (or a profiler, if possible)...
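The limited-size queues Michael mentions can be illustrated with a toy producer/consumer: a bounded BlockingQueue caps the data in flight no matter how large the input is, which is why a fully pipelined plan should run in roughly constant memory. This is a generic sketch, not Blazegraph's actual chunk-handling code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedPipeline {
    // Push nChunks chunks of chunkSize values through a queue bounded at 4 chunks;
    // returns the total number of values consumed.
    static long run(int nChunks, int chunkSize) throws InterruptedException {
        BlockingQueue<long[]> chunks = new ArrayBlockingQueue<>(4);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < nChunks; i++) {
                    chunks.put(new long[chunkSize]); // blocks while the queue is full
                }
                chunks.put(new long[0]); // empty chunk as end-of-stream marker
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long consumed = 0;
        // At most 4 chunks are ever buffered, regardless of nChunks.
        for (long[] chunk = chunks.take(); chunk.length > 0; chunk = chunks.take()) {
            consumed += chunk.length;
        }
        producer.join();
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100, 1024)); // prints 102400
    }
}
```

If some operator instead buffers its whole input before producing output, this bound is lost, which would match the memory blow-up seen here.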
I got the same com.bigdata.rwstore.sector.MemoryManagerOutOfMemory error when counting about 51 million records.
Using a subquery with a very large limit (larger than the expected count, obviously) lets the query run without this error (I have no timeout set):
PREFIX vrank:<http://purl.org/voc/vrank#>
SELECT (COUNT(?b) as ?c) WHERE {
SELECT ?b {?a vrank:hasRank/vrank:rankValue ?b .}
LIMIT 100000000 #one hundred million
}
Maybe it helps someone.