blazegraph / database

Blazegraph High Performance Graph Database
GNU General Public License v2.0

Optimizing very large counting query with OOM #108

Open nyurik opened 5 years ago

nyurik commented 5 years ago

I am trying to count the number of items per group. There are about 45,000 groups, and the total number of items is in the billions. Ideally, this query should internally build a hash map of counts, allocating under 100,000 integer counters, yet it seems to generate a full list of items for each bucket and, unsurprisingly, runs out of memory. Is there a way to optimize this, or should this be a feature request for Blazegraph?

SELECT ?osmt (count(*) as ?count) where {
  ?s1 osmm:key ?osmt.
  ?s2 ?osmt ?v.
} group by ?osmt
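The hash-map counting described above can be sketched as follows. This is a minimal Java illustration of streaming GROUP BY COUNT, not Blazegraph's actual implementation; the group keys are hypothetical OSM tag names:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class GroupCount {
    // Count items per group while streaming: memory is O(number of groups),
    // not O(number of items), even when the input has billions of rows.
    static Map<String, Long> countPerGroup(Stream<String> groupKeys) {
        Map<String, Long> counts = new HashMap<>();
        groupKeys.forEach(key -> counts.merge(key, 1L, Long::sum));
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical tag names standing in for ?osmt bindings.
        Map<String, Long> counts = countPerGroup(
                Stream.of("highway", "name", "highway", "building", "highway"));
        System.out.println(counts); // e.g. {building=1, name=1, highway=3}
    }
}
```

With ~45,000 groups this holds well under 100,000 map entries regardless of how many items flow through.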
thompsonbry commented 5 years ago

Can you paste in the EXPLAIN of this query? There is a pipelined (streamed) version of COUNT(*). You will see whether or not it is in use in the query plan.

Bryan
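For context: the NanoSparqlServer REST API returns the EXPLAIN page when an explain parameter is appended to a SPARQL request URL. A minimal sketch of building such a URL; the endpoint path is a placeholder for a local installation:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ExplainUrl {
    // Build a request URL that asks the server for the EXPLAIN page
    // (query plan + statistics) instead of the query results.
    static String explainUrl(String endpoint, String sparql)
            throws UnsupportedEncodingException {
        return endpoint + "?explain&query=" + URLEncoder.encode(sparql, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String endpoint = "http://localhost:9999/blazegraph/sparql"; // placeholder
        String query = "SELECT ?osmt (count(*) as ?count) where {"
                + " ?s1 osmm:key ?osmt. ?s2 ?osmt ?v. } group by ?osmt";
        System.out.println(explainUrl(endpoint, query));
    }
}
```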


nyurik commented 5 years ago

Thanks @thompsonbry ! The eval stats (last table) do suggest it is using PipelinedAggregationOp, yet it still fails with OOM on a beefy 128 GB RAM / 12-core / 2 TB SSD machine. Java max RAM is set to 16 GB, just like on Wikidata. All of the info:

QueryContainer
 SelectQuery
  Select
   ProjectionElem
    Var (osmt)
   ProjectionElem
    Count
    Var (count)
  WhereClause
   GraphPatternGroup
    BasicGraphPattern
     TriplesSameSubjectPath
      Var (s1)
      PropertyListPath
       PathAlternative
        PathSequence
         PathElt
          IRI (https://www.openstreetmap.org/meta/key)
       ObjectList
        Var (osmt)
     TriplesSameSubjectPath
      Var (s2)
      PropertyListPath
       Var (osmt)
       ObjectList
        Var (v)
  GroupClause
   GroupCondition
    Var (osmt)

AST:

QueryType: SELECT
includeInferred=true
timeout=600000
SELECT VarNode(osmt) ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) )
  JoinGroupNode {
    StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS]
    StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS]
  }
group by VarNode(osmt)

Optimized AST:

QueryType: SELECT
includeInferred=true
timeout=600000
SELECT ( VarNode(osmt) AS VarNode(osmt) ) ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) )
  JoinGroupNode {
    StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS]
      AST2BOpBase.estimatedCardinality=4661
      AST2BOpBase.originalIndex=POS
    StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS]
      AST2BOpBase.estimatedCardinality=6650329986
      AST2BOpBase.originalIndex=SPO
  }
group by ( VarNode(osmt) AS VarNode(osmt) )

Query plan

com.bigdata.bop.rdf.join.ChunkedMaterializationOp[7](ProjectionOp[6])[ ChunkedMaterializationOp.vars=[osmt, count], IPredicate.relationName=[wdq.lex], IPredicate.timestamp=1544121261754, ChunkedMaterializationOp.materializeAll=true, PipelineOp.sharedState=true, BOp.bopId=7, BOp.timeout=600000, BOp.namespace=wdq, QueryEngine.queryId=865d39fb-f890-4baa-a50f-3ef7f7a67b76, QueryEngine.chunkHandler=com.bigdata.bop.engine.NativeHeapStandloneChunkHandler@44c109cc]
  com.bigdata.bop.solutions.ProjectionOp[6](PipelinedAggregationOp[5])[ BOp.bopId=6, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[osmt, count]]
    com.bigdata.bop.solutions.PipelinedAggregationOp[5](PipelineJoin[4])[ BOp.bopId=5, BOp.evaluationContext=CONTROLLER, PipelineOp.pipelined=true, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, GroupByOp.groupByState=GroupByState{select=[com.bigdata.bop.Bind(osmt,osmt), com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=[com.bigdata.bop.Bind(osmt,osmt)],having=null}, GroupByOp.groupByRewrite=GroupByRewriter{aggExpr={com.bigdata.bop.rdf.aggregate.COUNT(*)=d87d5f2c-7b32-4b51-b24d-75a76dd0d25f},select2=[com.bigdata.bop.Bind(osmt,osmt), com.bigdata.bop.Bind(count,d87d5f2c-7b32-4b51-b24d-75a76dd0d25f)],having2=null}, PipelineOp.lastPass=true]
      com.bigdata.bop.join.PipelineJoin[4](PipelineJoin[2])[ BOp.bopId=4, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[3](s2=null, osmt=null, v=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544121261754, BOp.bopId=3, AST2BOpBase.estimatedCardinality=6650329986, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
        com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s1=null, TermId(119468018U)[https://www.openstreetmap.org/meta/key], osmt=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544121261754, BOp.bopId=1, AST2BOpBase.estimatedCardinality=4661, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]

Eval stats:

evalOrder bopSummary predSummary nvars fastRangeCount sumMillis unitsIn unitsOut typeErrors joinRatio
total total 49853 9383526 4871826 0 0.519189268511645
0 PipelineJoin[2] SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) 2 4661 0 0 0 0 N/A
1 PipelineJoin[4] SPOPredicate[3](?s2, ?osmt, ?v) 3 6650329986 36600 300 4871826 0 16239.42
2 PipelinedAggregationOp[5] GroupByState{select=[com.bigdata.bop.Bind(osmt,osmt), com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=[com.bigdata.bop.Bind(osmt,osmt)],having=null} 13253 9383526 0 0 0
3 ProjectionOp[6] [osmt, count] 0 0 0 0 N/A
4 ChunkedMaterializationOp[7] vars=[osmt, count],materializeInlineIVs=true 0 0 0 0 N/A

Exception

SPARQL-QUERY: queryStr=SELECT ?osmt (count(*) as ?count) where {
 ?s1 osmm:key ?osmt.
 ?s2 ?osmt ?v.
} group by ?osmt
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:293)
    at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:679)
    at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:290)
    at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
    at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:271)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
    at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:337)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
    at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
    at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.Server.handle(Server.java:503)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:890)
    at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:696)
    at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Caused by: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188)
    at info.aduna.iteration.IterationWrapper.hasNext(IterationWrapper.java:68)
    at org.openrdf.query.QueryResults.report(QueryResults.java:155)
    at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:76)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1713)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1569)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1534)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:747)
    ... 4 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:59)
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.close(RunningQueryCloseableIterator.java:73)
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.hasNext(RunningQueryCloseableIterator.java:82)
    at com.bigdata.striterator.ChunkedWrappedIterator.hasNext(ChunkedWrappedIterator.java:197)
    at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134)
    ... 11 more
Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.util.concurrent.Haltable.get(Haltable.java:273)
    at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:1516)
    at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:104)
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:46)
    ... 15 more
Caused by: java.lang.Exception: task=ChunkTask{query=487ad640-9dd6-4dab-86e5-1756ad841e55,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1367)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:926)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:821)
    ... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1347)
    ... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:682)
    at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:382)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1346)
    ... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1027)
    at com.bigdata.bop.join.PipelineJoin$JoinTask.consumeSource(PipelineJoin.java:739)
    at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:623)
    ... 12 more
Caused by: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1961)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.call(PipelineJoin.java:1684)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.executeTasks(PipelineJoin.java:1392)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1016)
    ... 14 more
Caused by: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.rwstore.sector.MemoryManager.getSectorFromFreeList(MemoryManager.java:646)
    at com.bigdata.rwstore.sector.MemoryManager.allocate(MemoryManager.java:675)
    at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:195)
    at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:169)
    at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:159)
    at com.bigdata.rwstore.sector.AllocationContext.alloc(AllocationContext.java:359)
    at com.bigdata.rwstore.PSOutputStream.save(PSOutputStream.java:335)
    at com.bigdata.rwstore.PSOutputStream.getAddr(PSOutputStream.java:416)
    at com.bigdata.bop.solutions.SolutionSetStream.put(SolutionSetStream.java:297)
    at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:213)
    at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:147)
    at com.bigdata.bop.engine.StandaloneChunkHandler.handleChunk(StandaloneChunkHandler.java:92)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.outputChunk(ChunkedRunningQuery.java:1699)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.addReorderAllowed(ChunkedRunningQuery.java:1628)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1569)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1453)
    at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:59)
    at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:14)
    at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.overflow(AbstractUnsynchronizedArrayBuffer.java:287)
    at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add2(AbstractUnsynchronizedArrayBuffer.java:215)
    at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add(AbstractUnsynchronizedArrayBuffer.java:173)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1868)
    ... 17 more
nyurik commented 5 years ago

Update: I realized that even a non-grouping count also runs out of memory:

SELECT (count(*) as ?count) where {
  ?s1 osmm:key ?osmt.
  ?s2 ?osmt ?v.
}
Parse Tree
QueryContainer
 SelectQuery
  Select
   ProjectionElem
    Count
    Var (count)
  WhereClause
   GraphPatternGroup
    BasicGraphPattern
     TriplesSameSubjectPath
      Var (s1)
      PropertyListPath
       PathAlternative
        PathSequence
         PathElt
          IRI (https://www.openstreetmap.org/meta/key)
       ObjectList
        Var (osmt)
     TriplesSameSubjectPath
      Var (s2)
      PropertyListPath
       Var (osmt)
       ObjectList
        Var (v)

Original AST
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX sesame: <http://www.openrdf.org/schema/sesame#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX fn: <http://www.w3.org/2005/xpath-functions#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX hint: <http://www.bigdata.com/queryHints#> PREFIX bd: <http://www.bigdata.com/rdf#> PREFIX bds: <http://www.bigdata.com/rdf/search#> PREFIX osmroot: <https://www.openstreetmap.org> PREFIX osmnode: <https://www.openstreetmap.org/node/> PREFIX osmway: <https://www.openstreetmap.org/way/> PREFIX osmrel: <https://www.openstreetmap.org/relation/> PREFIX osmm: <https://www.openstreetmap.org/meta/> PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:> PREFIX pageviews: <https://dumps.wikimedia.org/other/pageviews/> PREFIX osmd: <http://wiki.openstreetmap.org/entity/> PREFIX osmdt: <http://wiki.openstreetmap.org/prop/direct/> PREFIX osmds: <http://wiki.openstreetmap.org/entity/statement/> PREFIX osmp: <http://wiki.openstreetmap.org/prop/> PREFIX osmdref: <http://wiki.openstreetmap.org/reference/> PREFIX osmdv: <http://wiki.openstreetmap.org/value/> PREFIX osmps: <http://wiki.openstreetmap.org/prop/statement/> PREFIX osmpsv: <http://wiki.openstreetmap.org/prop/statement/value/> PREFIX osmpsn: <http://wiki.openstreetmap.org/prop/statement/value-normalized/> PREFIX osmpq: <http://wiki.openstreetmap.org/prop/qualifier/> PREFIX osmpqv: <http://wiki.openstreetmap.org/prop/qualifier/value/> PREFIX osmpqn: <http://wiki.openstreetmap.org/prop/qualifier/value-normalized/> PREFIX osmpr: <http://wiki.openstreetmap.org/prop/reference/> PREFIX osmprv: <http://wiki.openstreetmap.org/prop/reference/value/> PREFIX osmprn: <http://wiki.openstreetmap.org/prop/reference/value-normalized/> PREFIX osmdno: <http://wiki.openstreetmap.org/prop/novalue/> PREFIX osmdata: 
<http://wiki.openstreetmap.org/wiki/Special:EntityData/> PREFIX wd: <http://www.wikidata.org/entity/> PREFIX wdt: <http://www.wikidata.org/prop/direct/> PREFIX wds: <http://www.wikidata.org/entity/statement/> PREFIX p: <http://www.wikidata.org/prop/> PREFIX wdref: <http://www.wikidata.org/reference/> PREFIX wdv: <http://www.wikidata.org/value/> PREFIX ps: <http://www.wikidata.org/prop/statement/> PREFIX psv: <http://www.wikidata.org/prop/statement/value/> PREFIX psn: <http://www.wikidata.org/prop/statement/value-normalized/> PREFIX pq: <http://www.wikidata.org/prop/qualifier/> PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/> PREFIX pqn: <http://www.wikidata.org/prop/qualifier/value-normalized/> PREFIX pr: <http://www.wikidata.org/prop/reference/> PREFIX prv: <http://www.wikidata.org/prop/reference/value/> PREFIX prn: <http://www.wikidata.org/prop/reference/value-normalized/> PREFIX wdno: <http://www.wikidata.org/prop/novalue/> PREFIX wdata: <http://www.wikidata.org/wiki/Special:EntityData/> PREFIX wdtn: <http://wiki.openstreetmap.org/prop/direct-normalized/> PREFIX wikibase: <http://wikiba.se/ontology#> PREFIX schema: <http://schema.org/> PREFIX prov: <http://www.w3.org/ns/prov#> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX geo: <http://www.opengis.net/ont/geosparql#> PREFIX geof: <http://www.opengis.net/def/geosparql/function/> PREFIX mediawiki: <https://www.mediawiki.org/ontology#> PREFIX mwapi: <https://www.mediawiki.org/ontology#API/> PREFIX gas: <http://www.bigdata.com/rdf/gas#> PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#> PREFIX dct: <http://purl.org/dc/terms/> QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), 
ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] }

Optimized AST
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX sesame: <http://www.openrdf.org/schema/sesame#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX fn: <http://www.w3.org/2005/xpath-functions#> PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX hint: <http://www.bigdata.com/queryHints#> PREFIX bd: <http://www.bigdata.com/rdf#> PREFIX bds: <http://www.bigdata.com/rdf/search#> PREFIX osmroot: <https://www.openstreetmap.org> PREFIX osmnode: <https://www.openstreetmap.org/node/> PREFIX osmway: <https://www.openstreetmap.org/way/> PREFIX osmrel: <https://www.openstreetmap.org/relation/> PREFIX osmm: <https://www.openstreetmap.org/meta/> PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:> PREFIX pageviews: <https://dumps.wikimedia.org/other/pageviews/> PREFIX osmd: <http://wiki.openstreetmap.org/entity/> PREFIX osmdt: <http://wiki.openstreetmap.org/prop/direct/> PREFIX osmds: <http://wiki.openstreetmap.org/entity/statement/> PREFIX osmp: <http://wiki.openstreetmap.org/prop/> PREFIX osmdref: <http://wiki.openstreetmap.org/reference/> PREFIX osmdv: <http://wiki.openstreetmap.org/value/> PREFIX osmps: <http://wiki.openstreetmap.org/prop/statement/> PREFIX osmpsv: <http://wiki.openstreetmap.org/prop/statement/value/> PREFIX osmpsn: <http://wiki.openstreetmap.org/prop/statement/value-normalized/> PREFIX osmpq: <http://wiki.openstreetmap.org/prop/qualifier/> PREFIX osmpqv: <http://wiki.openstreetmap.org/prop/qualifier/value/> PREFIX osmpqn: <http://wiki.openstreetmap.org/prop/qualifier/value-normalized/> PREFIX osmpr: <http://wiki.openstreetmap.org/prop/reference/> PREFIX osmprv: <http://wiki.openstreetmap.org/prop/reference/value/> PREFIX osmprn: <http://wiki.openstreetmap.org/prop/reference/value-normalized/> PREFIX osmdno: <http://wiki.openstreetmap.org/prop/novalue/> PREFIX osmdata: 
<http://wiki.openstreetmap.org/wiki/Special:EntityData/> PREFIX wd: <http://www.wikidata.org/entity/> PREFIX wdt: <http://www.wikidata.org/prop/direct/> PREFIX wds: <http://www.wikidata.org/entity/statement/> PREFIX p: <http://www.wikidata.org/prop/> PREFIX wdref: <http://www.wikidata.org/reference/> PREFIX wdv: <http://www.wikidata.org/value/> PREFIX ps: <http://www.wikidata.org/prop/statement/> PREFIX psv: <http://www.wikidata.org/prop/statement/value/> PREFIX psn: <http://www.wikidata.org/prop/statement/value-normalized/> PREFIX pq: <http://www.wikidata.org/prop/qualifier/> PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/> PREFIX pqn: <http://www.wikidata.org/prop/qualifier/value-normalized/> PREFIX pr: <http://www.wikidata.org/prop/reference/> PREFIX prv: <http://www.wikidata.org/prop/reference/value/> PREFIX prn: <http://www.wikidata.org/prop/reference/value-normalized/> PREFIX wdno: <http://www.wikidata.org/prop/novalue/> PREFIX wdata: <http://www.wikidata.org/wiki/Special:EntityData/> PREFIX wdtn: <http://wiki.openstreetmap.org/prop/direct-normalized/> PREFIX wikibase: <http://wikiba.se/ontology#> PREFIX schema: <http://schema.org/> PREFIX prov: <http://www.w3.org/ns/prov#> PREFIX skos: <http://www.w3.org/2004/02/skos/core#> PREFIX geo: <http://www.opengis.net/ont/geosparql#> PREFIX geof: <http://www.opengis.net/def/geosparql/function/> PREFIX mediawiki: <https://www.mediawiki.org/ontology#> PREFIX mwapi: <https://www.mediawiki.org/ontology#API/> PREFIX gas: <http://www.bigdata.com/rdf/gas#> PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#> PREFIX dct: <http://purl.org/dc/terms/> QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode(*))[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT(*)] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), 
ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=4661 AST2BOpBase.originalIndex=POS StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=6652258130 AST2BOpBase.originalIndex=SPO }

Query Plan
com.bigdata.bop.rdf.join.ChunkedMaterializationOp[7](ProjectionOp[6])[ ChunkedMaterializationOp.vars=[count], IPredicate.relationName=[wdq.lex], IPredicate.timestamp=1544201488993, ChunkedMaterializationOp.materializeAll=true, PipelineOp.sharedState=true, BOp.bopId=7, BOp.timeout=600000, BOp.namespace=wdq, QueryEngine.queryId=3d016634-f3d7-4320-a3d4-7dea1f1f660f, QueryEngine.chunkHandler=com.bigdata.bop.engine.NativeHeapStandloneChunkHandler@44c109cc]
  com.bigdata.bop.solutions.ProjectionOp[6](PipelinedAggregationOp[5])[ BOp.bopId=6, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[count]]
    com.bigdata.bop.solutions.PipelinedAggregationOp[5](PipelineJoin[4])[ BOp.bopId=5, BOp.evaluationContext=CONTROLLER, PipelineOp.pipelined=true, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, GroupByOp.groupByState=GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null}, GroupByOp.groupByRewrite=GroupByRewriter{aggExpr={com.bigdata.bop.rdf.aggregate.COUNT(*)=42c39291-b6dd-4410-9c5e-4067e89c2cab},select2=[com.bigdata.bop.Bind(count,42c39291-b6dd-4410-9c5e-4067e89c2cab)],having2=null}, PipelineOp.lastPass=true]
      com.bigdata.bop.join.PipelineJoin[4](PipelineJoin[2])[ BOp.bopId=4, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[3](s2=null, osmt=null, v=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=3, AST2BOpBase.estimatedCardinality=6652258130, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]
        com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s1=null, TermId(119468018U)[https://www.openstreetmap.org/meta/key], osmt=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=1, AST2BOpBase.estimatedCardinality=4661, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]

Query Evaluation Statistics

| evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio |
|---|---|---|---|---|---|---|---|---|---|
| total | total | | | | 338682 | 75290133 | 39846393 | 0 | 0.52923791488056 |
| 0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4661 | 13710 | 1 | 4661 | 0 | 4661 |
| 1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6652258130 | 237526 | 3500 | 39841732 | 0 | 11383.352 |
| 2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null} | | | 87446 | 75286732 | 0 | 0 | 0 |
| 3 | ProjectionOp[6] | [count] | | | 0 | 0 | 0 | 0 | N/A |
| 4 | ChunkedMaterializationOp[7] | vars=[count],materializeInlineIVs=true | | | 0 | 0 | 0 | 0 | N/A |
thompsonbry commented 5 years ago

What is the Java heap? Direct memory assigned?

What is the error that you receive with these queries? Is it the same message? Please include the stack traces.

Blazegraph uses both the Java managed object heap and native heap. However, the amounts of the heap available to Blazegraph are determined by how you start the JVM. Thus, you could be running out of either the native heap or the JVM managed heap, depending on the query. Blazegraph also has an analytic mode and a non-analytic mode. These differ in whether or not the underlying operators target the native heap (analytic) or the managed heap (otherwise).
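For reference, a minimal sketch of the relevant startup flags (based on the properties shown later in this thread; property names and defaults should be checked against the Blazegraph wiki). Analytic mode can be enabled globally via a system property, and the native-heap budget available to any single analytic query is capped separately:

```shell
# Enable analytic query mode globally: query operators spill to the
# native (direct) heap instead of the JVM managed heap.
JVM_OPTS="-Dcom.bigdata.rdf.sparql.ast.QueryHints.analytic=true"
# Cap the native heap available to a single analytic query, in bytes
# (1 GiB here) -- an analytic query exceeding this cap fails rather
# than exhausting the native heap.
JVM_OPTS="$JVM_OPTS -Dcom.bigdata.rdf.sparql.ast.QueryHints.analyticMaxMemoryPerQuery=1073741824"
java $JVM_OPTS -jar blazegraph.jar
```

Analytic mode can also be requested for an individual query with the `hint:analytic` query hint rather than globally.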

This is the code which should be handling the query plan above.

https://github.com/blazegraph/database/blob/master/bigdata-core/bigdata/src/java/com/bigdata/bop/solutions/PipelinedAggregationOp.java
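Conceptually, a pipelined COUNT aggregation only needs state proportional to the number of distinct groups, not to the number of solutions. A purely illustrative sketch in Python (this is not the Blazegraph implementation):

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

def pipelined_group_count(solutions: Iterable[Mapping[str, str]],
                          group_var: str) -> Dict[str, int]:
    """Stream solutions once, keeping one integer counter per group.

    Memory is O(number of distinct groups), independent of how many
    solutions flow through the pipeline.
    """
    counts: Dict[str, int] = defaultdict(int)
    for solution in solutions:
        counts[solution[group_var]] += 1
    return dict(counts)

# With ~45,000 distinct ?osmt values, this holds ~45,000 counters even
# if billions of solutions stream through.
solutions = [{"osmt": "highway"}, {"osmt": "name"}, {"osmt": "highway"}]
print(pipelined_group_count(solutions, "osmt"))  # {'highway': 2, 'name': 1}
```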

Bryan

On Fri, Dec 7, 2018 at 9:09 AM Yuri Astrakhan notifications@github.com wrote:

Update: I realized that even a non-grouping count also runs out of memory:

SELECT (count(*) as ?count) where { ?s1 osmm:key ?osmt. ?s2 ?osmt ?v. }

Parse Tree QueryContainer SelectQuery Select ProjectionElem Count Var (count) WhereClause GraphPatternGroup BasicGraphPattern TriplesSameSubjectPath Var (s1) PropertyListPath PathAlternative PathSequence PathElt IRI (https://www.openstreetmap.org/meta/key) ObjectList Var (osmt) TriplesSameSubjectPath Var (s2) PropertyListPath Var (osmt) ObjectList Var (v)

Original AST PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema# PREFIX sesame: http://www.openrdf.org/schema/sesame# PREFIX owl: http://www.w3.org/2002/07/owl# PREFIX xsd: http://www.w3.org/2001/XMLSchema# PREFIX fn: http://www.w3.org/2005/xpath-functions# PREFIX foaf: http://xmlns.com/foaf/0.1/ PREFIX dc: http://purl.org/dc/elements/1.1/ PREFIX hint: http://www.bigdata.com/queryHints# PREFIX bd: http://www.bigdata.com/rdf# PREFIX bds: http://www.bigdata.com/rdf/search# PREFIX osmroot: https://www.openstreetmap.org PREFIX osmnode: https://www.openstreetmap.org/node/ PREFIX osmway: https://www.openstreetmap.org/way/ PREFIX osmrel: https://www.openstreetmap.org/relation/ PREFIX osmm: https://www.openstreetmap.org/meta/ PREFIX osmt: https://wiki.openstreetmap.org/wiki/Key: PREFIX pageviews: https://dumps.wikimedia.org/other/pageviews/ PREFIX osmd: http://wiki.openstreetmap.org/entity/ PREFIX osmdt: http://wiki.openstreetmap.org/prop/direct/ PREFIX osmds: http://wiki.openstreetmap.org/entity/statement/ PREFIX osmp: http://wiki.openstreetmap.org/prop/ PREFIX osmdref: http://wiki.openstreetmap.org/reference/ PREFIX osmdv: http://wiki.openstreetmap.org/value/ PREFIX osmps: http://wiki.openstreetmap.org/prop/statement/ PREFIX osmpsv: http://wiki.openstreetmap.org/prop/statement/value/ PREFIX osmpsn: http://wiki.openstreetmap.org/prop/statement/value-normalized/ PREFIX osmpq: http://wiki.openstreetmap.org/prop/qualifier/ PREFIX osmpqv: http://wiki.openstreetmap.org/prop/qualifier/value/ PREFIX osmpqn: http://wiki.openstreetmap.org/prop/qualifier/value-normalized/ PREFIX osmpr: http://wiki.openstreetmap.org/prop/reference/ PREFIX osmprv: http://wiki.openstreetmap.org/prop/reference/value/ PREFIX osmprn: http://wiki.openstreetmap.org/prop/reference/value-normalized/ PREFIX osmdno: http://wiki.openstreetmap.org/prop/novalue/ PREFIX osmdata: http://wiki.openstreetmap.org/wiki/Special:EntityData/ PREFIX wd: 
http://www.wikidata.org/entity/ PREFIX wdt: http://www.wikidata.org/prop/direct/ PREFIX wds: http://www.wikidata.org/entity/statement/ PREFIX p: http://www.wikidata.org/prop/ PREFIX wdref: http://www.wikidata.org/reference/ PREFIX wdv: http://www.wikidata.org/value/ PREFIX ps: http://www.wikidata.org/prop/statement/ PREFIX psv: http://www.wikidata.org/prop/statement/value/ PREFIX psn: http://www.wikidata.org/prop/statement/value-normalized/ PREFIX pq: http://www.wikidata.org/prop/qualifier/ PREFIX pqv: http://www.wikidata.org/prop/qualifier/value/ PREFIX pqn: http://www.wikidata.org/prop/qualifier/value-normalized/ PREFIX pr: http://www.wikidata.org/prop/reference/ PREFIX prv: http://www.wikidata.org/prop/reference/value/ PREFIX prn: http://www.wikidata.org/prop/reference/value-normalized/ PREFIX wdno: http://www.wikidata.org/prop/novalue/ PREFIX wdata: http://www.wikidata.org/wiki/Special:EntityData/ PREFIX wdtn: http://wiki.openstreetmap.org/prop/direct-normalized/ PREFIX wikibase: http://wikiba.se/ontology# PREFIX schema: http://schema.org/ PREFIX prov: http://www.w3.org/ns/prov# PREFIX skos: http://www.w3.org/2004/02/skos/core# PREFIX geo: http://www.opengis.net/ont/geosparql# PREFIX geof: http://www.opengis.net/def/geosparql/function/ PREFIX mediawiki: https://www.mediawiki.org/ontology# PREFIX mwapi: https://www.mediawiki.org/ontology#API/ PREFIX gas: http://www.bigdata.com/rdf/gas# PREFIX ontolex: http://www.w3.org/ns/lemon/ontolex# PREFIX dct: http://purl.org/dc/terms/ QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode())[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT()] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] StatementPatternNode(VarNode(s2), 
VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] }

Optimized AST PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema# PREFIX sesame: http://www.openrdf.org/schema/sesame# PREFIX owl: http://www.w3.org/2002/07/owl# PREFIX xsd: http://www.w3.org/2001/XMLSchema# PREFIX fn: http://www.w3.org/2005/xpath-functions# PREFIX foaf: http://xmlns.com/foaf/0.1/ PREFIX dc: http://purl.org/dc/elements/1.1/ PREFIX hint: http://www.bigdata.com/queryHints# PREFIX bd: http://www.bigdata.com/rdf# PREFIX bds: http://www.bigdata.com/rdf/search# PREFIX osmroot: https://www.openstreetmap.org PREFIX osmnode: https://www.openstreetmap.org/node/ PREFIX osmway: https://www.openstreetmap.org/way/ PREFIX osmrel: https://www.openstreetmap.org/relation/ PREFIX osmm: https://www.openstreetmap.org/meta/ PREFIX osmt: https://wiki.openstreetmap.org/wiki/Key: PREFIX pageviews: https://dumps.wikimedia.org/other/pageviews/ PREFIX osmd: http://wiki.openstreetmap.org/entity/ PREFIX osmdt: http://wiki.openstreetmap.org/prop/direct/ PREFIX osmds: http://wiki.openstreetmap.org/entity/statement/ PREFIX osmp: http://wiki.openstreetmap.org/prop/ PREFIX osmdref: http://wiki.openstreetmap.org/reference/ PREFIX osmdv: http://wiki.openstreetmap.org/value/ PREFIX osmps: http://wiki.openstreetmap.org/prop/statement/ PREFIX osmpsv: http://wiki.openstreetmap.org/prop/statement/value/ PREFIX osmpsn: http://wiki.openstreetmap.org/prop/statement/value-normalized/ PREFIX osmpq: http://wiki.openstreetmap.org/prop/qualifier/ PREFIX osmpqv: http://wiki.openstreetmap.org/prop/qualifier/value/ PREFIX osmpqn: http://wiki.openstreetmap.org/prop/qualifier/value-normalized/ PREFIX osmpr: http://wiki.openstreetmap.org/prop/reference/ PREFIX osmprv: http://wiki.openstreetmap.org/prop/reference/value/ PREFIX osmprn: http://wiki.openstreetmap.org/prop/reference/value-normalized/ PREFIX osmdno: http://wiki.openstreetmap.org/prop/novalue/ PREFIX osmdata: http://wiki.openstreetmap.org/wiki/Special:EntityData/ PREFIX wd: 
http://www.wikidata.org/entity/ PREFIX wdt: http://www.wikidata.org/prop/direct/ PREFIX wds: http://www.wikidata.org/entity/statement/ PREFIX p: http://www.wikidata.org/prop/ PREFIX wdref: http://www.wikidata.org/reference/ PREFIX wdv: http://www.wikidata.org/value/ PREFIX ps: http://www.wikidata.org/prop/statement/ PREFIX psv: http://www.wikidata.org/prop/statement/value/ PREFIX psn: http://www.wikidata.org/prop/statement/value-normalized/ PREFIX pq: http://www.wikidata.org/prop/qualifier/ PREFIX pqv: http://www.wikidata.org/prop/qualifier/value/ PREFIX pqn: http://www.wikidata.org/prop/qualifier/value-normalized/ PREFIX pr: http://www.wikidata.org/prop/reference/ PREFIX prv: http://www.wikidata.org/prop/reference/value/ PREFIX prn: http://www.wikidata.org/prop/reference/value-normalized/ PREFIX wdno: http://www.wikidata.org/prop/novalue/ PREFIX wdata: http://www.wikidata.org/wiki/Special:EntityData/ PREFIX wdtn: http://wiki.openstreetmap.org/prop/direct-normalized/ PREFIX wikibase: http://wikiba.se/ontology# PREFIX schema: http://schema.org/ PREFIX prov: http://www.w3.org/ns/prov# PREFIX skos: http://www.w3.org/2004/02/skos/core# PREFIX geo: http://www.opengis.net/ont/geosparql# PREFIX geof: http://www.opengis.net/def/geosparql/function/ PREFIX mediawiki: https://www.mediawiki.org/ontology# PREFIX mwapi: https://www.mediawiki.org/ontology#API/ PREFIX gas: http://www.bigdata.com/rdf/gas# PREFIX ontolex: http://www.w3.org/ns/lemon/ontolex# PREFIX dct: http://purl.org/dc/terms/ QueryType: SELECT includeInferred=true timeout=600000 SELECT ( com.bigdata.rdf.sparql.ast.FunctionNode(VarNode())[ FunctionNode.scalarVals=null, FunctionNode.functionURI=http://www.w3.org/2006/sparql-functions#count, valueExpr=com.bigdata.bop.rdf.aggregate.COUNT()] AS VarNode(count) ) JoinGroupNode { StatementPatternNode(VarNode(s1), ConstantNode(TermId(119468018U)[https://www.openstreetmap.org/meta/key]), VarNode(osmt)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=4661 
AST2BOpBase.originalIndex=POS StatementPatternNode(VarNode(s2), VarNode(osmt), VarNode(v)) [scope=DEFAULT_CONTEXTS] AST2BOpBase.estimatedCardinality=6652258130 AST2BOpBase.originalIndex=SPO }

Query Plan com.bigdata.bop.rdf.join.ChunkedMaterializationOp7[ ChunkedMaterializationOp.vars=[count], IPredicate.relationName=[wdq.lex], IPredicate.timestamp=1544201488993, ChunkedMaterializationOp.materializeAll=true, PipelineOp.sharedState=true, BOp.bopId=7, BOp.timeout=600000, BOp.namespace=wdq, QueryEngine.queryId=3d016634-f3d7-4320-a3d4-7dea1f1f660f, QueryEngine.chunkHandler=com.bigdata.bop.engine.NativeHeapStandloneChunkHandler@44c109cc] com.bigdata.bop.solutions.ProjectionOp6[ BOp.bopId=6, BOp.evaluationContext=CONTROLLER, PipelineOp.sharedState=true, JoinAnnotations.select=[count]] com.bigdata.bop.solutions.PipelinedAggregationOp5[ BOp.bopId=5, BOp.evaluationContext=CONTROLLER, PipelineOp.pipelined=true, PipelineOp.maxParallel=1, PipelineOp.sharedState=true, GroupByOp.groupByState=GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT())],groupBy=null,having=null}, GroupByOp.groupByRewrite=GroupByRewriter{aggExpr={com.bigdata.bop.rdf.aggregate.COUNT()=42c39291-b6dd-4410-9c5e-4067e89c2cab},select2=[com.bigdata.bop.Bind(count,42c39291-b6dd-4410-9c5e-4067e89c2cab)],having2=null}, PipelineOp.lastPass=true] com.bigdata.bop.join.PipelineJoin4[ BOp.bopId=4, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[3](s2=null, osmt=null, v=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=3, AST2BOpBase.estimatedCardinality=6652258130, AST2BOpBase.originalIndex=SPO, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]] com.bigdata.bop.join.PipelineJoin[2]()[ BOp.bopId=2, JoinAnnotations.constraints=null, AST2BOpBase.simpleJoin=true, BOp.evaluationContext=ANY, AccessPathJoinAnnotations.predicate=com.bigdata.rdf.spo.SPOPredicate[1](s1=null, TermId(119468018U)[https://www.openstreetmap.org/meta/key], osmt=null)[ IPredicate.relationName=[wdq.spo], IPredicate.timestamp=1544201488993, BOp.bopId=1, 
AST2BOpBase.estimatedCardinality=4661, AST2BOpBase.originalIndex=POS, IPredicate.flags=[KEYS,VALS,READONLY,PARALLEL]]]

Query Evaluation Statistics

| evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio |
|---|---|---|---|---|---|---|---|---|---|
| total | total | | | | 338682 | 75290133 | 39846393 | 0 | 0.52923791488056 |
| 0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4661 | 13710 | 1 | 4661 | 0 | 4661 |
| 1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6652258130 | 237526 | 3500 | 39841732 | 0 | 11383.352 |
| 2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null} | | | 87446 | 75286732 | 0 | 0 | 0 |
| 3 | ProjectionOp[6] | [count] | | | 0 | 0 | 0 | 0 | N/A |
| 4 | ChunkedMaterializationOp[7] | vars=[count],materializeInlineIVs=true | | | 0 | 0 | 0 | 0 | N/A |

thompsonbry commented 5 years ago

In terms of the explain, you can see that the joins are exploding by 4600x and then by 11000x. You have 39M solutions flowing into the pipelined aggregation operator at this point in time. The range estimate for your last triple pattern (?s2 ?osmt ?v) is 6 billion. That range estimate is taken with nothing bound (they are obtained before the query runs and that triple pattern is all variables). I can't tell from this how many distinct values for ?osmt were observed before this hit the wall. If it is the 45000 you are predicting, then I am pretty curious why it is hitting a memory wall unless the JVM configuration is way off.
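To make those figures concrete (arithmetic only, using the numbers from the statistics table): the second join had emitted 39,841,732 solutions for the 3,500 inputs it had consumed when the query died, and projecting that fan-out over all 4,661 ?osmt bindings suggests roughly 53 million solutions would ultimately reach the aggregation operator:

```python
# Figures from the Query Evaluation Statistics table above.
inputs_consumed = 3_500       # unitsIn for PipelineJoin[4] at the time of the OOM
outputs_emitted = 39_841_732  # unitsOut for PipelineJoin[4]
total_inputs = 4_661          # bindings produced by PipelineJoin[2]

fanout = outputs_emitted / inputs_consumed
print(round(fanout, 3))       # 11383.352, matching the reported joinRatio

# Projected output if all 4,661 bindings pass through the second join.
projected_total = total_inputs * fanout
print(f"{projected_total:,.0f}")  # roughly 53 million solutions
```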

Query Evaluation Statistics

| evalOrder | bopSummary | predSummary | nvars | fastRangeCount | sumMillis | unitsIn | unitsOut | typeErrors | joinRatio |
|---|---|---|---|---|---|---|---|---|---|
| total | total | | | | 338,682 | 75,290,133 | 39,846,393 | - | 0.53 |
| 0 | PipelineJoin[2] | SPOPredicate[1](?s1, TermId(119468018U)[https://www.openstreetmap.org/meta/key], ?osmt) | 2 | 4,661 | 13,710 | 1 | 4,661 | - | 4,661.00 |
| 1 | PipelineJoin[4] | SPOPredicate[3](?s2, ?osmt, ?v) | 3 | 6,652,258,130 | 237,526 | 3,500 | 39,841,732 | - | 11,383.35 |
| 2 | PipelinedAggregationOp[5] | GroupByState{select=[com.bigdata.bop.Bind(count,com.bigdata.bop.rdf.aggregate.COUNT(*))],groupBy=null,having=null} | | | 87,446 | 75,286,732 | | | |
| 3 | ProjectionOp[6] | [count] | | | - | | | | |

nyurik commented 5 years ago

OpenJDK 1.8.0_181 (with Wikibase customizations), run with these parameters:

java 
  -server 
  -XX:+UseG1GC 
  -Xmx12288m 
  -Xloggc:/var/log/wdqs/wdqs-blazegraph_jvm_gc.%p-%t.log 
  -XX:+PrintGCDetails 
  -XX:+PrintGCDateStamps 
  -XX:+PrintGCTimeStamps 
  -XX:+PrintAdaptiveSizePolicy 
  -XX:+PrintReferenceGC 
  -XX:+PrintGCCause 
  -XX:+PrintGCApplicationStoppedTime 
  -XX:+PrintTenuringDistribution 
  -XX:+UnlockExperimentalVMOptions 
  -XX:G1NewSizePercent=20 
  -XX:+ParallelRefProcEnabled 
  -XX:+UseGCLogFileRotation 
  -XX:NumberOfGCLogFiles=10 
  -XX:GCLogFileSize=20M 
  -Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=RWStore.properties 
  -Dorg.eclipse.jetty.server.Request.maxFormContentSize=200000000 
  -Dcom.bigdata.rdf.sparql.ast.QueryHints.analytic=true 
  -Dcom.bigdata.rdf.sparql.ast.QueryHints.analyticMaxMemoryPerQuery=1073741824 
  -DASTOptimizerClass=org.wikidata.query.rdf.blazegraph.WikibaseOptimizers 
  -Dorg.wikidata.query.rdf.blazegraph.inline.literal.WKTSerializer.noGlobe=2 
  -Dcom.bigdata.rdf.sail.webapp.client.RemoteRepository.maxRequestURLLength=7168 
  -Dcom.bigdata.rdf.sail.sparql.PrefixDeclProcessor.additionalDeclsFile=./prefixes.conf 
  -Dorg.wikidata.query.rdf.blazegraph.mwapi.MWApiServiceFactory.config=./mwservices.json 
  -Dcom.bigdata.rdf.sail.webapp.client.HttpClientConfigurator=org.wikidata.query.rdf.blazegraph.ProxiedHttpConnectionFactory 
  -Dhttp.userAgent=Sophox - OSM Query Service; https://sophox.org/ 
  -Dorg.eclipse.jetty.annotations.AnnotationParser.LEVEL=OFF 
  -DwikibaseConceptUri=http://wiki.openstreetmap.org 
  -DwikibaseServiceEnableWhitelist=false 
  -jar jetty-runner-9.4.12.v20180830.jar 
  --host 0.0.0.0 
  --port 9999 
  --path /bigdata blazegraph-service-0.3.1-SNAPSHOT.war

Exception info:

SPARQL-QUERY: queryStr=SELECT (count(*) as ?count) where {
  ?s1 osmm:key ?osmt.
  ?s2 ?osmt ?v.
}

java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:293)
    at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:679)
    at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:290)
    at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
    at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:271)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
    at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:337)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
    at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
    at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.Server.handle(Server.java:503)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:890)
    at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:696)
    at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Caused by: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188)
    at info.aduna.iteration.IterationWrapper.hasNext(IterationWrapper.java:68)
    at org.openrdf.query.QueryResults.report(QueryResults.java:155)
    at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:76)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1713)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1569)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1534)
    at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:747)
    ... 4 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:59)
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.close(RunningQueryCloseableIterator.java:73)
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.hasNext(RunningQueryCloseableIterator.java:82)
    at com.bigdata.striterator.ChunkedWrappedIterator.hasNext(ChunkedWrappedIterator.java:197)
    at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134)
    ... 11 more
Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.util.concurrent.Haltable.get(Haltable.java:273)
    at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:1516)
    at com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:104)
    at com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:46)
    ... 15 more
Caused by: java.lang.Exception: task=ChunkTask{query=6a62db54-fbfb-4244-bcf2-60fde0a9e140,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, cause=java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1367)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:926)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:821)
    ... 3 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1347)
    ... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:682)
    at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:382)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
    at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1346)
    ... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1027)
    at com.bigdata.bop.join.PipelineJoin$JoinTask.consumeSource(PipelineJoin.java:739)
    at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:623)
    ... 12 more
Caused by: java.lang.RuntimeException: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1961)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.call(PipelineJoin.java:1684)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.executeTasks(PipelineJoin.java:1392)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:1016)
    ... 14 more
Caused by: com.bigdata.rwstore.sector.MemoryManagerOutOfMemory
    at com.bigdata.rwstore.sector.MemoryManager.getSectorFromFreeList(MemoryManager.java:646)
    at com.bigdata.rwstore.sector.MemoryManager.allocate(MemoryManager.java:675)
    at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:195)
    at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:169)
    at com.bigdata.rwstore.sector.AllocationContext.allocate(AllocationContext.java:159)
    at com.bigdata.rwstore.sector.AllocationContext.alloc(AllocationContext.java:359)
    at com.bigdata.rwstore.PSOutputStream.save(PSOutputStream.java:335)
    at com.bigdata.rwstore.PSOutputStream.getAddr(PSOutputStream.java:416)
    at com.bigdata.bop.solutions.SolutionSetStream.put(SolutionSetStream.java:297)
    at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:213)
    at com.bigdata.bop.engine.LocalNativeChunkMessage.<init>(LocalNativeChunkMessage.java:147)
    at com.bigdata.bop.engine.StandaloneChunkHandler.handleChunk(StandaloneChunkHandler.java:92)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.outputChunk(ChunkedRunningQuery.java:1699)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.addReorderAllowed(ChunkedRunningQuery.java:1628)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1569)
    at com.bigdata.bop.engine.ChunkedRunningQuery$HandleChunkBuffer.add(ChunkedRunningQuery.java:1453)
    at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:59)
    at com.bigdata.relation.accesspath.UnsyncLocalOutputBuffer.handleChunk(UnsyncLocalOutputBuffer.java:14)
    at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.overflow(AbstractUnsynchronizedArrayBuffer.java:287)
    at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add2(AbstractUnsynchronizedArrayBuffer.java:215)
    at com.bigdata.relation.accesspath.AbstractUnsynchronizedArrayBuffer.add(AbstractUnsynchronizedArrayBuffer.java:173)
    at com.bigdata.bop.join.PipelineJoin$JoinTask$AccessPathTask.handleJoin2(PipelineJoin.java:1868)
    ... 17 more
nyurik commented 5 years ago

Also, I was mistaken - distinct ?osmt is only 4,661, not the 45,000 I initially thought, making this issue even stranger.

SELECT (count(distinct ?osmt) as ?count) where {
  ?s1 osmm:key ?osmt.
}
thompsonbry commented 5 years ago

It is out of native memory. The error is from the memory manager. I do not see an option to grant the JVM the ability to use native memory, just an option to limit it to 1G of native memory per query.

@stas: could you make sure that the best practices are captured for this and for diagnosing the OOM as Java heap, GC overhead limit exceeded, or native heap?

Thanks, Bryan

thompsonbry commented 5 years ago

There might be very little native memory available by default. That could lead to an OOM of the native heap unless you explicitly allow direct memory to be used by the JVM.

nyurik commented 5 years ago

Bryan, thank you for all your help! The Java runtime configuration is set in the runBlazegraph.sh script, part of the Wikibase RDF repo. I only set the heap memory (-Xmx). CC: @smalyshev

This article (point 2) discusses DirectByteBuffer native memory; not sure if that's the same thing:

By default, it’s equal to -Xmx. Yes, the JVM heap and off-heap memory are two different memory areas, but by default, they have the same maximum size.

BTW, all this is for Sophox, a pro-bono service for OpenStreetMap to query all of OSM data, metadata, wikipedia pageview stats, etc. Accessible via https://sophox.org

thompsonbry commented 5 years ago

Use this option to be sure.

The limit can be changed using the -XX:MaxDirectMemorySize property. The value accepts suffixes like "g" or "G" for gigabytes, etc.
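A minimal sketch of how that flag could be wired into the startup command (the sizes mirror the values discussed in this thread, and the jar path is a placeholder, not the actual runBlazegraph.sh contents):

```shell
# Sketch only: sizes are the examples from this thread; "blazegraph.jar"
# stands in for whatever runBlazegraph.sh actually launches.
HEAP="-Xmx16g"                        # Java heap, as configured on Wikidata
DIRECT="-XX:MaxDirectMemorySize=40g"  # cap for off-heap (direct) buffers
echo "java ${HEAP} ${DIRECT} -server -jar blazegraph.jar"
```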

nyurik commented 5 years ago

I just re-ran the same query with -XX:MaxDirectMemorySize=40g, and got exactly the same error (same stack trace) after about the same ~5 min wait. I even tried 70g, with the same result. BTW, not sure if this is related, but every minute the server is updated with a few MBs of changes -- could that affect the issue?

thompsonbry commented 5 years ago

Did you also raise the max direct memory per query? It was set at 1G as I recall.

nyurik commented 5 years ago

With the higher per-query limit (5G), I got a query timeout (5 minutes) instead of the OOM. Yet I was hoping this kind of query would not take this long or require so much memory, despite processing a large amount of data. In theory, for the query below, ?s1 osmm:key ?osmt. produces a list of about 4,500 items. In my non-expert understanding, Blazegraph would then perform ~4,500 individual lookups in the predicate index, find the first and last position of each predicate to compute the number of items per predicate, and add them all together. I would have thought this would be a relatively fast and non-memory-intensive operation. So either my thinking is naive (likely!), or there is a way to optimize the query or the Blazegraph code...? Thank you for all the help and the very thorough explanations!

SELECT (count(*) as ?count) where {
  ?s1   osmm:key   ?osmt.
  ?s2   ?osmt   ?v.
}
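For contrast, the constant-memory aggregation being hoped for here -- one counter per distinct key, updated as rows stream past -- can be sketched with a plain shell pipeline over a hypothetical two-column (subject, predicate) listing. This illustrates the hash-aggregation idea only, not Blazegraph's actual operators:

```shell
# Hypothetical input: one "subject predicate" pair per line. awk keeps a
# single counter per distinct predicate, so memory is proportional to the
# number of groups (~4,500 here), not to the billions of input rows.
printf 'n1 highway\nn2 highway\nn3 name\n' |
  awk '{ count[$2]++ } END { for (k in count) print k, count[k] }' |
  sort
# prints:
#   highway 2
#   name 1
```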
thompsonbry commented 5 years ago

It is likely enqueuing a lot of data in memory during the query. You might try looking at the JVM heap and the number of native buffers used (/status and performance counters). Wikidata is set up to prefer native memory use over Java memory use to provide tighter control on the memory used (DoS concerns). That can impose higher overhead during the query.

Michael, are you aware of anything that would hold onto memory for this query? I seem to recall you have looked at a few possible query-scope memory leaks.

Bryan

--

Bryan Thompson Chief Scientist & Founder Blazegraph e: bryan@blazegraph.com w: http://blazegraph.com

Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” http://insideanalysis.com/2016/01/20535/.

Blazegraph Database https://www.blazegraph.com/ is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU https://www.blazegraph.com/product/gpu-accelerated/ andBlazegraph DAS https://www.blazegraph.com/product/gpu-accelerated/L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions.

CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.

nyurik commented 5 years ago

Thanks Bryan, I will run it a few more times tomorrow. Would there be any security/privacy/stability issues if I allow world access to /bigdata (GET only)? That way you could look at the counters directly, and perhaps we could write up some docs on performance optimization.

thompsonbry commented 5 years ago

There are docs on perf optimization on the wiki.

https://wiki.blazegraph.com/wiki/index.php/Main_Page

Bryan

thompsonbry commented 5 years ago

Not really -- this should run with more or less constant memory consumption: it should all be pipelined through, with limited-size queues in front of the operators. You could look at the query EXPLAIN to understand the dynamics, but it would probably not tell you much about where or whether things are blowing up. I agree with Bryan: performance counters are probably most promising (or a profiler, if possible)...

radirk commented 2 years ago

I got the same com.bigdata.rwstore.sector.MemoryManagerOutOfMemory while counting about 51 million records.

Using a subquery with a very large limit (larger than the expected count obviously) lets the query run without this error (I have no timeout set):

PREFIX vrank: <http://purl.org/voc/vrank#>
SELECT (COUNT(?b) AS ?c) WHERE {
  SELECT ?b { ?a vrank:hasRank/vrank:rankValue ?b . }
  LIMIT 100000000  # one hundred million
}

Maybe it helps someone.