JanusGraph / janusgraph

JanusGraph: an open-source, distributed graph database
https://janusgraph.org
Other
5.27k stars 1.16k forks source link

Read time out while fetching all edges of a SuperNode #1467

Open Krittam opened 5 years ago

Krittam commented 5 years ago

I have a graph in which a few nodes have many incoming edges(Supernode). All the edges are of same type/label. There is a query in which i need to report the total no of incoming edges. I'm using cassandarathrift as storage backend g.V().has('vid','qwerty').inE().count().next() This fails with

14108889 [gremlin-server-session-1] WARN org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor - Exception processing a script on request [RequestMessage{, requestId=c3d902b7-0fdd-491d-8639-546963212474, op='eval', processor='session', args={gremlin=g.V().has('vid','qwerty').inE().count().next(), session=2831d264-4566-4d15-99c5-d9bbb202b1f8, bindings={}, manageTransaction=false, batchSize=64}}]. TimedOutException() at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696) at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633) at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741) at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725) at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143) at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100) at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82) at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129) at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288) at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285) at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69) at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55) at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470) at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285) at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054) at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98) at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054) at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113) at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200) at java_util_Iterator$next.call(Unknown Source) at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48) at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113) at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117) at Script13.run(Script13.groovy:1) at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843) at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548) at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233) at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120) at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

However g.V().has('vid','qwerty').inE().limit(10000).count().next() gives ==>10000

Now if i wanted to filter all the edges based on some condition i would have used vertex centric indexes but I simply want all the incoming edges. The said vertex is expected to have millions of such edges. Please help

porunov commented 5 years ago

@Krittam , what happens when you use next query?: g.V().has('vid','qwerty').inE().limit(Long.MAX_VALUE).count().next()

Krittam commented 5 years ago

@porunov the said query produced this exception.

TimedOutException() at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696) at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633) at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741) at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725) at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143) at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100) at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82) at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129) at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288) at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285) at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69) at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55) at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470) at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285) at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054) at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98) at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90) at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054) at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113) at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95) at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50) at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83) at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128) at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38) at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200) at java_util_Iterator$next.call(Unknown Source) at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48) at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113) at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117) at Script5.run(Script5.groovy:1) at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843) at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548) at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233) at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120) at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

Krittam commented 5 years ago

I am using janusgraph 0.2.0. I also tried using spark to run OLAP query but i am using yarn to submit spark jobs on my hadoop cluster and there is not enough documentation available for that. Also in 0.2.0 there are many missing jars which make the problem even worse.

porunov commented 5 years ago

Can you reproduce this issue with JanusGraph 0.3.1?

Krittam commented 5 years ago

on JanusGraph 0.3.1 query fails with this exception:

Frame size (124696054) larger than max length (15728640)!

On increasing the storage.cassandra.frame-size-mb property to 512 the query starts execution but doesn't complete (I waited for around 45 mins before finally giving up on it!) Also please note that while i tried this with JanusGraph 0.3.1 release, the underlying cassandra db was unchanged (Same as when it was created by the original 0.2.0 release)

porunov commented 5 years ago

Confirming performance issue in 0.3.1 also. I've created 1 vertex with 1 million incoming vertices. Count of 1 million vertices took 6 seconds. Then I've created 1 vertex with 16 million incoming vertices. The count couldn't be executed in 5 minutes.

porunov commented 5 years ago

Currently I didn't find a good solution to count edges. Vertex centric indexes don't help in count operation also.

kverma12 commented 4 years ago

Changing the storage backend to cql and other properties relevant to cql solved the issue for me. Here are the properties which i used:

storage.backend=cql
storage.cql.keyspace=t_graph
storage.cql.read-consistency-level=ONE