Open Krittam opened 5 years ago
@Krittam, what happens when you run the following query?
g.V().has('vid','qwerty').inE().limit(Long.MAX_VALUE).count().next()
@porunov, that query produced this exception:
TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
    at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143)
    at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100)
    at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82)
    at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129)
    at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288)
    at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285)
    at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
    at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
    at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
    at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285)
    at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441)
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054)
    at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
    at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
    at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054)
    at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113)
    at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133)
    at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95)
    at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
    at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
    at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
    at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200)
    at java_util_Iterator$next.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
    at Script5.run(Script5.groovy:1)
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843)
    at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
    at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)
    at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
I am using JanusGraph 0.2.0. I also tried running an OLAP query with Spark, but I submit Spark jobs to my Hadoop cluster through YARN, and there is not enough documentation available for that setup. Also, 0.2.0 is missing many jars, which makes the problem even worse.
Can you reproduce this issue with JanusGraph 0.3.1?
On JanusGraph 0.3.1 the query fails with this exception:
Frame size (124696054) larger than max length (15728640)!
After increasing the storage.cassandra.frame-size-mb property to 512, the query starts executing but doesn't complete (I waited around 45 minutes before finally giving up on it).
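For reference, the two numbers in the exception message above can be converted to megabytes to see how far the default limit is from what this one response needed. This is only a small arithmetic sketch: the variable names are made up for illustration, and it assumes the limit is expressed in units of 1024 * 1024 bytes, which matches 15728640 being exactly 15 of those units.

```python
import math

# Both values are copied from the exception message (in bytes).
frame_size_bytes = 124696054   # size of the response frame that was rejected
max_length_bytes = 15728640    # limit in effect when the query failed

# Assumption: one "MB" in the setting means 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024

current_limit_mb = max_length_bytes // BYTES_PER_MB
required_mb = math.ceil(frame_size_bytes / BYTES_PER_MB)

print(f"limit in effect: {current_limit_mb} MB")
print(f"smallest whole-MB limit that fits this frame: {required_mb} MB")
```

So 512 is well above what this particular frame needed. A larger frame only removes the size rejection; it does not make reading all the edges any faster.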
Also, please note that while I tried this with the JanusGraph 0.3.1 release, the underlying Cassandra database was unchanged (the same one that was created by the original 0.2.0 release).
Confirming the performance issue in 0.3.1 as well.
I created one vertex with 1 million incoming vertices. Counting those 1 million took 6 seconds.
Then I created one vertex with 16 million incoming vertices. The count did not finish within 5 minutes.
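As a rough sanity check on these two measurements, here is what a purely linear extrapolation from the first run would predict for the second. Linear scaling is an assumption made only for this comparison, not something that was measured, and the variable names are illustrative.

```python
# Measurements reported above.
small_count = 1_000_000      # incoming vertices in the first test
small_seconds = 6            # time the first count took
large_count = 16_000_000     # incoming vertices in the second test
waited_seconds = 5 * 60      # how long the second count ran without finishing

# Assumption: cost grows linearly with the number of incoming vertices.
linear_estimate_seconds = small_seconds * large_count / small_count

print(f"linear estimate for 16 million: {linear_estimate_seconds:.0f} s")
print(f"time waited without a result:   {waited_seconds} s")
```

The second count ran for more than three times the linear estimate without finishing, so the slowdown at this size looks worse than proportional.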
So far I haven't found a good solution for counting edges. Vertex-centric indexes don't help with the count operation either.
Changing the storage backend to cql and setting the other cql-related properties solved the issue for me. Here are the properties I used:
storage.backend=cql
storage.cql.keyspace=t_graph
storage.cql.read-consistency-level=ONE
I have a graph in which a few nodes have many incoming edges (supernodes). All the edges have the same type/label. One query needs to report the total number of incoming edges. I'm using cassandrathrift as the storage backend.
g.V().has('vid','qwerty').inE().count().next()
This fails. However,
g.V().has('vid','qwerty').inE().limit(10000).count().next()
gives ==>10000
Now, if I wanted to filter the edges by some condition I would use vertex-centric indexes, but I simply want all the incoming edges. The vertex in question is expected to have millions of such edges. Please help.