SteelBridgeLabs / neo4j-gremlin-bolt

Apache License 2.0

Touching graph.V or graph.E in any way brings all vertices/edges into memory #70

Closed apatzer closed 6 years ago

apatzer commented 6 years ago

I have Neo4j running on localhost and am connecting to it via Bolt.

val driver = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic(_, _))
val idProvider = new Neo4JNativeElementIdProvider() //DatabaseSequenceElementIdProvider(driver)
val graph: Neo4JGraph = new Neo4JGraph(driver, idProvider, idProvider)

I've created 10,000 vertices and am running some simple queries:

graph.execute("MATCH (n) RETURN count(*)").single() // Takes 1-3ms locally
graph.V.count().head // Takes 500-700ms locally

Executing a Cypher statement directly is super fast. However, if I dive through the code, graph.V.count is a two-step traversal. The first step, V, brings ALL 10,000 vertices across the wire, which makes it quite slow. Subsequent queries via graph.V.count are fast.
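For anyone who wants to reproduce the comparison, here is a minimal timing sketch. `Timing`, `timed`, and `timings` are hypothetical helper names, not part of neo4j-gremlin-bolt or gremlin-scala. Recording each run separately helps because, as noted above, the first call is slow and later calls are fast, so an average would hide the effect.

```scala
object Timing {
  /** Runs `body` once and returns its result together with the elapsed wall-clock time in milliseconds. */
  def timed[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1e6
    (result, elapsedMs)
  }

  /** Runs `body` `runs` times and returns the elapsed time of each run, in order. */
  def timings[A](runs: Int)(body: => A): Seq[Double] =
    (1 to runs).map(_ => timed(body)._2)
}
```

Assuming the `graph` value from the snippet above, something like `Timing.timings(3)(graph.V.count().head)` would show the first (cold) run next to the subsequent ones.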

Given that half of all Gremlin queries begin with something like:

graph.V
    .hasLabel("foo")
    .has("bar", targetValue)
    .where(...) 
    .toList()

It seems like the driver is almost always going to load the entire graph across the wire. That doesn't seem very performant, so maybe I'm missing something about the usage?

Would it be better to have traversals "compiled" locally into the Cypher query language and then sent across? That's how most JDBC-based systems like Slick or Hibernate ultimately work: your code is compiled into SQL.
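To make the idea concrete, here is a toy sketch of what "compiling" the leading filter steps of a traversal into one Cypher statement could look like. It is only an illustration under stated assumptions: `Step`, `HasLabel`, `Has`, `CypherQuery`, and `CypherSketch.compile` are made-up names, not how neo4j-gremlin-bolt or TinkerPop actually work. It also assumes the `$name` parameter syntax and does not escape backticks inside labels or property keys.

```scala
// Hypothetical mini-model of a traversal prefix; not part of any real library.
sealed trait Step
final case class HasLabel(label: String) extends Step
final case class Has(key: String, value: Any) extends Step

/** A Cypher statement plus the parameter values it refers to. */
final case class CypherQuery(text: String, params: Map[String, Any])

object CypherSketch {
  /** Folds label and property filters into a single parameterized MATCH statement. */
  def compile(steps: Seq[Step]): CypherQuery = {
    // Each hasLabel(...) becomes a label on the matched node.
    val labels = steps.collect { case HasLabel(l) => s":`${l}`" }.mkString
    // Each has(key, value) becomes an equality predicate with its own parameter.
    val props = steps.collect { case Has(k, v) => (k, v) }.zipWithIndex
    val predicates = props.map { case ((k, _), i) => s"n.`${k}` = $$p${i}" }
    val params = props.map { case ((_, v), i) => s"p${i}" -> v }.toMap
    val where =
      if (predicates.isEmpty) "" else predicates.mkString(" WHERE ", " AND ", "")
    CypherQuery(s"MATCH (n${labels})${where} RETURN n", params)
  }
}
```

With this sketch, `CypherSketch.compile(Seq(HasLabel("foo"), Has("bar", 42)))` produces a single statement that filters on the server, so only matching vertices would cross the wire instead of the whole graph.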

rjbaucells commented 6 years ago

See #46

MohitMehta1986 commented 6 years ago

I am facing the same issue. The query below takes 1600 ms to fetch the result:

g.V().has("id", "100300").out("state").values("name")

But if I run it as the Cypher query below, it takes 1 ms to return the result:

graph.cypher('Match(cp:counterparty)-[s:state]->(cps:counterpartystate) where cp.id="/100300"return cps.name')

Please let me know if we can increase query performance.