SteelBridgeLabs / neo4j-gremlin-bolt

Apache License 2.0

Use of Neo4j indices for HasLabel/Has traversals #46

Open edkan2016 opened 7 years ago

edkan2016 commented 7 years ago

I see that traversals with HasLabel/Has steps are not making use of the existing indices on the nodes. Comparing the traversal plans of this plugin and the Gremlin plugin by Thinkaurelius (https://github.com/thinkaurelius/neo4j-gremlin-plugin), it seems that it is not using a strategy similar to org.apache.tinkerpop.gremlin.neo4j.process.traversal.strategy.optimization.Neo4jGraphStepStrategy to fold the HasLabel/Has conditions into the V() step. Unfortunately, the Thinkaurelius plugin only works for Neo4j 2.x. Is there anything that can be done to incorporate the use of indices into the traversal plan? Thanks!!
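For readers unfamiliar with the folding idea, here is a minimal sketch in plain Java of what such a strategy would have to produce: the label and property predicates from the HasLabel/Has steps end up inside a single parameterized Cypher statement, so the Neo4j planner can choose an index if one exists. `FoldedQuery` and `fold` are invented names for illustration, not part of this project or of TinkerPop, and the label and property in `main` are made-up sample values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Pattern;

// Hypothetical illustration only: not an API of neo4j-gremlin-bolt or TinkerPop.
final class FoldedQuery {

    // Accept only simple identifiers so labels and keys can be embedded as-is.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    final String text;
    final Map<String, Object> parameters;

    private FoldedQuery(String text, Map<String, Object> parameters) {
        this.text = text;
        this.parameters = parameters;
    }

    // Translates the equivalent of V().hasLabel(label).has(k1, v1).has(k2, v2)...
    // into one Cypher statement with equality predicates.
    static FoldedQuery fold(String label, Map<String, Object> equalities) {
        StringBuilder cypher = new StringBuilder("MATCH (n:").append(check(label)).append(")");
        Map<String, Object> parameters = new LinkedHashMap<>();
        StringJoiner where = new StringJoiner(" AND ");
        for (Map.Entry<String, Object> entry : equalities.entrySet()) {
            String parameter = "p" + parameters.size();
            // $name is the parameter syntax of current Neo4j versions;
            // older servers use {name} instead.
            where.add("n." + check(entry.getKey()) + " = $" + parameter);
            parameters.put(parameter, entry.getValue());
        }
        if (!parameters.isEmpty()) {
            cypher.append(" WHERE ").append(where.toString());
        }
        return new FoldedQuery(cypher.append(" RETURN n").toString(), parameters);
    }

    private static String check(String identifier) {
        if (!IDENTIFIER.matcher(identifier).matches()) {
            throw new IllegalArgumentException("Unsupported identifier: " + identifier);
        }
        return identifier;
    }

    public static void main(String[] args) {
        Map<String, Object> has = new LinkedHashMap<>();
        has.put("name", "Alice");
        FoldedQuery query = fold("Person", has);
        System.out.println(query.text);
        System.out.println(query.parameters);
    }
}
```

The sketch only covers equality predicates on a single label; a real strategy would also have to handle other predicates and fall back to the unfolded plan when a step cannot be translated.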

rjbaucells commented 7 years ago

Yes, gremlin traversals do not use indices on the neo4j database. The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.

https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/blob/master/src/main/java/com/steelbridgelabs/oss/neo4j/structure/Neo4JGraphFeatures.java#L154

Help is welcome in case you are interested...

MohitMehta1986 commented 6 years ago

I am facing the same issue. I have the query below:

g.V().has("id", "100300").out("state").values("name")

It takes 1600 ms to fetch the result. But if I execute the Cypher query below:

graph.cypher('Match(cp:counterparty)-[s:state]->(cps:counterpartystate) where cp.id="100300" return cps.name')

it takes 1 ms to give the result.

Please let me know if we can increase query performance.

I am using Gremlin Console version 3.3.2.

MohitMehta1986 commented 6 years ago

Tried the steps below:

  1. Using a subgraph strategy like this: g=graph.traversal().withStrategies(SubgraphStrategy.build().vertices(or(hasLabel('counterparty'),hasLabel('counterpartystate'))).edges(hasLabel('state')).create())

  2. Using the above strategy, the query below gave the result in 1 ms: g.V().dedup().by(id).and(hasLabel('counterparty'),has("id","0100300")).coalesce(out("state")).values("name")

I found the reason: it is because of using hasLabel and the "and" operator. It started using indices.

rjbaucells commented 6 years ago

The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.

None of the gremlin queries you are issuing will use indexes; all they are doing is loading the graph in memory and doing a client-side filter. Look at the documentation on how to enable profiling in this library and you will see this behavior.
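To make the cost difference concrete, here is a toy model in plain Java. It is not code from this library: it only contrasts testing every node after loading it (what loading the graph plus a client-side filter amounts to) with a lookup through a prebuilt index. The node layout and the `ScanVersusIndex` class are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: nodes are plain property maps held in memory.
final class ScanVersusIndex {

    // Number of nodes inspected by the most recent lookup.
    static int inspected;

    // Client-side filter: every node is visited and tested.
    static List<Map<String, String>> scan(List<Map<String, String>> nodes, String key, String value) {
        inspected = 0;
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> node : nodes) {
            inspected++;
            if (value.equals(node.get(key))) {
                matches.add(node);
            }
        }
        return matches;
    }

    // Builds an index for one property key: property value -> nodes carrying it.
    static Map<String, List<Map<String, String>>> buildIndex(List<Map<String, String>> nodes, String key) {
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> node : nodes) {
            String value = node.get(key);
            if (value != null) {
                index.computeIfAbsent(value, ignored -> new ArrayList<>()).add(node);
            }
        }
        return index;
    }

    // Index lookup: only the matching nodes are touched.
    static List<Map<String, String>> seek(Map<String, List<Map<String, String>>> index, String value) {
        List<Map<String, String>> matches =
            index.getOrDefault(value, Collections.<Map<String, String>>emptyList());
        inspected = matches.size();
        return matches;
    }

    public static void main(String[] args) {
        List<Map<String, String>> nodes = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            Map<String, String> node = new HashMap<>();
            node.put("id", String.valueOf(i));
            nodes.add(node);
        }
        scan(nodes, "id", "500");
        System.out.println("scan inspected " + inspected + " nodes");
        seek(buildIndex(nodes, "id"), "500");
        System.out.println("seek inspected " + inspected + " nodes");
    }
}
```

The scan cost grows with the size of the graph, while the index lookup depends only on the number of matches, which is the gap the timings reported in this thread reflect.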

MohitMehta1986 commented 6 years ago

When I used normal traversals, they were very slow in the Gremlin console. As soon as I used the subgraph strategy together with hasLabel and the "and" operator, I got results very fast.

So are strategies contributing to the traversal?

Is there any OLAP API implementation in .NET that I can reuse?

itzdinsa commented 5 years ago

Hi, I am using this query: 'graph.traversal().V().hasLabel("USER").has("ID", "123123")'. But it takes a very long time, while the same query in Cypher gives an instant result. I have created all the necessary indexes.

On enabling the profiler, I get this output:

2018-10-03 16:04:05.598 INFO 7311 --- [nio-8080-exec-1] c.s.o.n.s.summary.ResultSummaryLogger : Profile for CYPHER statement: Statement{text='PROFILE MATCH (n) RETURN n', parameters={}}

+-----------------+----------------+------+---------+-----------+-------+
| Operator        + Estimated Rows + Rows + DB Hits + Variables + Other |
+-----------------+----------------+------+---------+-----------+-------+
| +ProduceResults |              8 |    8 |       0 | n         |       |
| |               +----------------+------+---------+-----------+-------+
| +AllNodesScan   |              8 |    8 |       9 | n         |       |
+-----------------+----------------+------+---------+-----------+-------+

Why is Gremlin not using indexes? Please help.

If I am missing something, please show me the correct way.
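One way to read a profile like the one above: the `AllNodesScan` operator means Neo4j visited every node, whereas an index-backed plan would show an operator such as `NodeIndexSeek`. The following is a small sketch in plain Java, assuming the profile is available as text; `PlanCheck` and `classify` are invented names, not part of this library.

```java
// Hypothetical helper: classifies a Neo4j PROFILE/EXPLAIN plan by the
// operator names that appear in its textual form.
final class PlanCheck {

    static String classify(String profileText) {
        // NodeIndexSeek and NodeUniqueIndexSeek both contain "IndexSeek".
        if (profileText.contains("IndexSeek")) {
            return "INDEX";
        }
        // AllNodesScan reads every node; NodeByLabelScan reads every node of a label.
        if (profileText.contains("AllNodesScan") || profileText.contains("NodeByLabelScan")) {
            return "SCAN";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        String profile = "| +AllNodesScan   |              8 |    8 |       9 | n         |       |";
        System.out.println(classify(profile));
    }
}
```

A check like this can be useful in a test that fails when a statement falls back to a full scan.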

rjbaucells commented 5 years ago

Same response as before in this thread. This library only implements the Gremlin structure interfaces. You need to use CYPHER like:

Iterator<Vertex> vertices = graph.vertices("MATCH (n:User) WHERE ID(n)={id} RETURN n", Collections.singletonMap("id", 123123));

tanroopdhillon commented 5 years ago

> Yes, gremlin traversals do not use indices on the neo4j database. The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.
>
> https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/blob/master/src/main/java/com/steelbridgelabs/oss/neo4j/structure/Neo4JGraphFeatures.java#L154
>
> Help is welcome in case you are interested...

You can count me in, in case you are looking for a contributor.

rjbaucells commented 5 years ago

Sure, feel free to send a PR with an implementation. We can discuss it and collaborate on the PR.

tanroopdhillon commented 5 years ago

> Yes, gremlin traversals do not use indices on the neo4j database. The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.
>
> https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/blob/master/src/main/java/com/steelbridgelabs/oss/neo4j/structure/Neo4JGraphFeatures.java#L154
>
> Help is welcome in case you are interested...

I think making use of indices for hasLabel/has steps should be part of the OLTP implementation itself. It should not be deferred to an OLAP implementation.