neo4j-contrib / neo4j-tinkerpop-api-impl

Implementation of Apache Licensed Neo4j API for Tinkerpop3

Plans for a Bolt-based api impl? #5

Open dlutz2 opened 7 years ago

dlutz2 commented 7 years ago

Any plans for a Bolt-based implementation of neo4j-tinkerpop-api-impl or something similar? I have looked at neo4j-gremlin-bolt, but it has some serious issues that I could not resolve.

spmallette commented 7 years ago

Out of curiosity - what were the issues with neo4j-gremlin-bolt?

dlutz2 commented 7 years ago

The fundamental problem seems to be that vertex IDs are all null until committed (when using their NativeID model). So although it could be used as a loader, many traversals either throw a NullPointerException or give bizarre answers (since there is effectively only one distinguishable vertex).

spmallette commented 7 years ago

that's not good - didn't know that there was that big a problem.

dlutz2 commented 7 years ago

I was baffled for 3 days, thinking I was doing something embarrassingly wrong, because I couldn't get past:

Vertex v = graph.addVertex(label);
g.V(v.id());  // or g.V(v), g.V(randomString), g.V(null), ...

jexp commented 7 years ago

@dlutz2 did you talk to the people from SteelBridgeLabs about the issue?

dlutz2 commented 7 years ago

Yes, filed issue https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/issues/52 but it was closed without an actual fix.

rjbaucells commented 7 years ago

This issue stems from the fact that the node/relation identifier is not issued until a CREATE statement is executed on the Bolt driver. Gremlin requires vertex and edge ids to be available on element creation, which is not possible when using the Neo4j native identifiers.

// create vertex
Vertex v = graph.addVertex(label);
// vertex id() is null until transaction is committed (CREATE statement is executed)
v.id() == null

One solution is to use client-side generated identifiers.

// create vertex
Vertex v = graph.addVertex(label);
// vertex id() is allocated on node/relation creation
v.id() != null
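
To illustrate why client-side identifiers avoid the null-id problem, here is a minimal sketch. It does not use the neo4j-gremlin-bolt API: `ClientSideIdProvider` and `PendingVertex` are hypothetical names invented for this example, and the only assumption is that an id can be allocated locally before any statement reaches the database.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical id provider: allocates ids on the client, so no round trip
// to the database is needed before an element has an identifier.
final class ClientSideIdProvider {
    private final AtomicLong next;

    ClientSideIdProvider(long firstId) {
        this.next = new AtomicLong(firstId);
    }

    // Returns a new id that is unique within this provider instance.
    long generate() {
        return next.getAndIncrement();
    }
}

// Hypothetical stand-in for a vertex that so far exists only in the client session.
final class PendingVertex {
    private final long id;
    private final String label;

    PendingVertex(long id, String label) {
        this.id = id;
        this.label = label;
    }

    long id() { return id; }
    String label() { return label; }
}
```

With this approach every vertex is distinguishable as soon as it is created. The trade-off is that the id would have to be stored as an ordinary property rather than being the Neo4j native id, and uniqueness across several clients would need extra coordination.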

jexp commented 7 years ago

The id is created on statement execution, not on commit. So the library could flush pending CREATE statements whenever a vertex doesn't have an id yet, and then use the ids that come back.

Alternatively, it could pass a proxy as a parameter placeholder, which is resolved when the relationship statement is executed.
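
The flush-on-demand idea could look roughly like the sketch below. Everything here is hypothetical: `LazySession`, `LazyVertex` and the `executor` callback are invented names, the callback stands in for "run this statement inside the open transaction and return the new node id", and the Cypher string is only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical session that buffers CREATE statements until an id is needed.
final class LazySession {
    // Stand-in for the driver: executes a statement and returns the new node id.
    private final ToLongFunction<String> executor;
    private final List<LazyVertex> pending = new ArrayList<>();

    LazySession(ToLongFunction<String> executor) {
        this.executor = executor;
    }

    LazyVertex addVertex(String label) {
        LazyVertex v = new LazyVertex(this, label);
        pending.add(v);
        return v;
    }

    // Executes every buffered CREATE (without committing) and records the returned ids.
    void flush() {
        for (LazyVertex v : pending) {
            v.assignId(executor.applyAsLong("CREATE (n:`" + v.label() + "`) RETURN id(n)"));
        }
        pending.clear();
    }

    int pendingCount() { return pending.size(); }
}

final class LazyVertex {
    private final LazySession session;
    private final String label;
    private Long id; // null until the CREATE statement has been executed

    LazyVertex(LazySession session, String label) {
        this.session = session;
        this.label = label;
    }

    String label() { return label; }

    void assignId(long newId) { this.id = newId; }

    // The first access forces a flush, so callers never observe a null id.
    long id() {
        if (id == null) {
            session.flush();
        }
        return id;
    }
}
```

The point of the sketch is only that the flush happens inside the still-open transaction, so nothing is committed early. It does not address required fields that have not been set at flush time.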


rjbaucells commented 7 years ago

Yes, the IDs are generated when the CREATE statement is executed, not on commit. However, the library does not send any statements to the database until an explicit transaction commit. Flushing CREATE statements earlier is problematic: at that point a node/relation might not yet have the fields required by database-level constraints. The transaction commit signals that all required fields have been populated, and the library throws an exception if that is not the case.

The library supports Gremlin when using non-native id generation; in that mode it passes the TinkerPop integration tests (StructureStandardSuite and ProcessStandardSuite) without problems.

A possible solution would be to implement another ID generation strategy. This strategy could persist the node/relation on creation, with the constraint that all required fields must be provided at that time, for example:

// create vertex
Vertex v = graph.addVertex(T.label, "label1", "field1", value1, ..., "fieldn", valuen);

Failing to provide a required field value would throw an exception from Graph.addVertex() and Vertex.addEdge().
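
A rough sketch of how such an eager strategy might validate its input is shown here. It is not part of any existing library: `EagerVertexFactory` and its `requiredByLabel` map are hypothetical, the required fields are assumed to be known to the client up front, and the local counter merely stands in for the id the database would return from the CREATE statement.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical factory that persists a vertex at creation time, and therefore
// insists that every required field is supplied in the addVertex() call.
final class EagerVertexFactory {
    private final Map<String, Set<String>> requiredByLabel;
    private long nextId = 0L; // stand-in for the id returned by the database

    EagerVertexFactory(Map<String, Set<String>> requiredByLabel) {
        this.requiredByLabel = requiredByLabel;
    }

    // keyValues alternates property keys and values, as in Graph.addVertex(Object...).
    long addVertex(String label, Object... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("key/value arguments must come in pairs");
        }
        Set<String> provided = new HashSet<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            provided.add(String.valueOf(keyValues[i]));
        }
        Set<String> required = requiredByLabel.getOrDefault(label, Collections.emptySet());
        Set<String> missing = new HashSet<>(required);
        missing.removeAll(provided);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                "missing required fields for " + label + ": " + missing);
        }
        // A real implementation would execute the CREATE statement here
        // and return the id reported by the database.
        return nextId++;
    }
}
```

Labels without any registered required fields pass the check unconditionally, which matches the case where no schema constraints are in use.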

dlutz2 commented 7 years ago

That last proposal, another ID generation strategy that throws exceptions if the constraint-required fields are not present, would work for our scenarios, since we don't use schema constraints. (We also run the same analytics against non-Neo4j backends, which don't have constraints.)