neo4jrb / neo4j-core

A simple unified API that can access both the server and embedded Neo4j database. Used by the neo4j gem
MIT License
99 stars 80 forks source link

Request for comments/thoughts/advice on implementing Bolt Routing. #322

Closed phyllisstein closed 4 years ago

phyllisstein commented 5 years ago

šŸ‘‹ Hey folks! I recently migrated my Neo4j architecture to a Causal Cluster, and I've spent the past few days working on enhancing neo4j-core to support the Bolt Routing protocol---which is just Bolt with a little service discovery, er, bolted on---so that my application will be compatible. The highly šŸš§ work-in-progress šŸš§ Core branch lives here, with a corresponding wrapper branch here. It's something that I'll be happy to yield back to the community when it's ready, but I'm stuck on part of the load balancing logic and I was hoping folks with deeper insight into the overall architecture of the gem might have some thoughts or insights to share.

Currently, the adapter checks a connection out from a pool at the start of every transaction; the transaction hangs on to the connection for its duration, since each transaction has to start and end at the same database instance, then releases it back to the pool. The service discovery logic to disambiguate read-type connections from write-type connections is in place, and works fine. The rub is that when a transaction starts, we issue the Cypher BEGIN statement before we know what the meat of the transaction will look like---we don't know, at that point, whether the transaction will contain exclusively read statements or whether it will include write statements (which must pass through the cluster leader).

The JavaScript adapter library drives this problem down to the consumer---it assumes all queries are write queries and should pass through the leader unless the application specifically requests a read-only transaction. I don't think that'll work here, though.

What I would love to be able to do is have the complete battery of queries available before firing them off, so that I could check for whether it was safe to send them to a read instance or whether they needed to go through a write instance. But I can't see a way to make this work in the current architecture of the gem.

If any of the maintainers, or anyone with deeper insight into the gem's inner workings, has any thoughts, I'd be only too glad to hear them. Thanks in advance!

phyllisstein commented 5 years ago

I think I may have gotten a handle on it after taking another look with fresh eyes: https://github.com/ignota/neo4j-core/commit/818f8db149167d3a13878fbe4b55d6cd350f7778. The adapter now overrides queries, transaction, and new_or_current_transaction to assign an access mode to the transaction when queries are flowing through it, falling back on :write mode if no queries are given. Initial testing in my cluster suggests that this works acceptably well, but I'd be happy to hear suggestions for better options.

klobuczek commented 5 years ago

@phyllisstein This is a great work you have been doing here. Our team took over the maintenance of neo4j.rb. As a first step, we would like to produce a neo4j ruby driver that is totally analogous to the other 4 official language drivers. As a proof of concept, I have created a jruby driver based on the java driver. I would love if you could review the manual examples at https://github.com/neo4jrb/neo4j-driver/blob/master/spec/support/dev_manual_examples.rb and give us some feedback regarding the proposed API. As a 2nd step, we were planning to leverage the soon to be released seabolt (C driver) to implement an MRI gem to exactly the same API so both the java and c versions can be switched without any changes to the client. However, seeing how much work you have done on the pure ruby driver and how much understanding you seem to have I wanted to ask if you would be interested to package that driver according to the proposed API in https://github.com/neo4jrb/neo4j-driver/. Once we have a general purpose ruby driver with an official API we would take on adapting neo4j and neo4j-core to use that driver, but I would like to implement the driver completely independently from and without a significant influence by those gems to guarantee a good separation. Let me know what you think?

joshjordan commented 5 years ago

@phyllisstein @klobuczek any updates on bolt+routing in the Ruby gem?

ekampp commented 4 years ago

@klobuczek , would the seabolt implementation support bolt+routing? Or would that still have to be built on top?

phyllisstein commented 4 years ago

I'm sad to say I've moved away from Ruby and Rails for my work. Sad because I can't be all that helpful; and sad because even just skimming the Neo4j::Driver code made it clear how far it's come in the past year, and I had to miss out.

It looks as though the Seabolt-based driver does support bolt+routing: see dev_manual_examples.rb:29-49. What kind of work would be necessary to use Neo4j::Driver with Neo4j.rb, I can't really say. You might want to start in Neo4j::SessionManager.adaptor_class_by_type. Depending on the extent of API drift, it could be a quick and easy hack or a big ole mess.

ekampp commented 4 years ago

@phyllisstein, thanks for that update!

In your code, you reference a bolt_routing adapter however, I cannot seem to find that code anywhere. Can you perhaps help me out there?

phyllisstein commented 4 years ago

Those lines pull in changes I made to neo4j-core. I don't think the adapter I wrote is quite production-ready, however: there's no test suite, a lot of the idioms are borrowed from the Neo4j JavaScript driver, and the lack of support for bookmarks is pretty egregious.

I was suggesting you could try dropping Neo4j::Driver in there and seeing how far that got you But after looking at its code, I can see the answer would likely be "not very." As an alternative, you might use neo4j-core's implementation of the Bolt adapter as a starting-point for creating a very thin wrapper around Neo4j::Driver. You'd want to construct the same basic API surface, so that Neo4j.rb could consume it, but call out to Neo4j::Driver when interacting with the database, so that you could avail yourself of its improvements.