neo4jrb / neo4j-core

A simple unified API that can access both the server and embedded Neo4j database. Used by the neo4j gem
MIT License
99 stars 80 forks source link

Implement Bolt Routing protocol (closes #322). #323

Closed phyllisstein closed 5 years ago

phyllisstein commented 6 years ago

Fixes #322.

This pull introduces/changes:

Caveats:

Grateful as always for your thoughts and comments! Glad to be able to give something back to a project that's been invaluable to me these past few months. 💖

P.S.: Let me know if I can help set up a dev environment with Neo4j EE! Don't want you to have to go into it sight unseen.

Pings: @cheerfulstoic @subvertallchris

coveralls commented 6 years ago

Coverage Status

Coverage increased (+0.09%) to 90.724% when pulling a77b2c01a762964b01ba47368db078c5b650ff48 on ignota:bolt-routing into 293eb02cc11210e88bfa69d778f563e61f2b0de7 on neo4jrb:master.

cheerfulstoic commented 6 years ago

Hey! Sorry, I'm not currently maintaining the gem but it's awesome that somebody worked on this!

Some former colleagues of mine were potentially going to be working on this, so I would like to give them a heads up:

@leviwilson

@StephenTurley

@walterg2

StephenTurley commented 6 years ago

Glad to see this being worked on too. Thank you! I recall the BOLT implementation having some performance issues compared to HTTP. Has that changed at all?

cheerfulstoic commented 6 years ago

I believe when I was last testing them they were comparable, but then I also implemented some improvements to the performance. I don't remember the results, but I think it was something like a 5-15% improvement from before. Those changes are on master of neo4j-core. The biggest change didn't actually have anything to do with the Bolt protocol but had to do with the fact that logging string was always being generated (regardless of if the log level was set to log or not). So the improvement should mainly be on production where you don't generally log the queries.

phyllisstein commented 6 years ago

@cheerfulstoic Thanks for pinging the right people! Sorry to hear you've stepped away from the gem, but good luck with whatever comes next.

@StephenTurley I've heard the same thing about performance issues, but unfortunately this changeset wouldn't touch them---it's still using the same Bolt protocol under the hood, it just adds the service discovery features from Bolt Routing onto the top.

klobuczek commented 6 years ago

@phyllisstein I just had literally a 5 minutes look through the PR so please excuse me if I missed some obvious details. Is the goal of the driver to provide load balancing for read-only databases? One of the important features of bolt+routing is ensuring that one can read their own writes which is solved with bookmarks and which would need changes in the neo4j gem, its API and all the way to client code.

joshjordan commented 5 years ago

@phyllisstein have you continued to use this in production?

@klobuczek do you see any obvious issues? It seems that without the service discovery in this PR, we are left to find our own workarounds for dynamically discovering the elected leader in causual clusters, no? In other words, is this PR strictly an improvement despite not implementing bookmarks?

phyllisstein commented 5 years ago

Hey! Forgive me for neglecting this PR (and @klobuczek's very thoughtful and generous feedback in #322). I have indeed been using this code in production since I first submitted it. That said, I use the word "production" extremely loosely in this context: the application I'm running sees very little traffic, so it's not impossible there are bugs lurking.

Reviewing the branch as it stands, I'd flag two areas of especial concern. The first is that the heuristics for creating and destroying sessions feel a little shady (there's this, e.g., and all'a this). If session handling is broken, your app might hang on to more open connections than it needs, hogging server memory and preventing Neo4j from gracefully accepting new connections. The second is that I plum quit before implementing causal chaining. This has already introduced one or two bugs into my own app, especially with background jobs. In cases (like file uploads) where the asynchronous code runs soon after graph CRUD code, it's not impossible (indeed, not uncommon) for the job's environment to connect to a read replica that hasn't received the replicated transaction yet.

The first issue, sadly, is a bit of a ¯\_(ツ)_/¯ at this point, but I'll try to find some time to put some thought into the second this weekend. If you wind up experimenting with this branch, I'd be curious to know how it goes! And if you have any questions about your specific use case, I'd be happy to talk through them over e-mail.

klobuczek commented 5 years ago

@phyllisstein would you be interested in contributing or at least reviewing https://github.com/neo4jrb/neo4j-ruby-driver? It is meant to be a fully compliant neo4j driver on which are going to base the neo4j/neo4j-core gem. So unfortunately this PR will not make it to master anymore, but your experience would be invaluable in building the new solution on top of the driver.

phyllisstein commented 5 years ago

I've glanced at the driver code once or twice. It definitely looks impressive, and I'm glad to see that it's going to obviate this (imperfect, buggy) work. I've been looking at migrating my app to another stack, but kinda reluctantly. I'd love to learn more about where the Ruby driver code stands, what the delta is between its current state and getting it integrated with the OGM, and whether there's anything I can do to contribute.

Do you have a Slack set up or something? If not, my e-mail's in my profile, and I'd love to talk more.

klobuczek commented 5 years ago

Hi @phyllisstein we are all here: https://gitter.im/neo4jrb/neo4j#