HA (High Availability) and online Backups

tsetso-spread commented 1 week ago

Hello! We are planning to use DozerDB in production and would like our deployment to be highly available (HA) and to support backups without stopping the database server. Since neither is on the implemented or planned feature list, we went through the open and closed issues and found https://github.com/DozerDB/dozerdb-core/issues/5, and more specifically:

For online backups, we will test this approach with our dataset size: https://github.com/DozerDB/dozerdb-core/issues/5#issuecomment-1960447237

For HA, could we use this how-to? https://github.com/DozerDB/dozerdb-core/issues/5#issuecomment-1949513918

jmsuhy commented 1 week ago

This does not answer your question directly, but I wanted to let you know that we are exploring adding ‘getRoutingTable’ and other load-balancing support functions to the DozerDB plugin.

This would enable users to manually define routing table information based on configurations from HAProxy, NGINX, or other load balancers, or simply by listing the individual instances that the drivers should balance load across.

By implementing this, DozerDB would support Neo4j drivers that rely on the getRoutingTable function for load balancing and routing, allowing seamless integration with existing driver functionality.

In the end, the getRoutingTable function returns JSON similar to:

    {
      "ttl": 300,
      "servers": [
        { "addresses": ["neo4j-primary:7687"], "role": "WRITE" },
        { "addresses": ["neo4j-read1:7687", "neo4j-read2:7687"], "role": "READ" },
        { "addresses": ["neo4j-router:7687"], "role": "ROUTE" }
      ]
    }
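To make the shape of that payload concrete, here is a minimal Python sketch that parses the example routing table and looks up addresses by role. The helper names `parse_routing_table` and `addresses_for` are hypothetical and not part of DozerDB or any Neo4j driver; the validation rules are assumptions based only on the example above.

```python
import json

# Example payload copied from the comment above.
ROUTING_TABLE_JSON = """
{
  "ttl": 300,
  "servers": [
    {"addresses": ["neo4j-primary:7687"], "role": "WRITE"},
    {"addresses": ["neo4j-read1:7687", "neo4j-read2:7687"], "role": "READ"},
    {"addresses": ["neo4j-router:7687"], "role": "ROUTE"}
  ]
}
"""

# Roles seen in the example; assumed to be the complete set.
VALID_ROLES = {"WRITE", "READ", "ROUTE"}


def parse_routing_table(text):
    """Parse and sanity-check a routing table; raise ValueError if malformed."""
    table = json.loads(text)
    if not isinstance(table.get("ttl"), int) or table["ttl"] <= 0:
        raise ValueError("ttl must be a positive integer")
    for entry in table.get("servers", []):
        if entry.get("role") not in VALID_ROLES:
            raise ValueError(f"unknown role: {entry.get('role')!r}")
        if not entry.get("addresses"):
            raise ValueError(f"no addresses for role {entry['role']}")
    return table


def addresses_for(table, role):
    """Return every address listed under the given role, in file order."""
    return [
        addr
        for entry in table["servers"]
        if entry["role"] == role
        for addr in entry["addresses"]
    ]


table = parse_routing_table(ROUTING_TABLE_JSON)
print(addresses_for(table, "READ"))
```

A check like this could run before a hand-edited table is published, so that a typo in a role name is caught early rather than surfacing as a driver error.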

tsetso-spread commented 1 week ago

Would this be built on top of Neo4j's clustering mechanism, or separately?

jmsuhy commented 1 week ago

Neo4j Enterprise supports clustering with high availability, but not sharding like Elasticsearch. Neo4j Community Edition, however, lacks clustering altogether.

In our approach, you would set up multiple Neo4j instances with DozerDB, then manually configure the routing table via a JSON configuration that can be updated live. This allows you to leverage the built-in routing functionality in Neo4j drivers.
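As a rough illustration of a JSON configuration that "can be updated live", the sketch below re-reads a routing table file whenever its modification time changes. This is only one way such reloading could work; the thread does not say how DozerDB will implement it, and the class name `RoutingTableFile` and the file layout are assumptions.

```python
import json
import os


class RoutingTableFile:
    """Serve a routing table from a JSON file, re-reading it when the file changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None   # modification time of the version currently cached
        self._table = None   # cached, parsed routing table

    def get(self):
        """Return the current table, reloading from disk if the file was modified."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as fh:
                self._table = json.load(fh)
            # Only record the new mtime after a successful parse, so a
            # half-written file is retried on the next call.
            self._mtime = mtime
        return self._table
```

Calling `get()` on each routing request keeps the served table in step with the file without restarting the process. Writing the new file to a temporary path and renaming it over the old one would avoid readers seeing a partially written file.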

Unlike Enterprise Edition, where routing tables are automatically managed as the cluster grows, this approach requires you to manually update the routing table whenever a new instance is added.
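The manual update step could look something like the following Python sketch, which adds a new instance to a routing table of the shape shown earlier in the thread. The function `add_instance` and the address `neo4j-read3:7687` are hypothetical, and how the updated table is then handed to DozerDB is not specified here.

```python
import copy


def add_instance(table, role, address):
    """Return a copy of the routing table with `address` added under `role`."""
    updated = copy.deepcopy(table)  # leave the caller's table untouched
    for entry in updated["servers"]:
        if entry["role"] == role:
            if address not in entry["addresses"]:
                entry["addresses"].append(address)
            return updated
    # No entry for this role yet: create one.
    updated["servers"].append({"addresses": [address], "role": role})
    return updated


current = {
    "ttl": 300,
    "servers": [
        {"addresses": ["neo4j-primary:7687"], "role": "WRITE"},
        {"addresses": ["neo4j-read1:7687", "neo4j-read2:7687"], "role": "READ"},
        {"addresses": ["neo4j-router:7687"], "role": "ROUTE"},
    ],
}

# Hypothetical third read replica being brought into rotation.
updated = add_instance(current, "READ", "neo4j-read3:7687")
print(updated["servers"][1]["addresses"])
```

Returning a copy rather than mutating in place means the old table stays valid until the new one has been written out in full.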

HA (High Availability) and online Backups #29