memgraph / documentation

The official documentation for Memgraph open-source graph database.
https://memgraph.com/docs
MIT License

multi-tenancy in memgraph #927

Open ramesh0279 opened 4 months ago

ramesh0279 commented 4 months ago

We currently use Neo4j, and this is how we achieve horizontal scalability for our multi-tenant cluster:

  1. We create a database per tenant.
  2. We have 2 sets of 3-node clusters. Each 3-node cluster supports 100 tenants.
  3. With the 2 sets, we support 200 tenants.
  4. The two 3-node clusters talk to each other and support implicit routing.
  5. Our Java clients talk to only one host address, and Neo4j takes care of the routing across the 2 sets of nodes.

Is this supported by Memgraph? I could not find this in the documentation.

hal-eisen-MG commented 4 months ago

@ramesh0279 These two sets of 3 node clusters, are they separated into two different data centers?

ramesh0279 commented 4 months ago

No, they are in the same data center. The reason for having two clusters is to make the setup more horizontally scalable.

katarinasupe commented 4 months ago

Hi @ramesh0279, thank you for asking the question!

Based on what you described, I concluded that you need:

  1. Replication supported in a multi-tenant environment
  2. Load balancing between clusters

Please correct me if I am wrong.

Here's what I can say about that:

  1. The community version of replication does not support multi-tenant data replication. Still, we put some work into that and created an experimental replication feature, which supports multi-tenant data replication and requires an Enterprise license. If you require HA, check our Enterprise HA feature.

  2. Can you explain a bit more about your expectations here? I suppose this is currently not supported in Memgraph, but I would like to understand your need for such a setup a bit more so I can make the best suggestions on what to use in Memgraph. Are you deploying this on Kubernetes or somewhere else?

I would recommend you join our Discord server for more questions 😄 Also, to discuss the second point and how to tweak Memgraph based on your specific use case, please book an office hours call (it will be easier to go over this via call).

ramesh0279 commented 4 months ago

We are deploying on Kubernetes. We have modeled each tenant as a separate database. We initially had one cluster with 3 nodes (1 leader and 2 replicas). We started seeing performance issues when we scaled up to more than 100 tenants. To solve this, we created another 3-node cluster (1 leader + 2 replicas) to handle newer tenants.

The clients connect to only one address, and Neo4j handles the implicit routing to cluster 1 or cluster 2 depending on which tenant database you are trying to connect to.
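
To make the routing requirement concrete, here is a minimal Python sketch of a tenant-to-cluster lookup done on the client side. It is not how Neo4j implements its routing and not a Memgraph feature. The class name `TenantRouter`, the `bolt://cluster-N:7687` addresses, and the fill-the-first-cluster placement rule are all assumptions made for illustration. Only the 100-tenants-per-cluster limit comes from the setup described above.

```python
from dataclasses import dataclass, field


@dataclass
class TenantRouter:
    """Hypothetical helper: remembers which cluster hosts each tenant database."""
    cluster_uris: list              # one address per 3-node cluster (assumed URIs)
    tenants_per_cluster: int = 100  # capacity taken from the setup described above
    _placement: dict = field(default_factory=dict, init=False)

    def register(self, tenant: str) -> str:
        """Place a new tenant on the first cluster that still has capacity."""
        if tenant in self._placement:
            return self._placement[tenant]
        for uri in self.cluster_uris:
            used = sum(1 for u in self._placement.values() if u == uri)
            if used < self.tenants_per_cluster:
                self._placement[tenant] = uri
                return uri
        raise RuntimeError("all clusters are full; add another cluster")

    def resolve(self, tenant: str) -> str:
        """Return the cluster address a client should use for this tenant."""
        return self._placement[tenant]


router = TenantRouter(["bolt://cluster-1:7687", "bolt://cluster-2:7687"])
for i in range(150):
    router.register(f"tenant_{i}")

print(router.resolve("tenant_42"))   # bolt://cluster-1:7687
print(router.resolve("tenant_120"))  # bolt://cluster-2:7687
```

With server-side implicit routing, this lookup is hidden behind the single address the clients connect to. Without it, something like the table above would have to live in the application or in a proxy in front of the clusters.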

Hope I am making sense.

One more question: if we specify ASYNC on a WRITE transaction to improve performance, is there a concept similar to bookmarks, where a subsequent read is assured of seeing the latest written data?
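
For readers unfamiliar with the term, the guarantee being asked about can be sketched with a toy in-memory model. This is only an illustration of the read-your-own-writes idea behind bookmarks, assuming a leader that replicates asynchronously. The classes `Leader`, `Replica`, and `StaleReplicaError` are hypothetical and do not correspond to any Neo4j or Memgraph API.

```python
class StaleReplicaError(Exception):
    """Raised when a replica has not yet applied the bookmarked transaction."""


class Replica:
    def __init__(self):
        self.applied_tx = 0   # id of the last transaction applied here
        self.data = {}

    def apply(self, tx_id, key, value):
        self.data[key] = value
        self.applied_tx = tx_id

    def read(self, key, bookmark=0):
        # A bookmarked read is only served once the replica has caught up.
        if self.applied_tx < bookmark:
            raise StaleReplicaError(f"replica at tx {self.applied_tx}, need {bookmark}")
        return self.data.get(key)


class Leader:
    def __init__(self, replicas):
        self.replicas = replicas
        self.last_tx = 0
        self.pending = []     # writes not yet shipped to replicas (async mode)

    def write(self, key, value):
        self.last_tx += 1
        self.pending.append((self.last_tx, key, value))
        return self.last_tx   # the "bookmark" handed back to the client

    def replicate(self):
        # Simulates the asynchronous replication step catching up.
        for tx in self.pending:
            for replica in self.replicas:
                replica.apply(*tx)
        self.pending.clear()


replica = Replica()
leader = Leader([replica])
bookmark = leader.write("name", "alice")

print(replica.read("name"))                     # None: stale read without a bookmark
leader.replicate()
print(replica.read("name", bookmark=bookmark))  # alice
```

In this model, a read that carries the bookmark is rejected until the replica has applied the write. A real system could instead wait or redirect the read to the leader.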

Thank you.