A second concept has appeared! This time, it's a bit more simplified: we could go with a load-balancing route where one or two servers receive requests and forward them to the servers in charge of the data. The load balancer will also be in charge of identifying, through a hash, which server holds a given piece of data.
The load balancer's purpose is to distribute the requests evenly between all the nodes, whether it be a simple GET request or a more complicated aggregated filter request.
To further lower the load on the servers, we could also cache the responses, but I doubt that would have much effect since the items are already cached inside each server. The process of retrieving a request is simply: Load Balancer -> Node -> Internal Cache (if it exists), else -> Read from file.
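As a rough sketch of that read path (the Node class, cache map, and readFromFile method here are hypothetical placeholders, not anything from the actual codebase):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the read path on a single node:
// Load Balancer -> Node -> internal cache (if present) -> else read from file.
public class Node {

    private final Map<String, String> internalCache = new ConcurrentHashMap<>();

    public String get(String key) {
        // Serve from the internal cache when the item is already loaded.
        String cached = internalCache.get(key);
        if (cached != null) {
            return cached;
        }

        // Otherwise fall back to the file on disk and cache the result.
        String value = readFromFile(key);
        if (value != null) {
            internalCache.put(key, value);
        }
        return value;
    }

    private String readFromFile(String key) {
        // Placeholder for the node's actual file-based storage.
        return null;
    }
}
```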
To help identify which server is responsible for which data, we could assign an identifier, or rather a hash, to each of them. This itself has problems, which are written down below, but to simplify: think of the hash space as the numbers 0-127 divided among x servers.
For example, say we have 10 nodes and the hash space is 128 values. Each node is assigned a slice of that space, roughly 128 / 10 ≈ 12 values each: node 1 is responsible for any data whose hash falls within 0-11, node 2 for 12-23, and so forth, with the last node taking the remainder up to 127.
A request for an item whose key hashes into 0-11 will head to node 1, and so forth.
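A minimal sketch of that range assignment, assuming a 0-127 hash space and 0-based node indices; none of these class or method names come from the actual project:

```java
public class HashRouter {

    private static final int HASH_SPACE = 128; // hash values range from 0 to 127
    private final int nodeCount;

    public HashRouter(int nodeCount) {
        this.nodeCount = nodeCount;
    }

    // Maps an item key onto the node responsible for its slice of the hash space.
    // With 10 nodes, each slice is 128 / 10 = 12 values wide: node 0 owns 0-11,
    // node 1 owns 12-23, and the last node absorbs the remainder up to 127.
    public int nodeFor(String key) {
        int hash = Math.floorMod(key.hashCode(), HASH_SPACE);
        int sliceSize = HASH_SPACE / nodeCount;
        return Math.min(hash / sliceSize, nodeCount - 1);
    }

    public static void main(String[] args) {
        HashRouter router = new HashRouter(10);
        System.out.println("item 'users/123' -> node " + router.nodeFor("users/123"));
    }
}
```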
BUT WAIT, how are you going to do aggregation here?
Collection Partitioning
This is where collection partitioning comes in: instead of distributing individual items among shards, we hash the collection key, and the same routing concept applies to whole collections.
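The only real change from the per-item routing sketched above is what gets hashed; a hypothetical variant reusing the HashRouter sketch from earlier:

```java
// Hypothetical collection-level routing: hash the collection name instead of the
// individual item key, so an entire collection lives on one node and
// collection-wide aggregations stay local to that node.
public class CollectionRouter {

    private final HashRouter router; // the range router sketched earlier

    public CollectionRouter(HashRouter router) {
        this.router = router;
    }

    public int nodeFor(String collection, String itemKey) {
        // itemKey is ignored on purpose: every item of the same collection
        // routes to the same node.
        return router.nodeFor(collection);
    }
}
```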
BUT WAIT AGAIN, how will you do a database-wide aggregation or a database-wide filter?
If the host insists on item-wide distribution, then we will respect their choice and run the application in that configuration, with collection-wide aggregations handled by the load balancer: it asks every node for its data and compiles everything into a single response, which is sent back to the client once all nodes have responded. The same goes for database-wide operations under collection-wide distribution.
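A rough sketch of that scatter-gather step on the load balancer, assuming each node exposes some query-style call over the wire (every name here is a hypothetical placeholder):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical scatter-gather on the load balancer: ask every node for its part
// of the data, wait for all of them to respond, then compile a single response.
public class AggregationCoordinator {

    public interface NodeClient {
        // Placeholder for whatever request the nodes actually understand.
        CompletableFuture<List<String>> query(String filter);
    }

    private final List<NodeClient> nodes;

    public AggregationCoordinator(List<NodeClient> nodes) {
        this.nodes = nodes;
    }

    public CompletableFuture<List<String>> aggregate(String filter) {
        // Fan the request out to every node in parallel.
        List<CompletableFuture<List<String>>> futures = new ArrayList<>();
        for (NodeClient node : nodes) {
            futures.add(node.query(filter));
        }

        // Only reply to the client once every node has responded.
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    List<String> combined = new ArrayList<>();
                    for (CompletableFuture<List<String>> future : futures) {
                        combined.addAll(future.join());
                    }
                    return combined;
                });
    }
}
```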
Yes, this is also a concern of mine, but if a node (which is basically a separate server) goes down, then all the data inside that node WILL BE UNAVAILABLE. The entire concept of load balancing is to balance the load between all the servers, and having a primary server take in all the data and redistribute it whenever a node goes down doesn't seem like a proper load-balancing concept to me. Feel free to leave your opinions on this, but I am going to go with this route: you must get the node back up, or the data over there will definitely be unavailable.
This is the end of the concept for now; it will be improved on over time.
WARNING
This is a concept from someone who only started taking programming seriously a year or two ago. I have been writing code since early on, but mainly for fun. I want to learn, which is why I am leaving this concept here for others to improve on and for myself to learn from. Thank you.
Idea
Here is my general idea for the future, which is to support load balancing with the database. As I am still very immature at Java, feel free to give your suggestions and opinions that could improve this (preferably with a code demonstration, but that is purely optional).
Specification
The idea I have come up with is a simple concept on paper: nodes that would act as load balancers for the main process.
The nodes (or children) would act as additional gateways/endpoints for requests to enter and be received, forwarding all requests to the main server, which then decides which node will save the data; in the meantime, the data is broadcast to all nodes, which update their caches immediately.
This way, the data is balanced between several node servers, which, from my shower thoughts, should make storage space a minor problem, since the data itself is stored on different nodes while every process already has it cached.
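A loose sketch of that write path on the main process; the NodeConnection interface and its methods are assumptions for illustration, not the project's real API:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical write path on the main process: decide which node persists the
// item, then broadcast it to every node so they can update their caches.
public class MainProcess {

    public interface NodeConnection {
        void store(String key, String value);        // persist on this node's disk
        void updateCache(String key, String value);  // refresh this node's cache
    }

    private final List<NodeConnection> nodes;
    private final Map<String, Integer> assignments = new ConcurrentHashMap<>();

    public MainProcess(List<NodeConnection> nodes) {
        this.nodes = nodes;
    }

    public void save(String key, String value) {
        // Decide which node is responsible for persisting this item.
        int owner = Math.floorMod(key.hashCode(), nodes.size());
        assignments.put(key, owner);
        nodes.get(owner).store(key, value);

        // In the meantime, broadcast the data so every node updates its cache.
        for (NodeConnection node : nodes) {
            node.updateCache(key, value);
        }
    }
}
```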
Now, the concept may sound a bit weird, and there are always several issues with it. For example, how would all the nodes and the main process fill their caches if the data is scattered everywhere? This is where another option comes in: main storage, a concept of mine where the main process, during shutdown, temporarily saves every single piece of data to its own storage.
In general, though, all nodes should report the data they have on their storage to the main process immediately on boot-up, and the main process then distributes it to all the other nodes to store in their caches. As for collisions, the hash of (identifier + collection + database name) is assigned to a node, and this node assignment is stored on all nodes for later use.
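A small sketch of that assignment, assuming the hash is simply computed over the concatenated names (the exact hash function and storage layout are open details):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical node assignment based on hash(identifier + collection + database name).
// Every node keeps a copy of this map so the assignment survives the main process.
public class AssignmentTable {

    private final Map<Integer, Integer> hashToNode = new ConcurrentHashMap<>();
    private final int nodeCount;

    public AssignmentTable(int nodeCount) {
        this.nodeCount = nodeCount;
    }

    public int assign(String identifier, String collection, String database) {
        int hash = (identifier + collection + database).hashCode();
        int node = Math.floorMod(hash, nodeCount);
        hashToNode.put(hash, node);
        return node;
    }
}
```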
What if the main process disconnects?
This is where the hash storage comes in. When the main process disconnects without notice, all the nodes immediately elect a temporary master node, which takes the place of the original master until it comes back online. This temporary master asks for the hash storage of every node and checks for conflicts; if there are any, it picks the version shared by the highest number of nodes and redistributes it.
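One way to read the "pick whichever version the most nodes agree on" rule, as a hypothetical sketch:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical conflict resolution for the temporary master: collect the hash
// storage version reported by every node and keep the one shared by the most nodes.
public class ConflictResolver {

    public static String pickMajorityVersion(List<String> reportedVersions) {
        Map<String, Integer> votes = new HashMap<>();
        for (String version : reportedVersions) {
            votes.merge(version, 1, Integer::sum);
        }

        // The version with the highest number of agreeing nodes wins and is
        // then redistributed to every node.
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}
```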
After that, the temporary master proceeds with the tasks of the original master, since all nodes have the same cache inside of them (this will also be verified by comparing the combined data values and the number of identifiers).
Once the master process comes back online, the temporary master immediately performs a handover: a process that brings the original master up to date by sending it all of the temporary master's data.
Protocol
All nodes connect through a separate WebSocket endpoint, ws://127.0.0.1:5563/node, which has its own separate functions. Every node also has to identify itself when connecting by sending an Authorization header (Authorization: Node [TOKEN]), which the server checks against its own configuration to verify that the node is actually registered on the list (to prevent hijacking).
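A minimal sketch of that registration check on the server side; the header parsing and the set of registered tokens are assumptions, not the project's actual implementation:

```java
import java.util.Set;

// Hypothetical check performed when a node connects to ws://127.0.0.1:5563/node:
// the Authorization header must carry a token that is registered in the server's
// own configuration, otherwise the connection is rejected (to prevent hijacking).
public class NodeAuthenticator {

    private final Set<String> registeredNodeTokens;

    public NodeAuthenticator(Set<String> registeredNodeTokens) {
        this.registeredNodeTokens = registeredNodeTokens;
    }

    public boolean isAuthorized(String authorizationHeader) {
        if (authorizationHeader == null || !authorizationHeader.startsWith("Node ")) {
            return false;
        }
        String token = authorizationHeader.substring("Node ".length());
        return registeredNodeTokens.contains(token);
    }
}
```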
Data Assignment to Node
More details will be added, and this concept will slowly be improved over time. Please note that THIS IS STILL A CONCEPT IDEA AND HAS PLENTY OF FLAWS.