Open spog opened 7 years ago
Geo-replication where all locations are active and serving data can be achieved by spreading out the nodes of each individual cluster. So if a cluster has 3 nodes, place each in a separate location. This will of course mean writes have higher latency. But if you send queries to the node closest to the user, reads will generally be quick (unless the safe flag is used, in which case the node has to verify with other nodes).
What you have in mind is another layer on top, let's say locations. So we would have: nodes -> clusters -> locations
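To make the write-latency cost concrete, here is a rough sketch, assuming Raft-style majority commit where the leader waits for acknowledgements from enough followers to form a majority. The function name and the round-trip times are made up for illustration and are not measurements of ActorDB.

```python
def quorum_write_latency(follower_rtts_ms, cluster_size):
    """Time until the leader holds acks from a majority of the cluster.

    follower_rtts_ms: hypothetical round-trip times from the leader to each follower.
    The leader counts as one vote, so it needs (majority - 1) follower acks.
    """
    if len(follower_rtts_ms) != cluster_size - 1:
        raise ValueError("expected one RTT per follower")
    majority = cluster_size // 2 + 1
    acks_needed = majority - 1
    if acks_needed == 0:
        return 0.0
    # The commit completes when the acks_needed-th fastest follower replies.
    return sorted(follower_rtts_ms)[acks_needed - 1]


# Illustrative numbers only: all nodes in one DC vs. one node per location.
same_dc = quorum_write_latency([0.5, 0.6], cluster_size=3)
geo_spread = quorum_write_latency([35.0, 90.0], cluster_size=3)
print(same_dc, geo_spread)
```

With 3 nodes only the nearest remote location sits on the write path, so placing two of the locations close to each other keeps the penalty down.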
Some clusters would be active in one location and other locations would just be backups for that cluster. We were thinking about something similar.
The main question is what happens when a location goes offline. Making it work seamlessly is quite a difficult task. It also requires clients to have location reconnect logic in them. Writes must also replicate across at least two locations (hence the latency).
Eventually consistent locations mean that when one goes offline and you get some writes to a new location, you may have just thrown away some writes from the offline location. You told the client the write was complete, but if it had not managed to replicate to another location, it may still disappear.
You either sacrifice latency or you sacrifice consistency. I guess the best option would be to allow both scenarios and leave it up to the user.
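The trade-off can be shown with a toy model. This is only a sketch with hypothetical names (`Location`, `write`, `replicate_pending`, `lost_on_failover`); it models two locations as in-memory logs and does not reflect how ActorDB replicates.

```python
class Location:
    """A location holding an ordered log of acknowledged writes."""

    def __init__(self, name):
        self.name = name
        self.log = []


def write(primary, backup, value, synchronous):
    """Append to the primary; copy to the backup before acking only in sync mode."""
    primary.log.append(value)
    if synchronous:
        backup.log.append(value)  # pays cross-location latency on every write
    return "ack"  # the client is told the write is complete


def replicate_pending(primary, backup):
    """Background catch-up used in the eventually consistent mode."""
    backup.log.extend(primary.log[len(backup.log):])


def lost_on_failover(primary, backup):
    """Acknowledged writes that exist only on the primary if it goes offline now."""
    return primary.log[len(backup.log):]


# Eventually consistent: w2 is acknowledged but not yet replicated.
a, b = Location("A"), Location("B")
write(a, b, "w1", synchronous=False)
replicate_pending(a, b)
write(a, b, "w2", synchronous=False)
print(lost_on_failover(a, b))

# Synchronous: nothing acknowledged can be lost, at the cost of latency.
c, d = Location("C"), Location("D")
write(c, d, "w1", synchronous=True)
write(c, d, "w2", synchronous=True)
print(lost_on_failover(c, d))
```

Exposing the `synchronous` choice per write or per cluster would be one way to leave the decision up to the user.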
Thank you very much for your response.
Would it help to add internal location information:
A bold suggestion for location info:
And regarding location consistency, I agree that this requirement is highly application-dependent.
thanks again, Samo
One relevant point for this discussion is that if you have a 3-node cluster and all your client connections are to one node, replication leaders for actors will mostly live on that node. Since we use Raft, any node can be either a leader (executes SQL, reads and writes) or a follower (passively receives data from the leader on writes).
... so nodes would not be naturally load balanced?
The other option would mean 2/3 of queries would be proxied by the server you are talking to, which increases latency and gives a bigger chance of something failing.
To support cross-DC replication we need to extend the stripe of clusters with parity. Extra network latency is not a big concern: DCs can have good uplinks, and writes may not be very intensive or massive. There are other solutions for real-time data processing.
I like the idea of DC redundancy, and here is my point of view. As I see it, in order to achieve true DC redundancy and failover, ActorDB needs to support 1 thing:
With this we can change the setup config on the fly: adding and removing clusters and their nodes, and adjusting the redundancy level from a dangerous plain stripe up to any level N, achieving better read speed with the required level of safety.
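The 2/3 figure follows from simple counting. The sketch below assumes a 3-node cluster with hypothetical node names, a client connected only to `n1`, and compares leaders spread evenly over the nodes with leaders that all live on the connected node. The actor-to-leader maps are invented for illustration.

```python
from fractions import Fraction


def proxied_fraction(leader_of_actor, connected_node):
    """Share of actors whose leader is NOT the node the client talks to.

    A query for such an actor has to be forwarded to another node.
    """
    proxied = sum(1 for leader in leader_of_actor.values() if leader != connected_node)
    return Fraction(proxied, len(leader_of_actor))


nodes = ["n1", "n2", "n3"]  # hypothetical node names

# Leaders spread evenly across the cluster (load balanced).
balanced = {f"actor{i}": nodes[i % len(nodes)] for i in range(300)}
# Leaders follow the client connection and all end up on n1.
sticky = {f"actor{i}": "n1" for i in range(300)}

print(proxied_fraction(balanced, "n1"))
print(proxied_fraction(sticky, "n1"))
```

In general an evenly balanced cluster of N nodes forwards (N-1)/N of the queries arriving at any single node, so the proxying overhead grows with cluster size.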
Thank you for the information. We will keep it in mind for future versions.
While exploring your impressive project, I dared to think about a potential geo-replication scenario in ActorDB.
The basic idea is to have all data available at all locations:
I thought such a geo-scenario might work for ActorDB, since a particular actor (e.g. a mobile user) typically operates from one location at a time. However, groups of mobile users might be active at different locations simultaneously.
I am more of a system-level programmer and I do not have a lot of knowledge about databases, so please excuse me if you find this post irrelevant.