Closed yew11 closed 4 years ago
Leader-based replication is great for read-heavy applications.
If an application reads from an asynchronous follower, it may see outdated data. If we wait a while, all the followers will eventually hold the same data; this is eventual consistency.
But if the system is operating near capacity, or if there is a problem in the network, the replication lag can easily increase.
In this case, we need read-after-write consistency, also called read-your-writes consistency: it only guarantees that the current user sees their own writes; it makes no promise about other users' writes. How do we implement it?
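One simple way to provide read-after-write consistency is to route a user's reads to the leader for a short window after that user's own write, since their write may not have reached the followers yet. A minimal sketch, assuming an in-memory map of last-write times and a hypothetical one-minute threshold:

```python
import time

RECENT_WRITE_WINDOW = 60  # seconds; hypothetical threshold, not from the source


class Router:
    """Route a user's reads to the leader shortly after their own write,
    so the user always sees their own updates (read-after-write)."""

    def __init__(self):
        self.last_write_at = {}  # user_id -> timestamp of last write

    def record_write(self, user_id):
        self.last_write_at[user_id] = time.time()

    def pick_replica(self, user_id, leader, followers):
        # If this user wrote recently, read from the leader; otherwise any
        # follower is fine (here simply the first one, for illustration).
        if time.time() - self.last_write_at.get(user_id, 0) < RECENT_WRITE_WINDOW:
            return leader
        return followers[0]
```

This only tracks writes seen by one app server; a multi-server setup would need the last-write timestamp passed along with the client's requests.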
But what if the user has multiple devices? Then there are additional issues to consider.
When reading from asynchronous followers, a user can see things moving backward in time. This happens when a user makes multiple reads from different replicas (say two).
If the first read returns something that has just been written, but the second read happens to hit a replica that hasn't received that write yet, it can be very confusing for the user.
Monotonic reads guarantee this does not happen. It's a weaker guarantee than strong consistency, but a stronger guarantee than eventual consistency. One way to achieve it is to make sure that each user always makes their reads from the same replica, which could be chosen based on a hash of the user ID.
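The same-replica idea can be sketched with a stable hash of the user ID; the hash choice and replica list here are illustrative:

```python
import hashlib


def replica_for(user_id, replicas):
    """Map the same user to the same replica every time, so their reads
    never jump to a less up-to-date copy (monotonic reads)."""
    # hashlib gives a stable hash across processes, unlike Python's hash().
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return replicas[int(digest, 16) % len(replicas)]
```

If that replica fails, the user's traffic has to be rerouted, at which point the guarantee can momentarily break.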
Consistent prefix reads: if a sequence of writes happens in a certain order, then anyone reading those writes will see them appear in the same order.
This problem arises in partitioned databases. In many distributed dbs, different partitions operate independently, so there is no global ordering of writes. One fix is to make sure causally related writes go to the same partition, but that is not always efficient.
Multi-leader replication (also called master-master or active/active replication)
Have a database with replicas in several different datacenters, with one leader per datacenter.
Performance: In a single-leader configuration, every write must travel to the one leader, which may add latency to writes. In a multi-leader configuration, every write can be processed in the local datacenter and is replicated asynchronously to the other datacenters, so the inter-datacenter network delay is hidden from users.
Tolerance of datacenter outages: Single-leader: if the leader fails, failover can promote a follower in another datacenter to be the new leader. Multi-leader: each datacenter can continue operating independently of the others.
Downside of multi-leader: the same data may be concurrently modified in two different datacenters, and those write conflicts must be resolved.
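One common (if lossy) way to resolve such write conflicts is last write wins: keep the version with the highest timestamp and silently discard the rest. A minimal sketch, assuming conflicting versions arrive as (timestamp, value) pairs:

```python
def last_write_wins(versions):
    """Resolve a multi-leader write conflict by keeping the version with
    the highest timestamp; every other concurrent write is discarded.

    versions: list of (timestamp, value) pairs from different leaders.
    """
    return max(versions, key=lambda v: v[0])[1]
```

The simplicity is the appeal; the danger is that discarded writes are lost without any error being reported, and clock skew between datacenters can pick the "wrong" winner.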
Replication
Replication means keeping a copy of the same data on multiple machines that are connected via a network.
The difficulty in replication is handling changes of replicated data.
Three algorithms for replicating changes b/w nodes; almost all distributed databases use one of these three: single-leader, multi-leader, and leaderless replication.
Leaders and Followers
With multiple replicas, how do we ensure that all the data ends up on all the replicas?
The most common approach: leader-based replication. One replica is designated the leader; clients send writes to the leader, which sends the data change to all of its followers.
Synchronous vs. asynchronous replication: in the example, follower 1 is synchronous, while follower 2 is asynchronous.
About synchronous replication: the follower is guaranteed to have an up-to-date copy, but if it does not respond (crash, network fault), the write cannot complete and the leader must block until it is available again.
For that reason, it is impractical for all followers to be synchronous: any one outage would cause the whole system to grind to a halt.
In practice, if you enable synchronous replication on a database, it usually means that one of the followers is synchronous, and the others are asynchronous. If the synchronous follower becomes unavailable or slow, one of the asynchronous followers is made synchronous. This guarantees that you have an up-to-date copy of the data on at least two nodes: the leader and one synchronous follower. This is called semi-synchronous.
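A semi-synchronous commit can be sketched as: wait for the one synchronous follower's acknowledgment before confirming the write, and ship to the remaining followers without waiting. This toy version models followers as callables and is purely illustrative:

```python
def commit_write(write, followers):
    """Semi-synchronous commit sketch.

    followers: list of callables that apply the write and return True on
    acknowledgment. The first one is the synchronous follower.
    """
    sync_follower, async_followers = followers[0], followers[1:]
    # Block until the synchronous follower confirms; only then is the
    # write considered committed (data exists on at least two nodes).
    if not sync_follower(write):
        raise RuntimeError("synchronous follower did not acknowledge")
    for f in async_followers:
        try:
            f(write)  # in a real system this would be non-blocking
        except Exception:
            pass      # async followers may lag or fail without blocking us
    return "committed"
```

A real implementation would also handle promoting an async follower to synchronous when the synchronous one is slow, as the text describes.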
Often, leader-based replication is configured to be completely asynchronous. This means a write is not guaranteed to be durable: if the leader fails, any writes not yet replicated to followers are lost. The advantage is that the leader can continue processing writes even if all of its followers have fallen behind.
Setting up new followers
Simply copying data files from one node to another is not sufficient: clients are constantly writing to the db, so the data is always in flux. So the process is: take a consistent snapshot of the leader's database; copy the snapshot to the new follower; have the follower request all data changes since the snapshot was taken; once it has processed that backlog, it has caught up.
Handling Node outages
Goal: keep the system as a whole running despite individual node failures, and keep the impact of a node outage as small as possible.
Follower failure: catch-up recovery. On its local disk, each follower keeps a log of the data changes it has received from the leader. If it crashes, the follower can recover easily from this log.
When the follower comes back online, it requests from the leader all the data changes that occurred during its downtime.
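Catch-up recovery can be sketched as the follower replaying every change after the last log offset it processed; the dict-based state and log format here are assumptions for illustration:

```python
def catch_up(follower_state, leader_log, last_offset):
    """Apply every change the follower missed while it was down.

    follower_state: dict the follower materializes data into.
    leader_log: ordered list of change dicts, indexed by offset.
    last_offset: the last log position this follower had processed.
    Returns the new offset (the end of the leader's log).
    """
    for offset, change in enumerate(leader_log):
        if offset > last_offset:
            follower_state.update(change)  # apply each missed change in order
    return len(leader_log) - 1
```

Real systems identify the resume point with a log sequence number rather than a list index, but the principle is the same.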
Leader failure: failover. One of the followers needs to be promoted to be the new leader: clients need to send their writes to the new leader, and the other followers need to start consuming data changes from the new leader. This process is called failover.
Failover is fraught with things that can go wrong: the new leader may not have received all of the old leader's writes; two nodes may both believe they are the leader (split brain); and the failover timeout is hard to tune: too long means slow recovery, too short means unnecessary failovers.
Replication logs implementation
Statement-based replication
The leader logs every write statement that it executes (every INSERT, UPDATE, DELETE) and sends that statement log to its followers.
Problems: any statement that calls a nondeterministic function, such as NOW() or RAND(), is likely to generate a different value on each replica. It is possible for the leader to replace nondeterministic functions with a fixed value before shipping the statement, but there are many edge cases to consider.
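The leader-side fix mentioned above, replacing a nondeterministic function with a concrete value before shipping the statement, might look like this toy string rewrite (real databases operate on parsed statements, not raw strings):

```python
import random


def rewrite(statement):
    """Replace RAND() with the value the leader actually computed, so
    followers replaying the statement produce identical data.

    Toy sketch: a real implementation works on the parsed statement and
    must handle NOW(), sequences, and many other nondeterministic cases.
    """
    if "RAND()" in statement:
        return statement.replace("RAND()", repr(random.random()))
    return statement
```

After rewriting, replaying the statement is deterministic; without it, each replica would roll its own random value.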
Write-ahead log (WAL) shipping: the leader's log is an append-only sequence of bytes containing all writes to the db. The leader ships this log to its followers; when a follower processes the log, it builds a copy of the exact same data structures as found on the leader.
Problem: the log describes the data at a very low level: a WAL contains details of which bytes were changed in which disk blocks. This couples the replicas closely to the storage engine, and replication breaks if the database changes its storage format.
Logical (row-based) log replication
use different log formats for replication and for the storage engine, which allows the replication log to be decoupled from the storage engine internals.
A logical log for a relational db is usually a sequence of records at the granularity of a row.
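A row-granularity logical log entry might look like the records below; the field names and the dict-of-rows store are illustrative, not any particular database's format:

```python
# Row-level logical log entries are independent of on-disk layout, so a
# follower can apply them regardless of its storage engine or version.
INSERT_ENTRY = {"op": "insert", "table": "users",
                "row": {"id": 42, "name": "Ada"}}
UPDATE_ENTRY = {"op": "update", "table": "users",
                "key": {"id": 42}, "changes": {"name": "Ada Lovelace"}}
DELETE_ENTRY = {"op": "delete", "table": "users", "key": {"id": 42}}


def apply_entry(store, entry):
    """Apply one logical log entry to a simple dict-of-rows store
    (store: table name -> {primary key -> row dict})."""
    table = store.setdefault(entry["table"], {})
    if entry["op"] == "insert":
        table[entry["row"]["id"]] = dict(entry["row"])
    elif entry["op"] == "update":
        table[entry["key"]["id"]].update(entry["changes"])
    elif entry["op"] == "delete":
        del table[entry["key"]["id"]]
```

Because the entries are self-describing rows rather than byte ranges, the same log is also easy to feed to external systems such as search indexes or caches.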
Trigger-based replication: a trigger lets you register custom application code that is automatically executed when a data change occurs in a db system, so replication logic can live in triggers. But this has greater overhead than built-in replication and is more prone to bugs.