Optimized Binlog Replica Applier with Transaction Batching
The binlog replica applier reads and applies updates from the binary log (binlog) of a primary MySQL server to a replica. This PR introduces an optimization to batch multiple primary transactions into a single replica transaction whenever possible, aiming to enhance performance and reliability.
Benefits
Reduced Overhead: Batching transactions reduces the overhead of frequent commits, improving replication efficiency. It also lets the delta buffer accumulate more changes per commit and thus take better advantage of columnarization.
Stronger ACID Guarantee: By aligning the replica transaction boundary with the primary transaction boundary, the system provides a stronger ACID guarantee. It ensures that clients never observe intermediate states of a primary transaction, enhancing robustness against replica crashes and restarts. This is a significant advantage over non-transactional systems like ClickHouse.
Transaction Batching Process
Transaction Start: The applier detects a new primary transaction from the binlog. For GTID-based replication, this is typically signaled by a GTID event followed by a BEGIN query event.
Batch Extension: Instead of committing each primary transaction individually, the applier attempts to batch them using the extendOrCommitBatchTxn function. As long as the current replica transaction can be safely extended (e.g., the changes from a new primary transaction are pure ROW-format data changes), the applier adds those changes to the batch.
Batch Commit: When the batch reaches a boundary (e.g., a DDL statement) or it is no longer optimal to extend it, the batch is committed. This marks the end of a replica transaction that encapsulates multiple primary transactions.
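The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the batcher type, its fields, and the eventKind values are hypothetical names, and applying a schema change in its own replica transaction is an assumption made only to keep the example small.

```go
package main

import "fmt"

// eventKind is a hypothetical classification of primary transactions.
type eventKind int

const (
	rowChange    eventKind = iota // pure ROW-format data change
	schemaChange                  // DDL statement that changes the schema
)

// primaryTxn is a hypothetical stand-in for one primary transaction.
type primaryTxn struct {
	kind eventKind
}

// batcher groups primary transactions into replica transactions.
type batcher struct {
	ongoing bool  // true while a replica transaction is open
	pending int   // primary transactions in the open batch
	commits []int // size of each committed batch, kept for illustration
}

// commit closes the open replica transaction, if there is one.
func (b *batcher) commit() {
	if !b.ongoing {
		return
	}
	b.commits = append(b.commits, b.pending)
	b.ongoing, b.pending = false, 0
}

// apply extends the open batch for row changes. A schema change is treated
// as a batch boundary: the open batch is committed first, then the schema
// change is committed on its own (an assumption of this sketch).
func (b *batcher) apply(t primaryTxn) {
	if t.kind == schemaChange {
		b.commit()
		b.ongoing, b.pending = true, 1
		b.commit()
		return
	}
	b.ongoing = true
	b.pending++
}

func main() {
	b := &batcher{}
	for _, t := range []primaryTxn{{rowChange}, {rowChange}, {schemaChange}, {rowChange}} {
		b.apply(t)
	}
	b.commit() // flush the trailing batch
	fmt.Println(b.commits)
}
```

In this sketch the first two row-change transactions share one replica transaction, the schema change forces a commit, and the last row change starts a new batch.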
Implementation
The applier tracks whether it is inside a batched transaction using the ongoingBatchTxn flag and manages transaction boundaries with several other state variables: dirtyTxn, dirtyStream, pendingPosition, etc. The implementation can be viewed as a state machine that switches between several states. The extendOrCommitBatchTxn function handles the decision to either extend an ongoing batch by adding more primary transactions to the current replica transaction or to commit the batch, finalizing the replica's transaction.
Currently, a batched transaction is closed and a new one is started in the following scenarios:
200 milliseconds have passed since the last commit.
128MB of binlog payload has been processed.
A DDL statement that changes the schema has been received.
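The three conditions above could be combined in a single predicate along these lines. The names batchState, shouldCommit, maxBatchAge, and maxBatchBytes are hypothetical, and both the inclusive comparisons and the reading of 128MB as 128 MiB are assumptions of this sketch rather than details taken from the PR.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// maxBatchAge is the longest a batch stays open after the last commit.
	maxBatchAge = 200 * time.Millisecond
	// maxBatchBytes is the binlog payload limit per batch (assumed 128 MiB).
	maxBatchBytes = 128 << 20
)

// batchState is a hypothetical summary of the open replica transaction.
type batchState struct {
	sinceLastCommit time.Duration // time elapsed since the last commit
	payloadBytes    int64         // binlog payload processed in this batch
}

// shouldCommit reports whether the open batch should be closed before
// more primary transactions are added to it.
func shouldCommit(s batchState, isSchemaChange bool) bool {
	return isSchemaChange ||
		s.sinceLastCommit >= maxBatchAge ||
		s.payloadBytes >= maxBatchBytes
}

func main() {
	young := batchState{sinceLastCommit: 50 * time.Millisecond, payloadBytes: 1 << 20}
	old := batchState{sinceLastCommit: 250 * time.Millisecond, payloadBytes: 1 << 20}
	fmt.Println(shouldCommit(young, false), shouldCommit(old, false), shouldCommit(young, true))
}
```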
Previously, the applied binlog position was stored in a special file. In this PR, the binlog position is stored transactionally in DuckDB instead, so the recorded position is committed together with the changes it covers. This makes the system robust to unexpected shutdowns.
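To illustrate why storing the position transactionally helps, here is a toy model in which the data changes and the binlog position are staged in the same transaction. It uses a hypothetical in-memory store rather than DuckDB, and the type names and position strings are invented for the example.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for a transactional database.
type store struct {
	rows     []string // committed data changes
	position string   // committed binlog position
}

// txn stages changes that become visible only when commit is called.
type txn struct {
	s        *store
	rows     []string
	position string
}

// begin opens a transaction that starts from the committed position.
func (s *store) begin() *txn { return &txn{s: s, position: s.position} }

func (t *txn) applyRow(row string)  { t.rows = append(t.rows, row) }
func (t *txn) setPosition(p string) { t.position = p }

// commit publishes the staged rows and the position together.
func (t *txn) commit() {
	t.s.rows = append(t.s.rows, t.rows...)
	t.s.position = t.position
}

func main() {
	s := &store{position: "binlog.000001:4"}

	t1 := s.begin()
	t1.applyRow("a")
	t1.setPosition("binlog.000001:120")
	t1.commit()

	t2 := s.begin()
	t2.applyRow("b")
	t2.setPosition("binlog.000001:250")
	// Simulated crash: t2 is never committed, so neither the row
	// nor the new position becomes visible.

	fmt.Println(len(s.rows), s.position)
}
```

After the simulated crash the store still holds only the first row and the first committed position, so a restarted applier would resume from a position that matches the data it has. With a separate position file, the two could drift apart if a crash landed between the data commit and the file write.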