apecloud / myduckserver


feat: flush the delta buffer to duckdb in batched transaction #93

Closed fanyang01 closed 1 month ago

fanyang01 commented 1 month ago

Optimized Binlog Replica Applier with Transaction Batching

The binlog replica applier reads and applies updates from the binary log (binlog) of a primary MySQL server to a replica. This PR introduces an optimization to batch multiple primary transactions into a single replica transaction whenever possible, aiming to enhance performance and reliability.

Benefits

Transaction Batching Process

  1. Transaction Start: The applier detects a new primary transaction from the binlog. For GTID-based replication, this is typically signaled by a GTID event followed by a BEGIN query event.

  2. Batch Extension: Instead of committing each primary transaction individually, the applier calls the extendOrCommitBatchTxn function to try to batch them. As long as the current replica transaction can be safely extended (e.g., the changes from the new primary transaction are pure ROW-format data changes), the applier appends those changes to the batch.

  3. Batch Commit: When the batch reaches a boundary (e.g., a DDL statement) or it is no longer optimal to extend it, the batch is committed. This marks the end of a replica transaction that encapsulates multiple primary transactions.
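The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the types PrimaryTxn and Applier, the methods Apply and Flush, and the maxBatch size limit are all hypothetical, and the size limit is only an assumed stand-in for "no longer optimal to extend". Only the ongoingBatchTxn name is borrowed from the description.

```go
package main

// EventKind is a hypothetical classification of a primary transaction.
type EventKind int

const (
	RowChange EventKind = iota // pure ROW-format data changes; safe to batch
	DDL                        // schema change; treated as a batch boundary
)

// PrimaryTxn is a hypothetical stand-in for one transaction read from the binlog.
type PrimaryTxn struct {
	GTID string
	Kind EventKind
}

// Applier holds illustrative batching state. Only the ongoingBatchTxn name
// comes from the PR description; everything else is assumed for this sketch.
type Applier struct {
	ongoingBatchTxn bool
	batched         []string   // GTIDs inside the open replica transaction
	Commits         [][]string // one entry per replica commit: the GTIDs it covers
	maxBatch        int        // assumed size limit, not taken from the PR
}

// Flush commits the open replica transaction, if there is one.
func (a *Applier) Flush() {
	if !a.ongoingBatchTxn {
		return
	}
	a.Commits = append(a.Commits, a.batched)
	a.batched = nil
	a.ongoingBatchTxn = false
}

// Apply either extends the current batch with txn or commits at a boundary.
func (a *Applier) Apply(txn PrimaryTxn) {
	if txn.Kind == DDL {
		// Boundary: close the batch first, then apply the DDL on its own.
		a.Flush()
		a.Commits = append(a.Commits, []string{txn.GTID})
		return
	}
	// Extension: fold this primary transaction into the replica transaction.
	a.ongoingBatchTxn = true
	a.batched = append(a.batched, txn.GTID)
	if len(a.batched) >= a.maxBatch {
		a.Flush()
	}
}
```

In this sketch, feeding two row-change transactions, one DDL, and one more row-change transaction through Apply, followed by a final Flush, yields three replica commits: one covering the first two primary transactions, one for the DDL, and one for the last transaction.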

Implementation

The applier tracks whether it is inside a batched transaction using the ongoingBatchTxn flag and manages transaction boundaries with several other state variables: dirtyTxn, dirtyStream, pendingPosition, etc. The implementation can be viewed as a state machine that switches between several states. The extendOrCommitBatchTxn function handles the decision to either extend an ongoing batch by adding more primary transactions to the current replica transaction or to commit the batch, finalizing the replica's transaction.

Currently, a batched transaction is closed and a new one is started in the following scenarios:

Previously, the applied binlog position was stored in a special file. In this PR, the binlog position is instead stored transactionally in DuckDB, which makes the system robust against unexpected shutdowns.
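To illustrate why storing the position transactionally helps, here is a small sketch that uses an in-memory stand-in rather than DuckDB. Store, Txn, FlushBatch, and the position strings are all hypothetical and do not reflect the actual schema or code in this PR. The assumption being illustrated is that the data changes and the binlog position become visible in a single commit, so an interruption before the commit leaves both unchanged.

```go
package main

import "errors"

// Store is a hypothetical in-memory stand-in for the replica database.
type Store struct {
	Rows     []string
	Position string // last applied binlog position
}

// Txn stages writes and publishes them to the Store only on Commit.
type Txn struct {
	store    *Store
	rows     []string
	position string
	done     bool
}

// Begin opens a transaction that starts from the current position.
func (s *Store) Begin() *Txn {
	return &Txn{store: s, position: s.Position}
}

// InsertRow stages a data change.
func (t *Txn) InsertRow(r string) { t.rows = append(t.rows, r) }

// SetPosition stages the new binlog position alongside the data changes.
func (t *Txn) SetPosition(p string) { t.position = p }

// Commit makes the staged rows and the position visible together.
func (t *Txn) Commit() error {
	if t.done {
		return errors.New("transaction already finished")
	}
	t.store.Rows = append(t.store.Rows, t.rows...)
	t.store.Position = t.position
	t.done = true
	return nil
}

// Rollback discards everything that was staged.
func (t *Txn) Rollback() { t.done = true }

// FlushBatch writes rows and the new position in one transaction.
// crashBeforeCommit simulates a shutdown just before the commit.
func FlushBatch(s *Store, rows []string, pos string, crashBeforeCommit bool) error {
	txn := s.Begin()
	for _, r := range rows {
		txn.InsertRow(r)
	}
	txn.SetPosition(pos)
	if crashBeforeCommit {
		txn.Rollback()
		return errors.New("simulated shutdown before commit")
	}
	return txn.Commit()
}
```

With a separate position file, a shutdown between the data commit and the file write could leave the two out of sync. In the sketch, a simulated shutdown leaves both Rows and Position at their previous values, so replication can resume from the recorded position without skipping or reapplying changes.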