apecloud / myduckserver

MySQL & Postgres Analytics, Reimagined
184 stars 8 forks source link

feat: add a tool for launching MyDuckServer as a MySQL replica instance #101

Closed TianyuZhang1214 closed 1 month ago

TianyuZhang1214 commented 1 month ago

What's New

solves #98, #99.

Feature

A new tool has been added to launch MyDuckServer as a MySQL replica instance. All related scripts are stored under the devtools/replica/ directory.

To create a MySQL replica instance using MyDuckServer, you can run the following command (detailed in README.md):

bash create_replica.sh --mysql_host <mysql_host> --mysql_port <mysql_port> --mysql_user <mysql_user> --mysql_password <mysql_password>

Structure

The tool is composed of three key components:

  1. MySQL Shell Installation (if not already installed): install_mysql_shell.sh is responsible for setting up MySQL Shell.

  2. Snapshot Dump: Using mysqlsh -- util copy-instance, start_snapshot.sh performs a full instance dump. The tool dynamically determines the optimal thread count based on system CPU cores (2 * core_count) and sets chunk size to 256 MB based on benchmarking results (detailed below).

  3. Delta Synchronization: After the snapshot, start_delta.sh starts delta replication by executing CHANGE REPLICATION SOURCE TO, initiating continuous data sync between MyDuckServer and MySQL.

Benchmarking

Environment

CREATE TABLE `test_sync_t_0` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `md5` varchar(32) NOT NULL,
  `database_ip` varchar(64) NOT NULL,
  `database_port` int NOT NULL,
  `table_name` varchar(255) NOT NULL,
  `sql` text NOT NULL,
  `create_time` datetime NOT NULL DEFAULT '1970-01-01 00:00:01',
  `occur_time_first` datetime NOT NULL DEFAULT '1970-01-01 00:00:01',
  `occur_time_last` datetime NOT NULL DEFAULT '1970-01-01 00:00:01',
  `real_sql` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8mb3;

Results

Normal

Thread Count Chunk Size (MB) Throughput (MB/s)
16 64 307.64
16 256 330.98
16 1024 364.65
32 256 354.60
32 1024 401.01

With (--loglevel=6)

Thread Count Chunk Size (MB) Throughput (MB/s)
16 64 81.90
16 256 146.27
16 1024 181.79
32 256 195.97
32 1024 187.45