codership / galera

Synchronous multi-master replication library
GNU General Public License v2.0

Replication does not continue after mysqldump SST #191

Open philip-galera opened 9 years ago

philip-galera commented 9 years ago

If a single-node cluster is established and then an empty, running server is added to it via mysqldump SST, further updates on either server are not replicated.

At the same time, SHOW STATUS on both nodes shows a fully healthy 2-node cluster.
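For reference, a minimal sketch of the kind of status check meant here. The report does not say which variables were inspected, so the selection below is an assumption; all of them are standard Galera status variables, and the comments give the values a healthy two-node cluster normally reports.

```sql
-- Run on each node. The choice of variables is an assumption, not taken from the report.
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';         -- 2 for a two-node cluster
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';       -- Primary
SHOW GLOBAL STATUS LIKE 'wsrep_connected';            -- ON
SHOW GLOBAL STATUS LIKE 'wsrep_ready';                -- ON
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- Synced
```

Since none of these variables reflects whether individual write sets are actually being applied on the peer, a comparison of table contents is likely still needed to spot the problem.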

philip-galera commented 9 years ago

To reproduce, put the following in suite/galera/t/galera#191.cnf:

!include include/default_mysqld.cnf

[mysqld]
binlog-format=row
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=2
# log-bin=mysqld-bin

wsrep_node_address=127.0.0.1
wsrep_causal_reads=ON
wsrep_sync_wait = 7

[mysqld.1]

[mysqld.2]

[ENV]
NODE_MYPORT_1= @mysqld.1.port
NODE_MYSOCK_1= @mysqld.1.socket

NODE_MYPORT_2= @mysqld.2.port
NODE_MYSOCK_2= @mysqld.2.socket

NODE_GALERAPORT_1= @mysqld.1.#galera_port
NODE_GALERAPORT_2= @mysqld.2.#galera_port

NODE_SSTPORT_1= @mysqld.1.#sst_port
NODE_SSTPORT_2= @mysqld.2.#sst_port

And in suite/galera/t/galera#191.test:

--source include/have_innodb.inc

--connect node_1, 127.0.0.1, root, , test, $NODE_MYPORT_1

GRANT ALL PRIVILEGES ON *.* TO 'sst' IDENTIFIED BY 'sst';

--disable_query_log
--eval SET GLOBAL wsrep_provider='$WSREP_PROVIDER'
--eval SET GLOBAL wsrep_provider_options='base_port=$NODE_GALERAPORT_1'
--enable_query_log
SET GLOBAL wsrep_cluster_address='gcomm://';
SET GLOBAL wsrep_sst_auth = 'sst:sst';

CREATE TABLE t1 (f1 INTEGER) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1);

--connect node_2, 127.0.0.1, root, , test, $NODE_MYPORT_2

GRANT ALL PRIVILEGES ON *.* TO 'sst' IDENTIFIED BY 'sst';

--disable_query_log
--eval SET GLOBAL wsrep_sst_receive_address = '127.0.0.2:$NODE_MYPORT_2';
SET GLOBAL wsrep_sst_method = 'mysqldump';
--eval SET GLOBAL wsrep_provider='$WSREP_PROVIDER'
--eval SET GLOBAL wsrep_provider_options='base_port=$NODE_GALERAPORT_2'
--eval SET GLOBAL wsrep_cluster_address='gcomm://127.0.0.1:$NODE_GALERAPORT_1'
--enable_query_log

--let $wait_condition = SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 't1';
--source include/wait_condition.inc

--let $wait_condition = SELECT COUNT(*) = 1 FROM t1;
--source include/wait_condition.inc

--connection node_1
INSERT INTO t1 VALUES (2);

--connection node_2
INSERT INTO t1 VALUES (3);

--connection node_1
SELECT * FROM t1;

--connection node_2
SELECT * FROM t1;

The two SELECTs at the end will return different results.
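To make the divergence fail the test explicitly rather than rely on comparing the SELECT output by eye, a check along these lines could be appended to the test. This is a sketch, not part of the original test case; it reuses the wait_condition.inc include already used above and assumes that all three inserted rows should be visible on both nodes once replication works.

```sql
# Hypothetical addition to the end of galera#191.test (not in the original report).
# With working replication, both nodes should end up with rows 1, 2 and 3 in t1.
--connection node_1
--let $wait_condition = SELECT COUNT(*) = 3 FROM t1;
--source include/wait_condition.inc

--connection node_2
--let $wait_condition = SELECT COUNT(*) = 3 FROM t1;
--source include/wait_condition.inc
```

On a build affected by this issue, the wait condition would be expected to time out on at least one node.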

philip-galera commented 9 years ago

Here is another variant of the test that causes node #2 to exit:

--source include/have_innodb.inc

--connect node_1, 127.0.0.1, root, , test, $NODE_MYPORT_1

GRANT ALL PRIVILEGES ON *.* TO 'sst' IDENTIFIED BY 'sst';

--disable_query_log
--eval SET GLOBAL wsrep_provider='$WSREP_PROVIDER'
--eval SET GLOBAL wsrep_provider_options='base_port=$NODE_GALERAPORT_1'
--enable_query_log
SET GLOBAL wsrep_cluster_address='gcomm://';
SET GLOBAL wsrep_sst_auth = 'sst:sst';

CREATE TABLE t1 (f1 INTEGER) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1);

--connect node_2, 127.0.0.1, root, , test, $NODE_MYPORT_2

GRANT ALL PRIVILEGES ON *.* TO 'sst' IDENTIFIED BY 'sst';

--disable_query_log
--eval SET GLOBAL wsrep_sst_receive_address = '127.0.0.2:$NODE_MYPORT_2';
SET GLOBAL wsrep_sst_method = 'mysqldump';
--eval SET GLOBAL wsrep_provider='$WSREP_PROVIDER'
--eval SET GLOBAL wsrep_provider_options='base_port=$NODE_GALERAPORT_2'
--eval SET GLOBAL wsrep_cluster_address='gcomm://127.0.0.1:$NODE_GALERAPORT_1'
--enable_query_log

--let $wait_condition = SELECT COUNT(*) = 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 't1';
--source include/wait_condition.inc

--let $wait_condition = SELECT COUNT(*) = 1 FROM t1;
--source include/wait_condition.inc

--connection node_1
CREATE TABLE t2 (f1 INTEGER);

--connect node_1a, 127.0.0.1, root, , test, $NODE_MYPORT_1
INSERT INTO t2 VALUES (1);
COMMIT;

--connect node_2a, 127.0.0.1, root, , test, $NODE_MYPORT_2
INSERT INTO t2 VALUES (2);

This results in the following messages in the error log:

2014-12-02 11:13:16 3087 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1146, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 0 seqno: 1)
2014-12-02 11:13:16 3087 [ERROR] Slave SQL: Error executing row event: 'Table 'test.t2' doesn't exist', Error_code: 1146
2014-12-02 11:13:16 3087 [Warning] WSREP: RBR event 3 Write_rows apply warning: 1146, 1
2014-12-02 11:13:16 3087 [Warning] WSREP: Failed to apply app buffer: seqno: 1, status: 1
         at galera/src/trx_handle.cpp:apply():351
         at galera/src/replicator_smm.cpp:apply_trx_ws():37