codership / galera

Synchronous multi-master replication library
GNU General Public License v2.0

"SYNC message from member 0 in non-primary configuration. Ignored." warning during concurrent IST #204

Open philip-galera opened 9 years ago

philip-galera commented 9 years ago

A test that performs concurrent IST with two joiners and one donor produces the following warning on all nodes:

2014-12-15 10:53:26 63625 [Warning] WSREP: SYNC message from member 0 in non-primary configuration. Ignored.

Log context:

2014-12-15 10:59:31 64003 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 3)
2014-12-15 10:59:31 64003 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2014-12-15 10:59:31 64003 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2014-12-15 10:59:31 64003 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2014-12-15 10:59:31 64003 [Warning] WSREP: SYNC message from member 0 in non-primary configuration. Ignored.
2014-12-15 10:59:31 64003 [Note] WSREP: STATE EXCHANGE: sent state msg: 4ccd5d36-8430-11e4-907a-eaa1c79b5844
2014-12-15 10:59:31 64003 [Note] WSREP: STATE EXCHANGE: got state msg: 4ccd5d36-8430-11e4-907a-eaa1c79b5844 from 0 (localhost.localdomain)
2014-12-15 10:59:31 64003 [Note] WSREP: STATE EXCHANGE: got state msg: 4ccd5d36-8430-11e4-907a-eaa1c79b5844 from 1 (localhost.localdomain)
2014-12-15 10:59:31 64003 [Note] WSREP: STATE EXCHANGE: got state msg: 4ccd5d36-8430-11e4-907a-eaa1c79b5844 from 2 (localhost.localdomain)
2014-12-15 10:59:31 64003 [Note] WSREP: Quorum results:
        version    = 3,
        component  = PRIMARY,
        conf_id    = 6,
        members    = 1/3 (joined/total),
        act_id     = 4,
        last_appl. = 0,
        protocols  = 0/7/3 (gcs/repl/appl),
        group UUID = 2ba539fb-8430-11e4-8386-8ad6e72614c5
2014-12-15 10:59:31 64003 [Note] WSREP: Flow-control interval: [28, 28]
2014-12-15 10:59:31 64003 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 4)
philip-galera commented 9 years ago

Test case:

in suite/galera_3nodes/t/galera_simultaneous_ist.cnf:

!include ./suite/galera_3nodes/galera_3nodes.cnf

[mysqld.1]
wsrep_provider_options='base_port=@mysqld.1.#galera_port;gcache.size=268435456'

In suite/galera_3nodes/t/galera_simultaneous_ist.test:

#
# This test checks that two nodes can successfully perform IST from a single donor
#

--source include/big_test.inc
--source include/galera_cluster.inc
--source include/have_innodb.inc

--let $galera_connection_name = node_3
--let $galera_server_number = 3
--source include/galera_connect.inc

--connection node_1
SELECT VARIABLE_VALUE = 3 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME= 'wsrep_cluster_size';

CREATE TABLE ten (f1 INTEGER) ENGINE=InnoDB;
INSERT INTO ten VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
CREATE TABLE t1 (f1 INTEGER PRIMARY KEY AUTO_INCREMENT, f2 VARCHAR(1000)) ENGINE=InnoDB;

--connection node_2
--let $wsrep_cluster_address_orig = `SELECT @@wsrep_cluster_address`
SET GLOBAL wsrep_cluster_address = '';

--connection node_3
SET GLOBAL wsrep_cluster_address = '';

--connection node_1
INSERT INTO t1 (f2) SELECT REPEAT('a', 1000) FROM ten AS a1, ten AS a2, ten AS a3, ten AS a4, ten AS a5;

--connection node_2
--send_eval SET GLOBAL wsrep_cluster_address = '$wsrep_cluster_address_orig';

--sleep 1

--connection node_3
--send_eval SET GLOBAL wsrep_cluster_address = '$wsrep_cluster_address_orig';

--connection node_2
--reap

--connection node_3
--reap

--connection node_2
--source include/wait_until_connected_again.inc
--source include/galera_wait_ready.inc
SELECT VARIABLE_VALUE = 3 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME= 'wsrep_cluster_size';
SELECT COUNT(*) = 100000 FROM t1;

--connection node_3
--source include/wait_until_connected_again.inc
--source include/galera_wait_ready.inc
SELECT VARIABLE_VALUE = 3 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME= 'wsrep_cluster_size';
SELECT COUNT(*) = 100000 FROM t1;

--connection node_1
SELECT VARIABLE_VALUE = 3 FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME= 'wsrep_cluster_size';

DROP TABLE t1;
DROP TABLE ten;
philip-galera commented 9 years ago

Teemu says:

"I think there is room for bug here: Node A just finished donating IST to another member of the group when node B decides to join. If SYNC message from node A is ordered in new view after state UUID message, A does not update its state before sending state message and then B never learns that A got synced because SYNC message from A was ignored on B."

ronin13 commented 9 years ago

Other than the warning, are there any issues (affecting the cluster) arising from this?

In the case of concurrent joiners, are they handled like a FIFO?

janlindstrom commented 9 years ago

We also see this on MariaDB 10.1 with the wsrep suite.

jan@jan-GE70-0NC-0ND ~/mysql/10.1-galera/mysql-test $ ./mysql-test-run --vardir=/dev/shm --suite=wsrep
Logging: ./mysql-test-run  --vardir=/dev/shm --suite=wsrep
vardir: /run/shm
Removing old var directory...
couldn't chmod(0777, /run/shm/sem.ADBE_REL_mdm): Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 758.
Couldn't delete file '/run/shm/sem.ADBE_REL_mdm': Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 758.
couldn't chmod(0777, /run/shm/sem.ADBE_WritePrefs_mdm): Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 758.
Couldn't delete file '/run/shm/sem.ADBE_WritePrefs_mdm': Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 758.
couldn't chmod(0777, /run/shm/sem.ADBE_ReadPrefs_mdm): Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 758.
Couldn't delete file '/run/shm/sem.ADBE_ReadPrefs_mdm': Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 758.
couldn't chmod(0777, /run/shm): Operation not permitted at /usr/share/perl/5.18/File/Find.pm line 853.
Couldn't remove directory '/run/shm': Permission denied at /usr/share/perl/5.18/File/Find.pm line 853.
Creating var directory '/run/shm'...
Checking supported features...
MariaDB Version 10.1.3-MariaDB-wsrep-debug
 - SSL connections supported
 - binaries are debug compiled
Using suites: wsrep
Collecting tests...
JAN: %s
/home/jan/mysql/10.1-galera/scripts:/home/jan/mysql/10.1-galera/extra:/home/jan/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/gamesInstalling system database...
==============================================================================
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
wsrep.binlog_format 'innodb_plugin'      [ fail ]  Found warnings/errors in server log file!
        Test ended at 2015-01-29 11:41:09
line
150129 11:41:02 [Warning] WSREP: SYNC message from member 1 in non-primary configuration. Ignored.
^ Found warnings in /run/shm/log/mysqld.1.err
ok
line
150129 11:41:02 [Warning] WSREP: SYNC message from member 1 in non-primary configuration. Ignored.
^ Found warnings in /run/shm/log/mysqld.2.err
ok
 - saving '/run/shm/log/wsrep.binlog_format-innodb_plugin/' to '/run/shm/log/wsrep.binlog_format-innodb_plugin/'
Only  1  of 20 completed.
ayurchen commented 9 years ago

This is likely the result of a race between the joiners joining and the donor serving IST: the donor may finish dealing with the first joiner before a new configuration is formed with the second one.

There is room for a bug, but a very benign one: the need to send a SYNC message is checked every time the slave thread picks an action from the slave queue. If the first SYNC is lost, it will be sent again - unless there are no actions left, which is impossible, because at least the new CC (from the 2nd node) is coming.

What this message actually shows is that in this test the joiners are not sufficiently concurrent, and the donor manages to finish the first IST before the second joiner connects to the group.
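The retry argument above can be sketched as follows. This is an illustrative model, not Galera's code: the slave loop re-evaluates "do I need to announce SYNCED?" on every action it pops from the queue, so a SYNC dropped in a non-primary view is simply sent again on the next action, and the queue cannot stay empty because at least the configuration change from the second joiner is still coming.

```python
def slave_loop(actions, first_sync_lost=True):
    """Toy model of the slave thread draining its queue (hypothetical names)."""
    sync_attempts = 0
    local_state = "SYNCED"              # node is caught up and must announce it
    announced = False
    for _action in actions:             # each pop from the slave queue...
        if local_state == "SYNCED" and not announced:
            sync_attempts += 1          # ...re-triggers the SYNC check
            if first_sync_lost and sync_attempts == 1:
                pass                    # delivered in a non-primary view: ignored
            else:
                announced = True        # the retry gets through
    return announced, sync_attempts

# The queue is never empty at this point: at least the new CC from the
# second joiner arrives, so the second SYNC attempt succeeds.
announced, attempts = slave_loop(["CC(joiner 2)", "commit", "commit"])
print(announced, attempts)  # True 2
```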