codership / galera

Synchronous multi-master replication library
GNU General Public License v2.0
448 stars 176 forks source link

Joiner can't survive too frequent configuration changes #218

Open PrzemekMalkowski opened 9 years ago

PrzemekMalkowski commented 9 years ago

One node suffered network issues and disconnected and reconnected the cluster several times. After it finally managed to get IST transfer from other node, it failed with:

150127  0:19:16 [ERROR] WSREP: Local state seqno (557852431) is greater than group seqno (557850488): states diverged. Aborting to avoid potential data loss. Remove '/data/mysql/data//grastate.dat' file and restart if you wish to continue. (FATAL)
         at galera/src/replicator_str.cpp:state_transfer_required():33
150127  0:19:16 [Note] WSREP: applier thread exiting (code:8)

The version where it happened is PXC 5.5.37-rel35.0-25.10.756, Galera 2.10(r175). Three cluster nodes - 2 + 1 arbitrator.

JOINER err log:

(I skipped many lines but tried to paste as many as necessary to get some idea about network issues)

150127  0:17:23 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.134:4567' pointing to uuid 12824b44-9efb-11e4-bcc1-2b00f5a53f58 is blacklisted, skipping
150127  0:17:23 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.16.72.122:4567 tcp://10.16.72.126:4567 
150127  0:17:23 [Warning] WSREP: last inactive check more than PT1.5S ago (PT17.2873S), skipping check
150127  0:17:25 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') reconnecting to 29fccfb4-eed6-11e3-0800-36617919e986 (tcp://10.16.72.122:4567), attempt 0
150127  0:17:25 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') reconnecting to c3b08473-9ef9-11e4-8910-021ca5271274 (tcp://10.16.72.126:4567), attempt 0
150127  0:17:25 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.98448S), skipping check
150127  0:17:25 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.134:4567' pointing to uuid 12824b44-9efb-11e4-bcc1-2b00f5a53f58 is blacklisted, skipping
150127  0:17:25 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') turning message relay requesting off
150127  0:17:25 [Note] WSREP: evs::proto(12824b44-9efb-11e4-bcc1-2b00f5a53f58, OPERATIONAL, view_id(REG,12824b44-9efb-11e4-bcc1-2b00f5a53f58,64)) suspecting node: 29fccfb4-eed6-11e3-0800-36617919e986
150127  0:17:25 [Note] WSREP: evs::proto(12824b44-9efb-11e4-bcc1-2b00f5a53f58, OPERATIONAL, view_id(REG,12824b44-9efb-11e4-bcc1-2b00f5a53f58,64)) suspecting node: c3b08473-9ef9-11e4-8910-021ca5271274
150127  0:17:26 [Note] WSREP: view(view_id(NON_PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,64) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
} joined {
} left {
} partitioned {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
})
150127  0:17:26 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150127  0:17:29 [Note] WSREP: view(view_id(NON_PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,65) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
} joined {
} left {
} partitioned {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
})
150127  0:17:29 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.134:4567' pointing to uuid 12824b44-9efb-11e4-bcc1-2b00f5a53f58 is blacklisted, skipping
150127  0:17:29 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.16.72.122:4567 tcp://10.16.72.126:4567 
150127  0:17:29 [Note] WSREP: Flow-control interval: [16, 16]
150127  0:17:29 [Note] WSREP: Received NON-PRIMARY.
150127  0:17:29 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 557847539)
150127  0:17:29 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:17:29 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:17:29 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
(...)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:17:42 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150127  0:17:42 [Note] WSREP: Flow-control interval: [16, 16]
150127  0:17:42 [Note] WSREP: Received NON-PRIMARY.
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected)
150127  0:17:42 [Warning] WSREP: last inactive check more than PT1.5S ago (PT4.04713S), skipping check
150127  0:17:42 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') reconnecting to 29fccfb4-eed6-11e3-0800-36617919e986 (tcp://10.16.72.122:4567), attempt 0
150127  0:17:42 [Warning] WSREP: gcs_caused() returned -107 (Transport endpoint is not connected)
(...)
150127  0:18:09 [Note] WSREP: view(view_id(PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,68) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
} joined {
} left {
} partitioned {
})
150127  0:18:09 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
150127  0:18:09 [Note] WSREP: STATE_EXCHANGE: sent state UUID: e234b99c-a5e3-11e4-98db-f3e927a08a0a
150127  0:18:09 [Note] WSREP: STATE EXCHANGE: sent state msg: e234b99c-a5e3-11e4-98db-f3e927a08a0a
150127  0:18:09 [Note] WSREP: STATE EXCHANGE: got state msg: e234b99c-a5e3-11e4-98db-f3e927a08a0a from 0 (node01_dpadb1)
150127  0:18:09 [Note] WSREP: STATE EXCHANGE: got state msg: e234b99c-a5e3-11e4-98db-f3e927a08a0a from 2 (node02_dpadb2)
150127  0:18:09 [Note] WSREP: STATE EXCHANGE: got state msg: e234b99c-a5e3-11e4-98db-f3e927a08a0a from 1 (garb)
150127  0:18:09 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 61,
        members    = 2/3 (joined/total),
        act_id     = 557850488,
        last_appl. = 557847332,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe
150127  0:18:09 [Note] WSREP: Flow-control interval: [28, 28]
150127  0:18:09 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 557850488)
150127  0:18:18 [Warning] WSREP: last inactive check more than PT1.5S ago (PT5.35902S), skipping check
(...)
150127  0:18:52 [Note] WSREP: (12824b44-9efb-11e4-bcc1-2b00f5a53f58, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.16.72.122:4567 tcp://10.16.72.126:4567
(...)
150127  0:18:57 [Note] WSREP: State transfer required: 
        Group state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557849318
        Local state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557847539
150127  0:18:57 [Note] WSREP: New cluster view: global state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557849318, view# 60: Primary, number of nodes: 3, my index: 0, protocol version 2
150127  0:18:57 [Warning] WSREP: Gap in state sequence. Need state transfer.
150127  0:18:58 [Note] WSREP: view(view_id(NON_PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,68) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
} joined {
} left {
} partitioned {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
})
150127  0:18:58 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150127  0:18:58 [Note] WSREP: Flow-control interval: [16, 16]
150127  0:18:58 [Note] WSREP: Received NON-PRIMARY.
150127  0:18:58 [Note] WSREP: Shifting PRIMARY -> OPEN (TO: 557851771)
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
(...)
150127  0:18:58 [Warning] WSREP: Failed to report last committed 557847473, -107 (Transport endpoint is not connected)
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:18:58 [Note] WSREP: view(view_id(NON_PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,69) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
} joined {
} left {
} partitioned {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
})
150127  0:18:58 [Warning] WSREP: gcs_caused() returned -1 (Operation not permitted)
150127  0:18:58 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150127  0:18:58 [Note] WSREP: Flow-control interval: [16, 16]
150127  0:18:58 [Note] WSREP: Received NON-PRIMARY.
150127  0:18:58 [Note] WSREP: declaring 29fccfb4-eed6-11e3-0800-36617919e986 stable
150127  0:18:58 [Note] WSREP: declaring c3b08473-9ef9-11e4-8910-021ca5271274 stable
150127  0:18:58 [Note] WSREP: Node 29fccfb4-eed6-11e3-0800-36617919e986 state prim
150127  0:18:58 [Note] WSREP: view(view_id(PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,70) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
} joined {
} left {
} partitioned {
})
150127  0:18:58 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
150127  0:18:58 [Note] WSREP: STATE_EXCHANGE: sent state UUID: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc
150127  0:18:58 [Note] WSREP: STATE EXCHANGE: sent state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc
150127  0:18:58 [Note] WSREP: STATE EXCHANGE: got state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc from 0 (node01_dpadb1)
150127  0:18:58 [Note] WSREP: STATE EXCHANGE: got state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc from 1 (garb)
150127  0:18:58 [Note] WSREP: STATE EXCHANGE: got state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc from 2 (node02_dpadb2)
150127  0:18:58 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 63,
        members    = 2/3 (joined/total),
        act_id     = 557852431,
        last_appl. = 557847332,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe
150127  0:18:58 [Note] WSREP: Flow-control interval: [28, 28]
150127  0:18:58 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 557852431)
150127  0:18:59 [Note] WSREP: killing local connection: 1157200
150127  0:18:59 [Note] WSREP: killing local connection: 1157199
(...)
150127  0:19:11 [Note] WSREP: You have configured 'xtrabackup-v2' state snapshot transfer method which cannot be performed on a running server. Wsrep provider won't be able to fall back to it if other means of state transfer are unavailable. In that case you will need to restart the server.
150127  0:19:11 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150127  0:19:15 [Note] WSREP: Assign initial position for certification: 557849318, protocol version: 2
150127  0:19:15 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.16.72.134:4568
150127  0:19:15 [Note] WSREP: Node 0 (node01_dpadb1) requested state transfer from '*any*'. Selected 2 (node02_dpadb2)(SYNCED) as donor.
150127  0:19:15 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 557853512)
150127  0:19:15 [Note] WSREP: Requesting state transfer: success, donor: 2
150127  0:19:15 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): discarded 0 bytes
150127  0:19:15 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): found 1/2861 locked buffers
150127  0:19:15 [Note] WSREP: Receiving IST: 1779 writesets, seqnos 557847539-557849318
150127  0:19:16 [Note] WSREP: 2 (node02_dpadb2): State transfer to 0 (node01_dpadb1) complete.
150127  0:19:16 [Note] WSREP: IST received: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557852431
150127  0:19:16 [Note] WSREP: 0 (node01_dpadb1): State transfer from 2 (node02_dpadb2) complete.
150127  0:19:16 [Note] WSREP: Shifting JOINER -> JOINED (TO: 557853568)
150127  0:19:16 [Note] WSREP: Member 2 (node02_dpadb2) synced with group.
150127  0:19:16 [Note] WSREP: New cluster view: global state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557849815, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
150127  0:19:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150127  0:19:16 [Note] WSREP: New cluster view: global state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557849815, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 2
150127  0:19:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150127  0:19:16 [Note] WSREP: Closing send monitor...
150127  0:19:16 [Note] WSREP: Closed send monitor.
150127  0:19:16 [Note] WSREP: gcomm: terminating thread
150127  0:19:16 [Note] WSREP: gcomm: joining thread
150127  0:19:16 [Note] WSREP: gcomm: closing backend
150127  0:19:16 [Note] WSREP: view(view_id(NON_PRIM,12824b44-9efb-11e4-bcc1-2b00f5a53f58,70) memb {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
} joined {
} left {
} partitioned {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
})
150127  0:19:16 [Note] WSREP: view((empty))
150127  0:19:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
150127  0:19:16 [Note] WSREP: gcomm: closed
150127  0:19:16 [Note] WSREP: Flow-control interval: [16, 16]
150127  0:19:16 [Note] WSREP: Received NON-PRIMARY.
150127  0:19:16 [Note] WSREP: Shifting JOINED -> OPEN (TO: 557853571)
150127  0:19:16 [Note] WSREP: Received self-leave message.
150127  0:19:16 [Note] WSREP: Flow-control interval: [0, 0]
150127  0:19:16 [Note] WSREP: Received SELF-LEAVE. Closing connection.
150127  0:19:16 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 557853571)
150127  0:19:16 [Note] WSREP: RECV thread exiting 0: Success
150127  0:19:16 [Note] WSREP: recv_thread() joined.
150127  0:19:16 [Note] WSREP: Closing replication queue.
150127  0:19:16 [Note] WSREP: Closing slave action queue.
150127  0:19:16 [ERROR] WSREP: Local state seqno (557852431) is greater than group seqno (557850488): states diverged. Aborting to avoid potential data loss. Remove '/data/mysql/data//grastate.dat' file and restart if you wish to continue. (FATAL)
         at galera/src/replicator_str.cpp:state_transfer_required():33
150127  0:19:16 [Note] WSREP: applier thread exiting (code:8)
150127  0:19:16 [Note] WSREP: starting shutdown
150127  0:19:16 [Note] /var/lib/mysql/softwares/Percona-XtraDB-Cluster-5.5.37-rel35.0-25.10.756.Linux.x86_64/bin/mysqld: Normal shutdown

150127  0:19:16 [Note] WSREP: Stop replication
150127  0:19:16 [Note] WSREP: Closing send monitor...
150127  0:19:16 [Note] WSREP: Closed send monitor.
150127  0:19:18 [Note] WSREP: killing local connection: 1157231
150127  0:19:18 [Note] WSREP: killing local connection: 1157230
150127  0:19:18 [Note] WSREP: killing local connection: 1157229
(...)

DONOR err log:

150127  0:19:15 [Note] WSREP: Node 0 (node01_dpadb1) requested state transfer from '*any*'. Selected 2 (node02_dpadb2)(SYNCED) as donor.
150127  0:19:15 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 557853512)
150127  0:19:15 [Note] WSREP: IST request: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557847539-557849318|tcp://10.16.72.134:4568
150127  0:19:15 [Note] WSREP: async IST sender starting to serve tcp://10.16.72.134:4568 sending 557847540-557852431
150127  0:19:16 [Note] WSREP: async IST sender served
150127  0:19:16 [Note] WSREP: 2 (node02_dpadb2): State transfer to 0 (node01_dpadb1) complete.
150127  0:19:16 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 557853568)
150127  0:19:16 [Note] WSREP: 0 (node01_dpadb1): State transfer from 2 (node02_dpadb2) complete.
150127  0:19:16 [Note] WSREP: Member 2 (node02_dpadb2) synced with group.
150127  0:19:16 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 557853568)
150127  0:19:16 [Note] WSREP: Synchronized with group, ready for connections
150127  0:19:16 [Note] WSREP: (c3b08473-9ef9-11e4-8910-021ca5271274, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.126:4567' pointing to uuid c3b08473-9ef9-11e4-8910-021ca5271274 is blacklisted, skipping
150127  0:19:16 [Note] WSREP: declaring 29fccfb4-eed6-11e3-0800-36617919e986 stable
150127  0:19:16 [Note] WSREP: (c3b08473-9ef9-11e4-8910-021ca5271274, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.16.72.134:4567
150127  0:19:16 [Note] WSREP: (c3b08473-9ef9-11e4-8910-021ca5271274, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.126:4567' pointing to uuid c3b08473-9ef9-11e4-8910-021ca5271274 is blacklisted, skipping
150127  0:19:16 [Note] WSREP: Node 29fccfb4-eed6-11e3-0800-36617919e986 state prim
150127  0:19:16 [Note] WSREP: view(view_id(PRIM,29fccfb4-eed6-11e3-0800-36617919e986,71) memb {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
} joined {
} left {
} partitioned {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
})
150127  0:19:16 [Note] WSREP: forgetting 12824b44-9efb-11e4-bcc1-2b00f5a53f58 (tcp://10.16.72.134:4567)
150127  0:19:16 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
150127  0:19:16 [Note] WSREP: (c3b08473-9ef9-11e4-8910-021ca5271274, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.126:4567' pointing to uuid c3b08473-9ef9-11e4-8910-021ca5271274 is blacklisted, skipping
150127  0:19:16 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
150127  0:19:16 [Note] WSREP: (c3b08473-9ef9-11e4-8910-021ca5271274, 'tcp://0.0.0.0:4567') turning message relay requesting off
150127  0:19:16 [Note] WSREP: STATE EXCHANGE: sent state msg: 09bfc088-a5e4-11e4-0800-f430947935d3
150127  0:19:16 [Note] WSREP: STATE EXCHANGE: got state msg: 09bfc088-a5e4-11e4-0800-f430947935d3 from 0 (garb)
150127  0:19:16 [Note] WSREP: STATE EXCHANGE: got state msg: 09bfc088-a5e4-11e4-0800-f430947935d3 from 1 (node02_dpadb2)
150127  0:19:16 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 64,
        members    = 2/2 (joined/total),
        act_id     = 557853571,
        last_appl. = 557853516,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe
150127  0:19:16 [Note] WSREP: Flow-control interval: [23, 23]
150127  0:19:16 [Note] WSREP: New cluster view: global state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557853571, view# 65: Primary, number of nodes: 2, my index: 1, protocol version 2
150127  0:19:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
150127  0:19:16 [Note] WSREP: Assign initial position for certification: 557853571, protocol version: 2
150127  0:19:21 [Note] WSREP:  cleaning up 12824b44-9efb-11e4-bcc1-2b00f5a53f58 (tcp://10.16.72.134:4567)

GARBD log:

2015-01-27 00:18:58.210  INFO: STATE EXCHANGE: got state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc from 0 (node01_dpadb1)
2015-01-27 00:18:58.210  INFO: STATE EXCHANGE: got state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc from 1 (garb)
2015-01-27 00:18:58.210  INFO: STATE EXCHANGE: got state msg: fefca897-a5e3-11e4-9b8d-5a7cd36e01dc from 2 (node02_dpadb2)
2015-01-27 00:18:58.210  INFO: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 63,
        members    = 2/3 (joined/total),
        act_id     = 557852431,
        last_appl. = 557851999,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe
2015-01-27 00:18:58.210  INFO: Flow-control interval: [9999999, 9999999]
2015-01-27 00:19:16.245  WARN: Protocol violation. JOIN message sender 2 (node02_dpadb2) is not in state transfer (SYNCED). Message ignored.
2015-01-27 00:19:16.246  WARN: Rejecting JOIN message from 0 (node01_dpadb1): new State Transfer required.
2015-01-27 00:19:16.263  INFO: declaring c3b08473-9ef9-11e4-8910-021ca5271274 stable
2015-01-27 00:19:16.263  INFO: (29fccfb4-eed6-11e3-0800-36617919e986, 'tcp://0.0.0.0:4567') address 'tcp://10.16.72.122:4567' pointing to uuid 29fccfb4-eed6-11e3-0800-36617919e986 is blacklisted, skipping
2015-01-27 00:19:16.263  INFO: (29fccfb4-eed6-11e3-0800-36617919e986, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.16.72.134:4567
2015-01-27 00:19:16.264  INFO: view(view_id(PRIM,29fccfb4-eed6-11e3-0800-36617919e986,71) memb {
        29fccfb4-eed6-11e3-0800-36617919e986,
        c3b08473-9ef9-11e4-8910-021ca5271274,
} joined {
} left {
} partitioned {
        12824b44-9efb-11e4-bcc1-2b00f5a53f58,
})
PrzemekMalkowski commented 9 years ago

i50320

ayurchen commented 9 years ago

Ok, what seems to be happening here is that the joiner can't survive too frequent configuration changes. "Too frequent" meaning that it disconnects and reconnects to group before being able to flush the slave queue. In this case JOINING state is cleared by (multiple) non-primary views which ruin the protective heuristics (effectively the node forgets that it was receiving state transfer)

How "too frequent" is in this case? Note the first PC in the joiner log:

150127  0:18:09 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 61,
        members    = 2/3 (joined/total),
        act_id     = 557850488,
        last_appl. = 557847332,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe

This is bottom layer, before slave queue. Notice act_id value. Only 48 seconds later the node processes PC at the top layer:

150127  0:18:57 [Note] WSREP: State transfer required: 
        Group state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557849318
        Local state: b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe:557847539

Notice group state seqno. It is lower than the act_id value above. That means that it is not even this PC event being processed But some earlier one, which is not even in the log. Apparently it happened so way back that was not even considered to be related.

And then another disconnect from PC and another reconnect:

150127  0:18:58 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 63,
        members    = 2/3 (joined/total),
        act_id     = 557852431,
        last_appl. = 557847332,
        protocols  = 0/4/2 (gcs/repl/appl),
        group UUID = b5a8d6fc-dbf5-11e3-ac98-a73ea7e811fe

And only after that the state request is sent:

150127  0:19:15 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.16.72.134:4568
150127  0:19:15 [Note] WSREP: Node 0 (node01_dpadb1) requested state transfer from '*any*'. Selected 2 (node02_dpadb2)(SYNCED) as donor.
150127  0:19:15 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 557853512)
150127  0:19:15 [Note] WSREP: Requesting state transfer: success, donor: 2

And the the first PC in the log is being processed only at

150127  0:19:16 [ERROR] WSREP: Local state seqno (557852431) is greater than group seqno (557850488): states diverged. Aborting to avoid potential data loss. Remove '/data/mysql/data//grastate.dat' file and restart if you wish to continue. (FATAL)
         at galera/src/replicator_str.cpp:state_transfer_required():33

This is clearly a bug. But in this case the node was clearly overloaded and probably suboptimally configured:

150127  0:18:18 [Warning] WSREP: last inactive check more than PT1.5S ago (PT5.35902S), skipping check

is a clear sign of that. One thing that can be done about it - increasing gcomm timeouts to prevent excessive cluster partitionings.