signal18 / replication-manager

Signal 18 repman - Replication Manager for MySQL / MariaDB / Percona Server
https://signal18.io/products/srm
GNU General Public License v3.0

"ERR00015" ,Could not get privileges for user #234

Open nigel889 opened 6 years ago

nigel889 commented 6 years ago

Used replication-manager-pro-2.0.0_20_g97b7a-1.x86_64.rpm with Percona Server 5.6.32, gtid_mode = OFF, 1 master --> 2 slaves:

master: 10.1.1.173
slave1: 10.1.1.171
slave2: 10.1.1.172

When starting mrm, two errors are logged:

1. "ERR00015", "Could not get privileges for user rpl on server 10.1.1.172:3306: No replication user defined. Please check the replication user is created with the required privileges"
2. "ERR00005", "Could not get privileges for user mrm@PROXY-NODE-1: No replication user defined. Please check the replication user is created with the required privileges"

PS:

1. All MySQL hosts in the cluster already have the rpl user and grant: GRANT REPLICATION SLAVE ON *.* TO 'rpl'@'10.%';
2. Only 10.1.1.172:3306 shows ERR00015 in the error log; the other hosts do not.
3. The mrm user has all privileges on all cluster hosts.
4. PROXY-NODE-1 is the hostname of the host running MRM.

How can I fix these errors?
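For context, a hedged sketch of how such users are often created, assuming the monitor checks the SELECT, PROCESS, SUPER, REPLICATION SLAVE, REPLICATION CLIENT and RELOAD privileges (host patterns and passwords here are placeholders; IDENTIFIED BY in GRANT is valid on 5.6):

```sql
-- Hypothetical illustration, not the exact commands used in this cluster.
-- Monitoring user with the privilege set a monitor typically needs:
GRANT SELECT, PROCESS, SUPER, REPLICATION SLAVE, REPLICATION CLIENT, RELOAD
  ON *.* TO 'mrm'@'10.1.1.%' IDENTIFIED BY 'secret';
-- Dedicated replication user for the slaves:
GRANT REPLICATION SLAVE ON *.* TO 'rpl'@'10.1.1.%' IDENTIFIED BY 'secret';
```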

svaroqui commented 6 years ago

Hi Nigel,

Can you run with log-level=3 and send us the log? Thanks

/svar

svaroqui commented 6 years ago

Other than that, you can try starting the db nodes with skip-name-resolve; that is good practice anyway!
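A minimal sketch of that setting in my.cnf (standard MySQL server section; with it enabled, grants must use IP addresses or IP patterns rather than hostnames):

```ini
[mysqld]
# Skip reverse DNS lookups on client connections.
skip-name-resolve
```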

nigel889 commented 6 years ago

2018/05/16 12:10:28 [pub] INFO - Failover in interactive mode
2018/05/16 12:10:28 [pub] INFO - Loading 0 proxies
2018/05/16 12:10:28 [pub] DEBUG - Monitoring server loop
2018/05/16 12:10:28 [pub] DEBUG - Server [0]: URL: 10.1.1.171:3306 State: Suspect PrevState: Suspect
2018/05/16 12:10:28 [pub] DEBUG - Server [1]: URL: 10.1.1.172:3306 State: Suspect PrevState: Suspect
2018/05/16 12:10:28 [pub] DEBUG - Server [2]: URL: 10.1.1.173:3306 State: Suspect PrevState: Suspect
2018/05/16 12:10:28 [pub] DEBUG - State unconnected set by non-master rule on server 10.1.1.171:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.171:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.172:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.173:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.171:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.172:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.173:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:28 [pub] INFO - Set stateSlave from rejoin slave 10.1.1.172:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.171:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.172:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.173:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.171:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.172:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Lookup server 10.1.1.173:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:28 [pub] INFO - Set stateSlave from rejoin slave 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Server 10.1.1.171:3306 was set master as last non slave
2018/05/16 12:10:28 [pub] DEBUG - Privilege check on 10.1.1.171:3306
2018/05/16 12:10:28 [pub] DEBUG - Client connection found on server 10.1.1.171:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:28 [pub] DEBUG - Server 10.1.1.172:3306 is configured as a slave
2018/05/16 12:10:28 [pub] DEBUG - Privilege check on 10.1.1.172:3306
2018/05/16 12:10:28 [pub] DEBUG - Client connection found on server 10.1.1.172:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:28 [pub] DEBUG - Server 10.1.1.173:3306 is configured as a slave
2018/05/16 12:10:28 [pub] DEBUG - Privilege check on 10.1.1.173:3306
2018/05/16 12:10:28 [pub] DEBUG - Client connection found on server 10.1.1.173:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:28 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:28 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:28 [pub] DEBUG - Checking if server 10.1.1.172 is a slave of server 10.1.1.171
2018/05/16 12:10:28 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:28 [pub] DEBUG - Checking if server 10.1.1.173 is a slave of server 10.1.1.171
2018/05/16 12:10:28 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0060 : No semisync settings on master 10.1.1.171:3306
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0070 : No GTID strict mode on master 10.1.1.171:3306
2018/05/16 12:10:28 [pub] STATE - OPENED ERR00005 : Could not get privileges for user mrm@PROXY-NODE-1: No replication user defined. Please check the replication user is created with the required privileges
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0048 : No semisync settings on slave 10.1.1.172:3306
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0058 : No GTID strict mode on slave 10.1.1.172:3306
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0054 : No log of replication queries in slow query on slave 10.1.1.172:3306
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0062 : No Heartbeat <= 1s on master 10.1.1.171:3306
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0064 : No InnoDB durability on master 10.1.1.171:3306
2018/05/16 12:10:28 [pub] STATE - OPENED ERR00021 : All cluster db servers down
2018/05/16 12:10:28 [pub] STATE - OPENED ERR00015 : Could not get privileges for user rpl on server 10.1.1.172:3306: No replication user defined. Please check the replication user is created with the required privileges
2018/05/16 12:10:28 [pub] STATE - OPENED WARN0052 : No InnoDB durability on slave 10.1.1.172:3306
2018/05/16 12:10:30 [pub] DEBUG - Monitoring server loop
2018/05/16 12:10:30 [pub] DEBUG - Server [0]: URL: 10.1.1.171:3306 State: Master PrevState: StandAlone
2018/05/16 12:10:30 [pub] DEBUG - Server [1]: URL: 10.1.1.172:3306 State: Slave PrevState: Slave
2018/05/16 12:10:30 [pub] DEBUG - Server [2]: URL: 10.1.1.173:3306 State: Slave PrevState: Slave
2018/05/16 12:10:30 [pub] DEBUG - Master [ ]: URL: 10.1.1.171:3306 State: Master PrevState: StandAlone
2018/05/16 12:10:30 [pub] DEBUG - Slave [0]: URL: 10.1.1.172:3306 State: Slave PrevState: Slave
2018/05/16 12:10:30 [pub] DEBUG - Slave [1]: URL: 10.1.1.173:3306 State: Slave PrevState: Slave
2018/05/16 12:10:30 [pub] DEBUG - Server 10.1.1.171:3306 was set master as last non slave
2018/05/16 12:10:30 [pub] DEBUG - Privilege check on 10.1.1.171:3306
2018/05/16 12:10:30 [pub] DEBUG - Client connection found on server 10.1.1.171:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:30 [pub] DEBUG - Server 10.1.1.172:3306 is configured as a slave
2018/05/16 12:10:30 [pub] DEBUG - Privilege check on 10.1.1.172:3306
2018/05/16 12:10:30 [pub] DEBUG - Client connection found on server 10.1.1.172:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:30 [pub] DEBUG - Server 10.1.1.173:3306 is configured as a slave
2018/05/16 12:10:30 [pub] DEBUG - Privilege check on 10.1.1.173:3306
2018/05/16 12:10:30 [pub] DEBUG - Client connection found on server 10.1.1.173:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:30 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:30 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:30 [pub] DEBUG - Checking if server 10.1.1.172 is a slave of server 10.1.1.171
2018/05/16 12:10:30 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:30 [pub] DEBUG - Checking if server 10.1.1.173 is a slave of server 10.1.1.171
2018/05/16 12:10:30 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:30 [pub] STATE - RESOLV ERR00021 : All cluster db servers down
2018/05/16 12:10:30 [pub] STATE - OPENED WARN0007 : At least one server is not ACID-compliant. Please make sure that sync_binlog and innodb_flush_log_at_trx_commit are set to 1
2018/05/16 12:10:31 [pub] DEBUG - Lookup server 10.1.1.171:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:31 [pub] DEBUG - Lookup server 10.1.1.172:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:31 [pub] DEBUG - Lookup server 10.1.1.173:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:32 [pub] DEBUG - Lookup server 10.1.1.171:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:32 [pub] DEBUG - Lookup server 10.1.1.172:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:32 [pub] DEBUG - Lookup server 10.1.1.173:3306 if maxscale binlog server: 10.1.1.173:3306
2018/05/16 12:10:32 [pub] DEBUG - Monitoring server loop
2018/05/16 12:10:32 [pub] DEBUG - Server [0]: URL: 10.1.1.171:3306 State: Master PrevState: Master
2018/05/16 12:10:32 [pub] DEBUG - Server [1]: URL: 10.1.1.172:3306 State: Slave PrevState: Slave
2018/05/16 12:10:32 [pub] DEBUG - Server [2]: URL: 10.1.1.173:3306 State: Slave PrevState: Slave
2018/05/16 12:10:32 [pub] DEBUG - Master [ ]: URL: 10.1.1.171:3306 State: Master PrevState: Master
2018/05/16 12:10:32 [pub] DEBUG - Slave [0]: URL: 10.1.1.172:3306 State: Slave PrevState: Slave
2018/05/16 12:10:32 [pub] DEBUG - Slave [1]: URL: 10.1.1.173:3306 State: Slave PrevState: Slave
2018/05/16 12:10:32 [pub] DEBUG - Server 10.1.1.171:3306 was set master as last non slave
2018/05/16 12:10:32 [pub] DEBUG - Privilege check on 10.1.1.171:3306
2018/05/16 12:10:32 [pub] DEBUG - Client connection found on server 10.1.1.171:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:32 [pub] DEBUG - Server 10.1.1.172:3306 is configured as a slave
2018/05/16 12:10:32 [pub] DEBUG - Privilege check on 10.1.1.172:3306
2018/05/16 12:10:32 [pub] DEBUG - Client connection found on server 10.1.1.172:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:32 [pub] DEBUG - Server 10.1.1.173:3306 is configured as a slave
2018/05/16 12:10:32 [pub] DEBUG - Privilege check on 10.1.1.173:3306
2018/05/16 12:10:32 [pub] DEBUG - Client connection found on server 10.1.1.173:3306 with IP 10.1.1.174 for host 10.1.1.174
2018/05/16 12:10:32 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:32 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:32 [pub] DEBUG - Checking if server 10.1.1.172 is a slave of server 10.1.1.171
2018/05/16 12:10:32 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171
2018/05/16 12:10:32 [pub] DEBUG - Checking if server 10.1.1.173 is a slave of server 10.1.1.171
2018/05/16 12:10:32 [pub] DEBUG - GetMasterFromReplication server 171 lookup if server 10.1.1.171:3306 is the one : 171

svaroqui commented 6 years ago

There is an error returned by this function call:

priv, err := dbhelper.GetPrivileges(server.Conn, server.ClusterGroup.dbUser, server.ClusterGroup.repmgrHostname, myip)

with myip = 10.1.1.174

The function runs the following prepared statement, and in theory your grant on 10.% is tested:

// Build wildcard host patterns from the monitor IP.
splitip := strings.Split(ip, ".")

iprange1 := splitip[0] + ".%.%.%"
iprange2 := splitip[0] + "." + splitip[1] + ".%.%"
iprange3 := splitip[0] + "." + splitip[1] + "." + splitip[2] + ".%"
// Aggregate the privilege flags over every mysql.user row whose host matches a candidate pattern.
stmt := "SELECT MAX(Select_priv) as Select_priv, MAX(Process_priv) as Process_priv, MAX(Super_priv) as Super_priv, MAX(Repl_slave_priv) as Repl_slave_priv, MAX(Repl_client_priv) as Repl_client_priv, MAX(Reload_priv) as Reload_priv FROM mysql.user WHERE user = ? AND host IN(?,?,?,?,?,?,?,?,?)"
row := db.QueryRowx(stmt, user, host, ip, "%", ip+"/255.0.0.0", ip+"/255.255.0.0", ip+"/255.255.255.0", iprange1, iprange2, iprange3)
err := row.StructScan(&priv)
if err != nil && strings.Contains(err.Error(), "unsupported Scan") {
    // No matching row leaves the MAX() aggregates NULL, and scanning NULL
    // into the struct fails; that failure surfaces as ERR00005/ERR00015.
    return priv, errors.New("No replication user defined. Please check the replication user is created with the required privileges")
}
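To make the host matching concrete, here is a standalone sketch (hostCandidates is a hypothetical helper name, not part of replication-manager) of the patterns that statement binds into the IN(...) clause for host = PROXY-NODE-1 and myip = 10.1.1.174:

```go
package main

import (
	"fmt"
	"strings"
)

// hostCandidates mirrors the candidate host patterns built by the snippet
// above: the monitor hostname, the exact IP, the bare wildcard, three
// IP/netmask forms, and three dotted wildcard ranges.
func hostCandidates(host, ip string) []string {
	splitip := strings.Split(ip, ".")
	iprange1 := splitip[0] + ".%.%.%"
	iprange2 := splitip[0] + "." + splitip[1] + ".%.%"
	iprange3 := splitip[0] + "." + splitip[1] + "." + splitip[2] + ".%"
	return []string{
		host, ip, "%",
		ip + "/255.0.0.0", ip + "/255.255.0.0", ip + "/255.255.255.0",
		iprange1, iprange2, iprange3,
	}
}

func main() {
	// Values taken from this issue's log: monitor hostname and IP.
	for _, c := range hostCandidates("PROXY-NODE-1", "10.1.1.174") {
		fmt.Println(c)
	}
}
```

Note that neither 10.% nor 10.1.% appears among the nine generated patterns, so a grant stored as 'rpl'@'10.%' or 'mrm'@'10.1.%' cannot be found by this exact-string lookup, even though MySQL itself would match those host patterns; that could explain why the aggregates come back NULL and the "unsupported Scan" branch fires.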

We can see in your log the message "No replication user defined. Please check the replication user is created with the required privileges".

Can you send me a DESCRIBE of the mysql.user table on Percona? Maybe there are differences with other flavors I'm not aware of.

What is the exact user entry you are using (password obfuscated), and does your password contain any special characters that could cause this statement to fail?

svaroqui commented 6 years ago

Nigel, also note that replication-manager-pro stands for provisioning and is used with OpenSVC for Docker deployments. Is that what you would like to test?

If that is the case, you should build an OpenSVC cluster of agents before using it, and use our public collector to get database compliances!

svaroqui commented 6 years ago

For other scenarios, the -osc release is the one you should use!

nigel889 commented 6 years ago

1. select * from mysql.user where User in ('mrm','rpl')\G

1. row
Host: 10.%
User: rpl
Password: *D37209741C8249F4589CC9D0898DEC00
Select_priv: N
Insert_priv: N
Update_priv: N
Delete_priv: N
Create_priv: N
Drop_priv: N
Reload_priv: N
Shutdown_priv: N
Process_priv: N
File_priv: N
Grant_priv: N
References_priv: N
Index_priv: N
Alter_priv: N
Show_db_priv: N
Super_priv: N
Create_tmp_table_priv: N
Lock_tables_priv: N
Execute_priv: N
Repl_slave_priv: Y
Repl_client_priv: N
Create_view_priv: N
Show_view_priv: N
Create_routine_priv: N
Alter_routine_priv: N
Create_user_priv: N
Event_priv: N
Trigger_priv: N
Create_tablespace_priv: N
ssl_type:
ssl_cipher:
x509_issuer:
x509_subject:
max_questions: 0
max_updates: 0
max_connections: 0
max_user_connections: 0
plugin: mysql_native_password
authentication_string:
password_expired: N

2. row
Host: 10.1.%
User: mrm
Password: *D0D15496379A0DD9D7AA3CEA17E1
Select_priv: Y
Insert_priv: Y
Update_priv: Y
Delete_priv: Y
Create_priv: Y
Drop_priv: Y
Reload_priv: Y
Shutdown_priv: Y
Process_priv: Y
File_priv: Y
Grant_priv: N
References_priv: Y
Index_priv: Y
Alter_priv: Y
Show_db_priv: Y
Super_priv: Y
Create_tmp_table_priv: Y
Lock_tables_priv: Y
Execute_priv: Y
Repl_slave_priv: Y
Repl_client_priv: Y
Create_view_priv: Y
Show_view_priv: Y
Create_routine_priv: Y
Alter_routine_priv: Y
Create_user_priv: Y
Event_priv: Y
Trigger_priv: Y
Create_tablespace_priv: Y
ssl_type:
ssl_cipher:
x509_issuer:
x509_subject:
max_questions: 0
max_updates: 0
max_connections: 0
max_user_connections: 0
plugin: mysql_native_password
authentication_string:
password_expired: N

2. replication-manager-cli api --url="https://127.0.0.1:10005/api/clusters/pub/topology/servers"
[Uploading 1.txt…]

3. replication-manager-cli api --url="https://127.0.0.1:10005/api/clusters/pub/settings"
Enter Password:
{
  "enterprise": "true",
  "interactive": "true",
  "failoverctr": "0",
  "maxdelay": "30",
  "faillimit": "3",
  "lastfailover": "N/A",
  "monheartbeats": "3086",
  "uptime": "0.00025",
  "uptimefailable": "0.00025",
  "uptimesemisync": "0.00000",
  "rplchecks": "true",
  "failsync": "false",
  "switchsync": "false",
  "verbose": "true",
  "rejoin": "true",
  "rejoinbackupbinlog": "true",
  "rejoinsemisync": "true",
  "rejoinflashback": "false",
  "rejoinunsafe": "false",
  "rejoindump": "false",
  "rejoinpseudogtid": "",
  "test": "true",
  "heartbeat": "false",
  "runstatus": "A",
  "isactive": "",
  "confgroup": "pub",
  "monitoringticker": "2",
  "failresettime": "0",
  "tosessionend": "3600",
  "httpauth": "false",
  "httpbootstrapbutton": "false",
  "graphitemetrics": "",
  "clusters": [ "pub" ],
  "regtests": [
    "testSwitchoverAllSlavesDelayMultimasterNoRplChecksNoSemiSync", "testSwitchoverLongTransactionNoRplCheckNoSemiSync",
    "testSwitchoverLongQueryNoRplCheckNoSemiSync", "testSwitchoverLongTrxWithoutCommitNoRplCheckNoSemiSync",
    "testSwitchoverReadOnlyNoRplCheck", "testSwitchoverNoReadOnlyNoRplCheck",
    "testSwitchover2TimesReplicationOkNoSemiSyncNoRplCheck", "testSwitchover2TimesReplicationOkSemiSyncNoRplCheck",
    "testSwitchoverBackPreferedMasterNoRplCheckSemiSync", "testSwitchoverAllSlavesStopRplCheckNoSemiSync",
    "testSwitchoverAllSlavesStopNoSemiSyncNoRplCheck", "testSwitchoverAllSlavesDelayRplCheckNoSemiSync",
    "testSwitchoverAllSlavesDelayNoRplChecksNoSemiSync", "testFailoverSemisyncAutoRejoinSafeMSMXMS",
    "testFailoverSemisyncAutoRejoinSafeMSXMSM", "testFailoverSemisyncAutoRejoinSafeMSMXXXRMXMS",
    "testFailoverSemisyncAutoRejoinSafeMSMXXXRXSMS", "testFailoverSemisyncAutoRejoinUnsafeMSMXMS",
    "testFailoverSemisyncAutoRejoinUnsafeMSMXXXMXMS", "testFailoverSemisyncAutoRejoinUnsafeMSMXXXXMSM",
    "testFailoverSemisyncAutoRejoinUnsafeMSXMSM", "testFailoverSemisyncAutoRejoinUnsafeMSXMXXMXMS",
    "testFailoverSemisyncAutoRejoinUnsafeMSXMXXXMSM", "testFailoverSemisyncAutoRejoinUnsafeMSMXXXRMXMS",
    "testFailoverSemisyncAutoRejoinUnsafeMSMXXXRXMSM", "testFailoverAssyncAutoRejoinRelay",
    "testFailoverAssyncAutoRejoinNoGtid", "testFailoverAllSlavesDelayNoRplChecksNoSemiSync",
    "testFailoverAllSlavesDelayRplChecksNoSemiSync", "testFailoverNoRplChecksNoSemiSync",
    "testFailoverNoRplChecksNoSemiSyncMasterHeartbeat", "testFailoverNumberFailureLimitReach",
    "testFailoverTimeNotReach", "testFailoverManual",
    "testFailoverAssyncAutoRejoinFlashback", "testFailoverSemisyncAutoRejoinFlashback",
    "testFailoverAssyncAutoRejoinNowrites", "testFailoverSemisyncAutoRejoinMSSXMSXXMSXMSSM",
    "testFailoverSemisyncAutoRejoinMSSXMSXXMXSMSSM", "testFailoverSemisyncSlavekilledAutoRejoin",
    "testSlaReplAllSlavesStopNoSemiSync", "testSlaReplAllSlavesDelayNoSemiSync"
  ],
  "topology": "master-slave",
  "version": "",
  "databasetags": null,
  "proxytags": null
}

svaroqui commented 6 years ago

Sorry, the attachment failed for some reason, maybe the size, or the json file format is unsupported. Can you send it to me at stephane@signal18.io with a dump of the user table, so that I can recreate it for testing?

nigel889 commented 6 years ago

@svaroqui ok

svaroqui commented 6 years ago

Yop, so the user passed to the function is "mrm", no surprise here! The dump of mysql.user can help to reproduce.

svaroqui commented 6 years ago

There is an SQL error log plugin in MariaDB, which does not yet exist in Percona, that would track whether replication-manager sends a wrong SQL command. Other than that, you can activate the database general log and see the queries sent by replication-manager. I'm interested in seeing the "SELECT MAX(Select_priv) as Select_priv, MAX(Process_priv) as Process_priv, ..." one, with the arguments passed to it.

nigel889 commented 6 years ago

Of the 3 hosts, only 10.1.1.172:3306 reports: Could not get privileges for user rpl on server 10.1.1.172:3306: No replication user defined. Please check the replication user is created with the required privileges

svaroqui commented 6 years ago

Could it be that you did not use GRANT but modified the underlying tables directly, without running FLUSH PRIVILEGES?
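The distinction matters because GRANT and REVOKE update the server's in-memory grant tables immediately, while direct edits to mysql.user are only picked up after a reload. A sketch of the direct-edit case (host/user values are placeholders):

```sql
-- Direct table edit: not visible to the server until privileges are reloaded.
UPDATE mysql.user SET Repl_slave_priv = 'Y' WHERE User = 'rpl' AND Host = '10.%';
FLUSH PRIVILEGES;  -- reload the grant tables so the change takes effect
```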

nigel889 commented 6 years ago

Used replication-manager-pro-2.0.0_20_g97b7a-1.x86_64.rpm with Percona Server 5.6.32, gtid_mode = OFF, 1 master --> 2 slaves:

master: 10.1.1.173
slave1: 10.1.1.171
slave2: 10.1.1.172

failover-mode = "automatic" or "manual", autorejoin = "false" or "true"

Failover failure: when I stop the mysql service on 10.1.1.173, the new master is 10.1.1.171, but the mrm state of 10.1.1.172 is Maintenance, and on that host mysql> show slave status \G shows:

Master_Host: 10.1.1.173
Last_IO_Error: error connecting to master 'rpl@10.1.1.173:3306' - retry-time: 5 retries: 19

When I manually restart mysql on 10.1.1.173, "Master_Host: 10.1.1.173" changes to "Master_Host: 10.1.1.171" and the mrm state becomes slave running. Why?

PS: replication-manager-cli switchover works fine.

svaroqui commented 6 years ago

It's because you are not using GTID or pseudo-GTID, so 10.1.1.172 stays under the old master until the old master 10.1.1.173 is restarted. Without GTID there is no way to make a positional change for an extra slave, as its replication position is unknown to the new master.
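To illustrate the difference: without GTID a slave must be repointed with an explicit binlog file and position on the new master, coordinates nobody can compute here; with GTID the new master resolves the start point itself. A hedged sketch (5.6 syntax; file/position values are placeholders):

```sql
-- Positional (no GTID): requires knowing the equivalent coordinates on the new master.
CHANGE MASTER TO MASTER_HOST='10.1.1.171', MASTER_USER='rpl',
  MASTER_LOG_FILE='mysql-bin.000042', MASTER_LOG_POS=12345;

-- With gtid_mode=ON: the slave negotiates its own start point.
CHANGE MASTER TO MASTER_HOST='10.1.1.171', MASTER_USER='rpl',
  MASTER_AUTO_POSITION=1;
```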

nigel889 commented 6 years ago

Which variable enables auto-rejoin with pseudo-GTID in config.toml? And when using auto-rejoin with pseudo-GTID, will the old master become a standalone master when it restarts? Thanks

svaroqui commented 6 years ago

Yes, standalone, or reattached to the cluster if possible via a rejoin method. The variable is --autorejoin-slave-positional-hearbeat
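In config.toml form this might look like the following sketch (the section name and key are assumptions derived from the flag named above and the "pub" cluster in this thread; check your release's documentation for the exact spelling):

```toml
[pub]
# Assumed key, derived from the --autorejoin-slave-positional-hearbeat flag.
autorejoin-slave-positional-hearbeat = true
```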