Slave with more Seconds_Behind_Master is selected as master

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Set up a cluster with 1 master and 2 slaves. In the master_ha config file, 
list the hosts in the order master, slave1, slave2, then start 
masterha_manager for this cluster.
2. Stop the slave on slave1 so that the SQL thread is not running.
3. Start the slave on slave1 once Seconds_Behind_Master is large enough that 
failover can run before the value drops to 0, while slave2 has 
Seconds_Behind_Master=0. Shut down the master so that the manager starts failing over.

What is the expected output? What do you see instead?
slave2 should be promoted as master. Instead, slave1 is promoted, following the 
default selection of a new master among the existing slaves 
(http://code.google.com/p/mysql-master-ha/wiki/FAQ#Which_host_is_selected_as_a_new_master?), 
and it takes time before all the relay logs are applied to the new master.

What version of the product are you using? On what operating system?
mha4mysql-manager                      0.55-0                       
mha4mysql-node                         0.54-0                      
percona-server-server-5.5              5.5.32-rel31.0-549.squeeze
Linux version 2.6.32-5-amd64        (Debian 2.6.32-48squeeze3)

Please provide any additional information below.

Is there an option to select the new master based on Exec_Master_Log_Pos 
when mysql-master-ha considers both slaves to be the latest?

I have made a slight modification to achieve this though. I will be sending it 
to you in a separate mail.
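
To illustrate the idea (this is not the modification mentioned above, and not MHA's own code), here is a minimal Python sketch that picks the slave whose SQL thread has applied the most. It assumes the Relay_Master_Log_File and Exec_Master_Log_Pos values from SHOW SLAVE STATUS have already been collected for each slave. The `SlaveStatus` class, the helper names, and the sample hostnames and positions are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SlaveStatus:
    """Hypothetical container for two SHOW SLAVE STATUS fields."""
    host: str
    relay_master_log_file: str   # e.g. "mysql-bin.000012"
    exec_master_log_pos: int


def binlog_index(name: str) -> int:
    # Binary log names end in a numeric suffix, e.g. "mysql-bin.000012" -> 12.
    return int(name.rsplit(".", 1)[1])


def exec_position(s: SlaveStatus) -> Tuple[int, int]:
    # Compare by binlog file first, then by the executed offset within it.
    return (binlog_index(s.relay_master_log_file), s.exec_master_log_pos)


def pick_most_applied(slaves: List[SlaveStatus]) -> SlaveStatus:
    # max() returns the first maximal element, so ties keep the config order.
    return max(slaves, key=exec_position)


# Made-up sample values: slave1 lags in applying events, slave2 does not.
slaves = [
    SlaveStatus("slave1", "mysql-bin.000012", 1200),
    SlaveStatus("slave2", "mysql-bin.000012", 98000),
]
print(pick_most_applied(slaves).host)
```

With these sample values the sketch selects slave2, since it has executed further into the same master binary log. On a tie it falls back to the order in which the slaves are listed.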

Original issue reported on code.google.com by Gaurav.j...@gmail.com on 23 Sep 2013 at 9:36

GoogleCodeExporter commented 9 years ago

Could you paste MHA configuration file? If candidate_master=1 is set on slave1 
and not set on slave2, slave1 is chosen as a new master regardless of 
replication lag.
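
As an illustration of that setting, here is a minimal sketch of a config in which slave2, rather than slave1, carries candidate_master=1. The section names and hostnames are hypothetical placeholders, not taken from the reporter's setup.

```ini
[server1]
hostname=master-host

[server2]
hostname=slave1-host

[server3]
hostname=slave2-host
# mark this host as the preferred candidate for the new master
candidate_master=1
```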

Original comment by Yoshinor...@gmail.com on 23 Sep 2013 at 10:43


GoogleCodeExporter commented 9 years ago

Below is my app.cnf entry:

[server default]
# working directory on the manager
manager_workdir=/var/log/masterha/test-ha-db
# manager log file
manager_log=/var/log/masterha/test-ha-db/test-ha-db.log
master_binlog_dir=/var/log/mysql
check_repl_filter=0

[server1]
hostname=test-ha-db1
ignore_fail=1

[server2]
hostname=test-ha-db2
ignore_fail=1

[server3]
hostname=test-ha-db3
ignore_fail=1

Below is global.cnf entry:

[server default]
user=root
password=xxxx
ssh_user=root
master_binlog_dir= /var/lib/mysql
remote_workdir=/home/mha/masterha/log/
ping_interval=3
master_ip_online_change_script=/etc/conf/masterha/master_ip_online_change

Original comment by Gaurav.j...@gmail.com on 24 Sep 2013 at 3:03


GoogleCodeExporter commented 9 years ago

I also tried removing "ignore_fail=1" from all servers, but the result was the 
same.

Original comment by Gaurav.j...@gmail.com on 24 Sep 2013 at 12:42


huadaonan / mysql-master-ha

Slave with more Seconds_Behind_Master is selected as master #67