matthewbogner / mysql-master-ha

Automatically exported from code.google.com/p/mysql-master-ha

"Failover should not happen" but it should? #82

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

As part of my testing of MHA, instead of shutting down the MySQL service I 
forcefully killed the server with a hard power-off (like pulling the plug, 
only on a virtual machine).

The MHA manager correctly detected the outage, but it now keeps trying to 
connect in a loop, reporting that the server is not reachable, that this is 
likely a network problem, and hence that failover should not happen.

But since I did kill the master MySQL server, failover is exactly what should 
happen.

This is the output from the log file. Is further configuration required for 
failover to happen in this case? The mha.cnf is further below.

Log output:
Tue Apr 15 17:01:04 2014 - [warning] At least one of monitoring servers is not reachable from this script. This is likely a network problem. Failover should not happen.
Tue Apr 15 17:01:07 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 113)
Tue Apr 15 17:01:07 2014 - [warning] Connection failed 3 time(s)..
Tue Apr 15 17:01:10 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 113)
Tue Apr 15 17:01:10 2014 - [warning] Connection failed 4 time(s)..

This repeats over and over. The stdout output:
Tue Apr 15 16:56:13 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr 15 16:56:13 2014 - [info] Reading application default configuration from /etc/mha1.cnf..
Tue Apr 15 16:56:13 2014 - [info] Reading server configuration from /etc/mha1.cnf..
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
Timeout, server 192.168.56.8 not responding.
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
ssh: connect to host 192.168.56.8 port 22: No route to host
...

The cnf file is:

[server default]
log_level=debug
manager_log=/var/log/masterha/masterha_default/masterha_default.log
manager_workdir=/var/log/masterha/masterha_default
master_binlog_dir=/var/log/mysqld/
user=$user
password=$password

secondary_check_script=masterha_secondary_check -s c6mhamst -s c6mhaslv1 -s c6mhaslv2
master_ip_failover_script=/root/master_ip_failover

ping_interval=3
remote_workdir=/var/log/masterha/masterha_default

[server1]
hostname=c6mhamst
ignore_fail=1

[server2]
hostname=c6mhaslv1
ignore_fail=1

[server3]
hostname=c6mhaslv2
ignore_fail=1

Cheers,

Achim

Original issue reported on code.google.com by achim.re...@rightster.com on 15 Apr 2014 at 4:28

GoogleCodeExporter commented 9 years ago
> secondary_check_script=masterha_secondary_check -s c6mhamst -s c6mhaslv1 -s c6mhaslv2

Would you please try removing the master host, as below?
secondary_check_script=masterha_secondary_check -s c6mhaslv1 -s c6mhaslv2

Each -s <hostname> names a remote host that is used to verify that the master 
is dead, so that remote host itself has to be alive. "-s c6mhamst" means 
c6mhamst has to be alive, but I assume you are using that host as the master, 
so you shouldn't list the master's hostname here.

https://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script
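
To catch this misconfiguration before starting the manager, a small check along the following lines may help. It is only a sketch and not part of MHA: the helper names `secondary_check_hosts` and `master_listed_as_checker` are hypothetical, the master hostname has to be supplied by the caller, and it assumes the cnf file parses as INI with the `secondary_check_script` value on a single line.

```python
import configparser
import shlex


def secondary_check_hosts(cnf_text):
    """Return the hostnames passed via -s in secondary_check_script.

    Hypothetical helper: reads only the [server default] section.
    """
    # Interpolation is disabled so values such as "$user" or "%" are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    command = parser.get("server default", "secondary_check_script", fallback="")
    args = shlex.split(command)
    # Collect the argument that follows each "-s" flag.
    return [args[i + 1] for i, arg in enumerate(args[:-1]) if arg == "-s"]


def master_listed_as_checker(cnf_text, master_host):
    """True if the master itself is listed as a secondary check host."""
    return master_host in secondary_check_hosts(cnf_text)


# Sample mirroring the configuration from the original report.
SAMPLE = """
[server default]
secondary_check_script=masterha_secondary_check -s c6mhamst -s c6mhaslv1 -s c6mhaslv2

[server1]
hostname=c6mhamst
"""

if __name__ == "__main__":
    if master_listed_as_checker(SAMPLE, "c6mhamst"):
        print("c6mhamst is the master and should not appear after -s")
```

With the master removed from the `-s` list, as suggested above, `master_listed_as_checker` returns False for `c6mhamst`.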

Original comment by Yoshinor...@gmail.com on 16 Apr 2014 at 12:09

GoogleCodeExporter commented 9 years ago
Hi,

That has indeed done the trick, thanks for your reply.

Cheers,

Achim

Original comment by achim.re...@rightster.com on 17 Apr 2014 at 10:59