Open GoogleCodeExporter opened 8 years ago
This is my all log
------------------------
Sun Feb 16 19:21:34 2014 - [warning] Got timeout on MySQL Ping child process
and killed it! at /usr/local/share/perl5/MHA/HealthCheck.pm line 384.
Sun Feb 16 19:21:34 2014 - [info] Executing SSH check script: save_binary_logs
--command=test --start_pos=4 --binlog_dir=/var/lib/mysql/
--output_file=/var/log/masterha/ksh/save_binary_logs_test
--manager_version=0.55 --binlog_prefix=mysql-bin
Sun Feb 16 19:21:37 2014 - [warning] Got error on MySQL connect: 2003 (Can't
connect to MySQL server on '192.168.1.71' (4))
Sun Feb 16 19:21:37 2014 - [warning] Connection failed 1 time(s)..
Sun Feb 16 19:21:39 2014 - [warning] HealthCheck: Got timeout on checking SSH
connection to 192.168.1.71! at /usr/local/share/perl5/MHA/HealthCheck.pm line
298.
Sun Feb 16 19:21:40 2014 - [warning] Got error on MySQL connect: 2003 (Can't
connect to MySQL server on '192.168.1.71' (4))
Sun Feb 16 19:21:40 2014 - [warning] Connection failed 2 time(s)..
Sun Feb 16 19:21:43 2014 - [warning] Got error on MySQL connect: 2003 (Can't
connect to MySQL server on '192.168.1.71' (4))
Sun Feb 16 19:21:43 2014 - [warning] Connection failed 3 time(s)..
Sun Feb 16 19:21:43 2014 - [warning] Master is not reachable from health
checker!
Sun Feb 16 19:21:43 2014 - [warning] Master 192.168.1.71(192.168.1.71:3306) is
not reachable!
Sun Feb 16 19:21:43 2014 - [warning] SSH is NOT reachable.
Sun Feb 16 19:21:43 2014 - [info] Connecting to a master server failed. Reading
configuration file /etc/masterha_default.cnf and /etc/ksh.cnf again, and trying
to connect to all servers to check server status..
Sun Feb 16 19:21:43 2014 - [info] Reading default configuratoins from
/etc/masterha_default.cnf..
Sun Feb 16 19:21:43 2014 - [info] Reading application default configurations
from /etc/ksh.cnf..
Sun Feb 16 19:21:43 2014 - [info] Reading server configurations from
/etc/ksh.cnf..
Sun Feb 16 19:21:43 2014 - [info] Dead Servers:
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:43 2014 - [info] Alive Servers:
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Sun Feb 16 19:21:43 2014 - [info] Alive Slaves:
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 19:21:43 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 19:21:43 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:43 2014 - [info] Checking slave configurations..
Sun Feb 16 19:21:43 2014 - [info] read_only=1 is not set on slave
192.168.1.72(192.168.1.72:3306).
Sun Feb 16 19:21:43 2014 - [warning] relay_log_purge=0 is not set on slave
192.168.1.72(192.168.1.72:3306).
Sun Feb 16 19:21:43 2014 - [info] read_only=1 is not set on slave
192.168.1.73(192.168.1.73:3306).
Sun Feb 16 19:21:43 2014 - [warning] relay_log_purge=0 is not set on slave
192.168.1.73(192.168.1.73:3306).
Sun Feb 16 19:21:43 2014 - [info] Checking replication filtering settings..
Sun Feb 16 19:21:43 2014 - [info] Replication filtering check ok.
Sun Feb 16 19:21:43 2014 - [info] Master is down!
Sun Feb 16 19:21:43 2014 - [info] Terminating monitoring script.
Sun Feb 16 19:21:43 2014 - [info] Got exit code 20 (Master dead).
Sun Feb 16 19:21:43 2014 - [info] MHA::MasterFailover version 0.55.
Sun Feb 16 19:21:43 2014 - [info] Starting master failover.
Sun Feb 16 19:21:43 2014 - [info]
Sun Feb 16 19:21:43 2014 - [info] * Phase 1: Configuration Check Phase..
Sun Feb 16 19:21:43 2014 - [info]
Sun Feb 16 19:21:43 2014 - [info] Dead Servers:
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:43 2014 - [info] Checking master reachability via mysql(double
check)..
Sun Feb 16 19:21:44 2014 - [info] ok.
Sun Feb 16 19:21:44 2014 - [info] Alive Servers:
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Sun Feb 16 19:21:44 2014 - [info] Alive Slaves:
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 19:21:44 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Version=5.5.35-enterprise-commercial-Sun Feb 16 18:37:26 2014 - [info]
MHA::MasterMonitor version 0.55.
Sun Feb 16 18:37:26 2014 - [info] Dead Servers:
Sun Feb 16 18:37:26 2014 - [info] Alive Servers:
Sun Feb 16 18:37:26 2014 - [info] 192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:37:26 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Sun Feb 16 18:37:26 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Sun Feb 16 18:37:26 2014 - [info] Alive Slaves:
Sun Feb 16 18:37:26 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 18:37:26 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:37:26 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 18:37:26 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:37:26 2014 - [info] Current Alive Master:
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:37:26 2014 - [info] Checking slave configurations..
Sun Feb 16 18:37:26 2014 - [info] read_only=1 is not set on slave
192.168.1.72(192.168.1.72:3306).
Sun Feb 16 18:37:26 2014 - [warning] relay_log_purge=0 is not set on slave
192.168.1.72(192.168.1.72:3306).
Sun Feb 16 18:37:26 2014 - [info] read_only=1 is not set on slave
192.168.1.73(192.168.1.73:3306).
Sun Feb 16 18:37:26 2014 - [warning] relay_log_purge=0 is not set on slave
192.168.1.73(192.168.1.73:3306).
Sun Feb 16 18:37:26 2014 - [info] Checking replication filtering settings..
Sun Feb 16 18:37:26 2014 - [info] binlog_do_db= , binlog_ignore_db=
Sun Feb 16 18:37:26 2014 - [info] Replication filtering check ok.
Sun Feb 16 18:37:26 2014 - [info] Starting SSH connection tests..
Sun Feb 16 18:39:14 2014 - [info] MHA::MasterMonitor version 0.55.
Sun Feb 16 18:39:15 2014 - [warning]
/var/log/masterha/ksh/ksh.master_status.health already exists. You might have
killed manager with SIGKILL(-9), may run two or more monitoring process for the
same application, or use the same working directory. Check for details, and
consider setting --workdir separately.
Sun Feb 16 18:39:15 2014 - [info] Dead Servers:
Sun Feb 16 18:39:15 2014 - [info] Alive Servers:
Sun Feb 16 18:39:15 2014 - [info] 192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:39:15 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Sun Feb 16 18:39:15 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Sun Feb 16 18:39:15 2014 - [info] Alive Slaves:
Sun Feb 16 18:39:15 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 18:39:15 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:39:15 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 18:39:15 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:39:15 2014 - [info] Current Alive Master:
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 18:39:15 2014 - [info] Checking slave configurations..
Sun Feb 16 18:39:15 2014 - [info] read_only=1 is not set on slave
192.168.1.72(192.168.1.72:3306).
Sun Feb 16 18:39:15 2014 - [warning] relay_log_purge=0 is not set on slave
192.168.1.72(192.168.1.72:3306).
Sun Feb 16 18:39:15 2014 - [info] read_only=1 is not set on slave
192.168.1.73(192.168.1.73:3306).
Sun Feb 16 18:39:15 2014 - [warning] relay_log_purge=0 is not set on slave
192.168.1.73(192.168.1.73:3306).
Sun Feb 16 18:39:15 2014 - [info] Checking replication filtering settings..
Sun Feb 16 18:39:15 2014 - [info] binlog_do_db= , binlog_ignore_db=
Sun Feb 16 18:39:15 2014 - [info] Replication filtering check ok.
"/var/log/masterha/ksh/ksh.log" 862L, 70660C
Sun Feb 16 19:11:35 2014 - [info] Waiting until all relay logs are applied.
Sun Feb 16 19:11:35 2014 - [info] done.
Sun Feb 16 19:11:35 2014 - [info] Getting slave status..
Sun Feb 16 19:11:35 2014 - [info] This slave(192.168.1.73)'s
Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000003:565). No
need to recover from Exec_Master_Log_Pos.
Sun Feb 16 19:11:35 2014 - [info] Connecting to the target slave host
192.168.1.73, running recover script..
Sun Feb 16 19:11:35 2014 - [info] Executing command: apply_diff_relay_logs
--command=apply --slave_user='root' --slave_host=192.168.1.73
--slave_ip=192.168.1.73 --slave_port=3306
--apply_files=/var/log/masterha/ksh/saved_master_binlog_from_192.168.1.72_3306_2
0140216191105.binlog --workdir=/var/log/masterha/ksh
--target_version=5.5.35-enterprise-commercial-advanced-log
--timestamp=20140216191105 --handle_raw_binlog=1 --disable_log_bin=0
--manager_version=0.55 --slave_pass=xxx
Sun Feb 16 19:11:36 2014 - [info]
Applying differential binary/relay log files
/var/log/masterha/ksh/saved_master_binlog_from_192.168.1.72_3306_20140216191105.
binlog on 192.168.1.73:3306. This may take long time...
Applying log files succeeded.
Sun Feb 16 19:11:36 2014 - [info] All relay logs were successfully applied.
Sun Feb 16 19:11:36 2014 - [info] Resetting slave
192.168.1.73(192.168.1.73:3306) and starting replication from the new master
192.168.1.71(192.168.1.71:3306)..
Sun Feb 16 19:11:36 2014 - [info] Executed CHANGE MASTER.
Sun Feb 16 19:11:36 2014 - [info] Slave started.
Sun Feb 16 19:11:36 2014 - [info] End of log messages from 192.168.1.73.
Sun Feb 16 19:11:36 2014 - [info] -- Slave recovery on host
192.168.1.73(192.168.1.73:3306) succeeded.
Sun Feb 16 19:11:36 2014 - [info] All new slave servers recovered successfully.
Sun Feb 16 19:11:36 2014 - [info]
Sun Feb 16 19:11:36 2014 - [info] * Phase 5: New master cleanup phase..
Sun Feb 16 19:11:36 2014 - [info]
Sun Feb 16 19:11:36 2014 - [info] Resetting slave info on the new master..
Sun Feb 16 19:11:37 2014 - [info] 192.168.1.71: Resetting slave info succeeded.
Sun Feb 16 19:11:37 2014 - [info] Master failover to
192.168.1.71(192.168.1.71:3306) completed successfully.
Sun Feb 16 19:11:37 2014 - [info]
----- Failover Report -----
ksh: MySQL Master failover 192.168.1.72 to 192.168.1.71 succeeded
Master 192.168.1.72 is down!
Check MHA Manager logs at NEWm:/var/log/masterha/ksh/ksh.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.1.72.
The latest slave 192.168.1.71(192.168.1.71:3306) has all relay logs for
recovery.
Selected 192.168.1.71 as a new master.
192.168.1.71: OK: Applying all logs succeeded.
Failed to activate master IP address for 192.168.1.71 with return code 10:0
192.168.1.73: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.1.73: OK: Applying all logs succeeded. Slave started, replicating from
192.168.1.71.
192.168.1.71: Resetting slave info succeeded.
Master failover to 192.168.1.71(192.168.1.71:3306) completed successfully.
Sun Feb 16 19:17:41 2014 - [info] MHA::MasterMonitor version 0.55.
Sun Feb 16 19:21:43 2014 - [info] Starting master failover.
Sun Feb 16 19:21:43 2014 - [info]
Sun Feb 16 19:21:43 2014 - [info] * Phase 1: Configuration Check Phase..
Sun Feb 16 19:21:43 2014 - [info]
Sun Feb 16 19:21:43 2014 - [info] Dead Servers:
Sun Feb 16 19:21:43 2014 - [info] 192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:43 2014 - [info] Checking master reachability via mysql(double
check)..
Sun Feb 16 19:21:44 2014 - [info] ok.
Sun Feb 16 19:21:44 2014 - [info] Alive Servers:
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Sun Feb 16 19:21:44 2014 - [info] Alive Slaves:
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.72(192.168.1.72:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 19:21:44 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:44 2014 - [info] 192.168.1.73(192.168.1.73:3306)
Version=5.5.35-enterprise-commercial-advanced-log (oldest major version between
slaves) log-bin:enabled
Sun Feb 16 19:21:44 2014 - [info] Replicating from
192.168.1.71(192.168.1.71:3306)
Sun Feb 16 19:21:44 2014 - [info] ** Phase 1: Configuration Check Phase
completed.
Sun Feb 16 19:21:44 2014 - [info]
Sun Feb 16 19:21:44 2014 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun Feb 16 19:21:44 2014 - [info]
Sun Feb 16 19:21:44 2014 - [info] Forcing shutdown so that applications never
connect to the current master..
Sun Feb 16 19:21:44 2014 - [info] Executing master IP deactivatation script:
Sun Feb 16 19:21:44 2014 - [info] /usr/local/bin/master_ip_failover
--orig_master_host=192.168.1.71 --orig_master_ip=192.168.1.71
--orig_master_port=3306 --command=stop
Got Error: Use of uninitialized value $ssh_user in concatenation (.) or string
at /usr/local/bin/master_ip_failover line 136.
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1
10.10.10.11/24===
Disabling the VIP on old master: 192.168.1.71
Sun Feb 16 19:21:44 2014 -
[error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln340] Failed to
deactivate master IP with return code 1:0
Sun Feb 16 19:21:44 2014 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm,
ln178] Got ERROR: at /usr/local/bin/masterha_manager line 65
Sun Feb 16 19:21:44 2014 - [info]
----- Failover Report -----
ksh: MySQL Master failover 192.168.1.71
Master 192.168.1.71 is down!
Check MHA Manager logs at NEWm:/var/log/masterha/ksh/ksh.log for details.
Started automated(non-interactive) failover.
Failed to deactivate master IP with return code 1:0
Got Error so couldn't continue failover from here.
------------------------------------------------------
Plz help me..
Original comment by au...@naver.com
on 17 Feb 2014 at 9:02
Original issue reported on code.google.com by
au...@naver.com
on 17 Feb 2014 at 8:46