Sorry, I'm not sure what happened with my copy/paste there. Here's what I
meant to post....
What steps will reproduce the problem?
1. Set up single master, single slave, single MHA manager
2. Set up the configuration file...
user=root
password=<root mysql password>
ssh_user=mha
manager_workdir=/var/log/masterha/app1
remote_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1.log
master_ip_failover_script=/home/mha/scripts/mhascripts/master_ip_failover
master_ip_online_change_script=/home/mha/scripts/mhascripts/master_ip_online_change
[server1]
hostname=replmaster
master_binlog_dir=/log/mysql
[server2]
hostname=replrepl
master_binlog_dir=/log/mysql
3. establish replication and confirm with
masterha_check_ssh --conf=/etc/app1.cnf
and
masterha_check_repl --conf=/etc/app1.cnf
4. do a manual failover
masterha_master_switch --conf=/etc/app1.cnf --master_state=alive
--new_master_host=replrepl --orig_master_is_new_slave
Get the following error....
Tue May 6 15:39:22 2014 - [info] Executed CHANGE MASTER.
Tue May 6 15:39:32 2014 - [error][/usr/lib/perl5/vendor_perl/MHA/Server.pm,
ln784] Slave could not be started on replmaster(10.10.30.46:3306)! Check slave
status.
Tue May 6 15:39:32 2014 - [error][/usr/lib/perl5/vendor_perl/MHA/Server.pm,
ln862] Starting slave IO/SQL thread on replmaster(10.10.30.46:3306) failed!
Tue May 6 15:39:32 2014 -
[error][/usr/lib/perl5/vendor_perl/MHA/MasterRotate.pm, ln573] Failed!
Tue May 6 15:39:32 2014 -
[error][/usr/lib/perl5/vendor_perl/MHA/MasterRotate.pm, ln602] Switching master
to replrepl(10.10.30.63:3306) done, but switching slaves partially failed.
5. Check show slave status on the server that is now supposed to be the
replicant....
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Connecting to master
Master_Host: replrepl
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 767
Relay_Log_File: replmaster-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 767
Relay_Log_Space: 107
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1045
Last_IO_Error: error connecting to master 'repl@replrepl:3306' - retry-time: 60 retries: 86400
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)
Everything looks correct, but the IO thread is not connecting to the master
(Last_IO_Errno 1045 is MySQL's access-denied error).
6. Reset the password using the change master syntax on the server that is
supposed to be the new slave
mysql> STOP SLAVE;
Query OK, 0 rows affected (0.00 sec)
mysql> CHANGE MASTER TO
-> MASTER_PASSWORD='<repl-password>';
Query OK, 0 rows affected (0.00 sec)
7. Start the slave again and everything is now working!
mysql> START SLAVE;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: replrepl
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 767
Relay_Log_File: replmaster-relay-bin.000002
Relay_Log_Pos: 253
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 767
Relay_Log_Space: 414
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)
What is the expected output? What do you see instead?
I expect the old master server to be converted to the new slave. Instead, the
slave IO thread on the old master fails to connect to the new master.
What version of the product are you using? On what operating system?
mha4mysql-node-0.56-0.el5
mha4mysql-manager-0.56-0.el5
CentOS release 5.10 (Final)
Please provide any additional information below.
After the problem occurs, I can fail back to replmaster with no issues
Here's a workaround. For this situation I've failed back to the original
master so we can try this again from scratch. masterha_check_repl
--conf=/etc/app1.cnf checks out with no problems.
On replmaster, if I run show slave status, there's nothing, as
expected, since it's the master.
mysql> SHOW SLAVE STATUS;
Empty set (0.00 sec)
Let's run CHANGE MASTER TO as if it were going to replicate from
replrepl, but DON'T START IT
mysql> CHANGE MASTER TO
-> MASTER_HOST='replrepl',
-> MASTER_USER='repl',
-> MASTER_PASSWORD='repl password',
-> MASTER_PORT=3306,
-> MASTER_LOG_FILE='mysql-bin.000002',
-> MASTER_LOG_POS=107;
Query OK, 0 rows affected (0.00 sec)
Let's check the slave status again....
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: replrepl
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 107
Relay_Log_File: replmaster-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 107
Relay_Log_Space: 107
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)
Looks good. Let's try failing over...
> masterha_master_switch --conf=/etc/app1.cnf --master_state=alive
--new_master_host=replrepl --orig_master_is_new_slave
...
...
...
Tue May 6 15:52:42 2014 - [info] Switching master to
replrepl(10.10.30.63:3306) completed successfully.
Awesome, let's go to replrepl and check its slave status (should be blank)
mysql> SHOW SLAVE STATUS;
Empty set (0.00 sec)
It is! However, I now know that if I manually fail back over to replmaster,
replrepl will have the same error listed above.
WORKAROUND:
Run CHANGE MASTER TO on the master server so it has all the replication
settings of the slave server, but don't start it. Apparently this gives it the
replication password info it needs to establish replication in a manual failover.
Original comment by peter.t....@gmail.com
on 6 May 2014 at 7:59
Hello, I encountered the same kind of issue and solved it by adding
repl_user=repl
repl_password=password
to my config file.
Original comment by les...@pythian.com
on 8 Feb 2015 at 8:59
Yes, set repl_user and repl_password in the MHA config file. The original
master doesn't have replication information, so it doesn't know the
replication username/password. --
https://code.google.com/p/mysql-master-ha/wiki/Parameters#repl_password
Original comment by Yoshinor...@gmail.com
on 9 Feb 2015 at 3:25
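For anyone landing here later, the fix suggested in the two comments above could look roughly like the fragment below. This is only a sketch based on the configuration in the original report: the `[server default]` section header is an assumption (the pasted config does not show which block the global settings were in), `repl` is taken from the Master_User value in the SHOW SLAVE STATUS output, and the password is a placeholder.

```ini
[server default]
user=root
password=<root mysql password>
ssh_user=mha
# Replication account for MHA to use when it points a server at the new
# master. Without these, the original master has no stored replication
# credentials to reuse (see the comment above).
# "repl" is assumed from Master_User in the report; the password is a placeholder.
repl_user=repl
repl_password=<repl-password>
```

With these two parameters set, the pre-seeding workaround (running CHANGE MASTER TO on the master without starting the slave) should no longer be needed, going by the comments above.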
Original issue reported on code.google.com by
peter.t....@gmail.com
on 6 May 2014 at 7:56