behavior with secondary_check_script

ijin commented 12 years ago

According to the wiki, "If A was unsuccessful, masterha_secondary_check exits with return code 2 and MHA Manager guesses that network problem has happened and it does not start failover."

I thought masterha would give up failing over in this case.

When I tested the secondary_check_script with a deliberately unreachable address, using masterha_secondary_check -s 10.119.45.30 -s 10.120.45.30 (10.119.45.30 is unreachable), masterha does not give up. It seems to retry the check over and over.

This repeats:

ssh: connect to host 10.119.45.30 port 22: No route to host
Monitoring server 10.119.45.30 is NOT reachable!
Sat Jul 14 06:04:52 2012 - [warning] At least one of monitoring servers is not reachable from this script. This is likely network problem. Failover should not happen.

Is this behavior expected?

yoshinorim commented 12 years ago

Yes, this is expected. The main purpose of this behavior is to avoid split brain.
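
To illustrate why the warning keeps repeating, here is a minimal Python sketch of the decision as described in this thread. It is not MHA's actual code (MHA Manager is written in Perl). The names Decision, decide and monitor are hypothetical. Only return code 2 is confirmed by the wiki quote above; treating 0 as "master is dead" and every other code as "do not fail over" is an assumption made for this sketch.

```python
from enum import Enum


class Decision(Enum):
    START_FAILOVER = "start failover"
    KEEP_MONITORING = "no failover, keep monitoring"


def decide(exit_code: int) -> Decision:
    """Map a secondary check exit code to a manager decision (hypothetical)."""
    # Assumption: 0 means every route agreed that the master is dead.
    if exit_code == 0:
        return Decision.START_FAILOVER
    # Per the wiki quote, 2 means a monitoring server was unreachable,
    # so a network problem is suspected and failover must not start.
    # Assumption: any other non-zero code is treated the same way.
    return Decision.KEEP_MONITORING


def monitor(exit_codes):
    """Walk through successive check results; stop only when failover starts."""
    attempts = 0
    for code in exit_codes:
        attempts += 1
        if decide(code) is Decision.START_FAILOVER:
            return attempts, Decision.START_FAILOVER
    # No result ever justified failover, so the manager is still watching.
    return attempts, Decision.KEEP_MONITORING


if __name__ == "__main__":
    # Three checks in a row hit the unreachable monitoring server.
    print(monitor([2, 2, 2]))
```

In this sketch a run of return code 2 results never ends in failover, which matches the repeated warning in the log: as long as the monitoring server stays unreachable, the manager cannot rule out a network partition, so it keeps checking instead of risking a split brain.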

ijin commented 12 years ago

great, thanks!

ijin commented 12 years ago

on a side note, perhaps this should be clarified more in the wiki.

yoshinorim / mha4mysql-manager

behavior with secondary_check_script #23