Closed: bluven closed this issue 5 years ago
Hey, sorry to keep you waiting. I'm overloaded right now, and was hoping maybe someone from the community could chime in.
It's OK. I'm trying to read the source code to figure it out; I can see it's a really hard job.
It looks to me like Orchestrator is unable to connect to the cluster master.
To me, the log just seems to show orchestrator not being able to read the master. Try looking through Audit -> Recovery and make sure previous recoveries are acknowledged. Unacknowledged recoveries can block an automated recovery if it falls within the configured time window.
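The same acknowledgement can probably be done from the command line instead of the web UI. This is a minimal sketch, assuming `orchestrator-client` is installed and pointed at your orchestrator API, and that its `ack-all-recoveries` command accepts `--reason` as I recall it. The wrapper name `ack_recoveries` and the `ORCHESTRATOR_CLIENT` override variable are hypothetical, added only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: acknowledge every outstanding recovery so that old,
# unacknowledged recoveries do not block the next automated failover.
# ORCHESTRATOR_CLIENT is an illustrative override (e.g. for dry runs);
# by default the real orchestrator-client binary is used.
ack_recoveries() {
  local reason="${1:-acknowledged before failover test}"
  "${ORCHESTRATOR_CLIENT:-orchestrator-client}" -c ack-all-recoveries --reason "$reason"
}
```

Calling `ack_recoveries "testing failover"` before re-triggering the failover should clear the block described above; check Audit -> Recovery afterwards to confirm.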
Can you rebuild the environment such that the cluster is healthy then trigger a failover again? If you can, try this:
Then copy and upload the log file generated during these steps. That will help show what happened and possibly why. Right now the log file just indicates orchestrator is unable to connect. Possibly because mysql isn't running.
Ah, looking closer, I notice the following issues:
- In `CHANGE MASTER TO`, you need to be using `MASTER_AUTO_POSITION = 1` and not file and pos arguments.
- `orchestrator-client -c register-candidate -i $(hostname) --promotion-rule prefer`. I'm unsure if this is required atm, but it's not a bad idea to set up a cron to run that on master-candidate replicas.
@cclose Sorry for replying so late, and thank you for your reply. I found that the recovery was being skipped because the master was downtimed. After I removed the downtime record manually, the recovery was executed. But there is another problem, which I'll open a separate issue to describe. I think this issue is OK to close.
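For anyone landing here later, the two configuration suggestions above (GTID auto-positioning and a cron job for `register-candidate`) could be sketched roughly as follows. This assumes GTID is already enabled on all nodes. The helper names, host, port, replication user, and the five-minute schedule are placeholders, not values from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; host names, user names and the schedule are placeholders.

# Print a GTID-based CHANGE MASTER TO statement.
# It deliberately omits the binlog file and position arguments.
gtid_change_master_sql() {
  local master_host="$1" master_port="$2" repl_user="$3"
  cat <<SQL
CHANGE MASTER TO
  MASTER_HOST = '${master_host}',
  MASTER_PORT = ${master_port},
  MASTER_USER = '${repl_user}',
  MASTER_AUTO_POSITION = 1;
SQL
}

# Print a crontab entry that re-registers this host as a preferred
# promotion candidate every five minutes (schedule is an assumption).
candidate_cron_line() {
  printf '%s\n' '*/5 * * * * orchestrator-client -c register-candidate -i $(hostname) --promotion-rule prefer'
}
```

The output of `gtid_change_master_sql` can be piped into the `mysql` client on each replica (supply the replication password however you normally do), and the line from `candidate_cron_line` can be added to the crontab on the master-candidate replicas.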
@bluven I am facing a similar issue in orchestrator. I shut down my primary node, and orchestrator is not able to switch to one of my slaves.
@khattarjitender05 best to open a new issue, I don't think your case is the same. Regardless: can you run orchestrator with `--debug` and provide the logs around the failure detection time?
Sorry for this simple question, but I couldn't figure it out. I have read similar issues.
I have a 3-node MySQL cluster and I just want failover to work. You know: after the master goes down, one of the slaves is promoted to master and the other slave replicates from the new master.
Here is my environment: https://github.com/bluven/mysql-replica
I put all the information that you asked for in other similar issues there. /tmp/recovery.log didn't have any related information, so I skipped it.
BTW, I actually wanted to open an issue about another kind of failover problem, but it stopped happening after I changed the orchestrator configuration. Should I open a new issue or just put it in this one?