openark / orchestrator

MySQL replication topology management and HA
Apache License 2.0

Auto-failover splitting my slaves into multiple master cluster. #1390

Open AshDevilRed opened 3 years ago

AshDevilRed commented 3 years ago

Hello, I have an issue with auto-failover. My replication setup has 1 master and 2 slaves. Failover runs, but when my master goes down, both slaves become masters of separate clusters. I don't know why; I just want one slave to become the master in place of the failed master.

Versions: Percona Server (MySQL version 8.0.22-13), Orchestrator (version 3.2.3)

Thanks for your time!

yangeagle commented 3 years ago

There is a problem with replication in debian-dbserv2.

2021-07-21 14:20:47 DEBUG - sorted replica: debian-dbserv1:3306 mysql-bin.000002:156
2021-07-21 14:20:47 DEBUG - sorted replica: debian-dbserv2:3306 :0
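
To spot this kind of entry in a longer log, here is a minimal sketch that scans for "sorted replica" lines whose binlog file name is empty. It assumes the log format is exactly what the two lines above show; the function name and regular expression are made up for illustration and are not part of Orchestrator. It only flags the entries and does not say why the coordinates are empty.

```python
import re

# Pattern inferred only from the two DEBUG lines quoted above (an assumption,
# not an official Orchestrator log specification).
SORTED_RE = re.compile(
    r"sorted replica: (?P<host>\S+:\d+) (?P<file>[^:\s]*):(?P<pos>\d+)\s*$"
)

SAMPLE_LOG = """\
2021-07-21 14:20:47 DEBUG - sorted replica: debian-dbserv1:3306 mysql-bin.000002:156
2021-07-21 14:20:47 DEBUG - sorted replica: debian-dbserv2:3306 :0
"""


def replicas_without_coordinates(log_text):
    """Return host:port of every sorted replica logged with an empty binlog file."""
    missing = []
    for line in log_text.splitlines():
        match = SORTED_RE.search(line)
        if match and not match.group("file"):
            missing.append(match.group("host"))
    return missing


if __name__ == "__main__":
    for host in replicas_without_coordinates(SAMPLE_LOG):
        print("no binlog coordinates reported for", host)
```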

debian-dbserv2 cannot change its master to debian-dbserv1, so it is reported as lost.

2021-07-21 14:20:47 INFO topology_recovery: RecoverDeadMaster: - lost replica: debian-dbserv2:3306

AshDevilRed commented 3 years ago

Yes, I saw that in the Orchestrator log. But replication seems to be working great: when I write anything on the master, I can see it on the slaves. If I change the master of slave "dbserv2" to "dbserv1" by hand with MySQL commands, it works fine.

STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='172.16.1.153', MASTER_PORT=3306, MASTER_USER='replic_user', MASTER_PASSWORD='test', MASTER_AUTO_POSITION=1;
START SLAVE;

After that:

orchestrator-client -c topology -i dbservers
debian-dbserv0:3306     [0s,ok,8.0.25-15,rw,ROW,>>,GTID]
+ debian-dbserv1:3306   [0s,ok,8.0.25-15,ro,ROW,>>,GTID]
  + debian-dbserv2:3306 [0s,ok,8.0.25-15,ro,ROW,>>,GTID]
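
For anyone who wants to check this output automatically, here is a rough sketch that parses the topology text above into rows with a depth and a list of flags, then checks that there is a single top-level server and a single writable one. The parsing rules are guessed from these three lines only (two spaces of indent per level, a "+ " marker before each replica); the function and variable names are hypothetical.

```python
import re

# Layout inferred from the three lines printed above (an assumption):
# optional indent, optional "+ " marker, host:port, then "[flag,flag,...]".
LINE_RE = re.compile(
    r"^(?P<indent>\s*)(?P<plus>\+ )?(?P<host>\S+:\d+)\s+\[(?P<flags>[^\]]*)\]\s*$"
)

SAMPLE_TOPOLOGY = """\
debian-dbserv0:3306     [0s,ok,8.0.25-15,rw,ROW,>>,GTID]
+ debian-dbserv1:3306   [0s,ok,8.0.25-15,ro,ROW,>>,GTID]
  + debian-dbserv2:3306 [0s,ok,8.0.25-15,ro,ROW,>>,GTID]
"""


def parse_topology(text):
    """Turn topology text into a list of {host, depth, flags} dictionaries."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that does not look like a topology line
        if match.group("plus") is None:
            depth = 0  # no "+ " marker: top-level server
        else:
            depth = len(match.group("indent")) // 2 + 1
        rows.append({
            "host": match.group("host"),
            "depth": depth,
            "flags": match.group("flags").split(","),
        })
    return rows


if __name__ == "__main__":
    rows = parse_topology(SAMPLE_TOPOLOGY)
    roots = [r["host"] for r in rows if r["depth"] == 0]
    writable = [r["host"] for r in rows if "rw" in r["flags"]]
    print("top-level servers:", roots)
    print("writable servers:", writable)
```

If a failover splits the cluster, running the same check against each resulting cluster's output would show more than one top-level server in total.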

I think that if there were an error in the replication, this test would not have worked.

So if you have any idea what is causing the error, I would like to know.