Open lichunzhu opened 3 years ago
A user case:
The master somehow panicked and was restarted by systemd. After the restart, due to https://github.com/pingcap/ticdc/issues/3828 , it was retired during Optimist.Start, so two nodes were writing to etcd: itself and the new leader.
/severity minor
Bug Report
What did you do? If possible, provide a recipe for reproducing the error. When one dm-master resigns as etcd leader, the signal sent to this dm-master may arrive after another dm-master has already been notified that it is the new etcd leader. As a result, two leaders can operate on etcd keys at the same time, which may cause problems in DM.
What did you expect to see? There should always be only one leader operating on the etcd keys.
Possible solution: For every etcd write in DM, we can add an If condition to the write transaction that checks whether this dm-master's key still equals the etcd leader key, so that the write is applied only while this dm-master is the leader.
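
To illustrate the idea, here is a minimal sketch of a leader-guarded write. It does not use the etcd client: `kvStore`, `guardedPut`, and the key names are hypothetical stand-ins, and the mutex only mimics the atomicity of a single transaction. With the real etcd Go client, the same shape would presumably be a `Txn().If(...).Then(...).Commit()` call whose comparison targets the leader key.

```go
package main

import (
	"fmt"
	"sync"
)

// kvStore is a hypothetical in-memory stand-in for etcd, used only to
// illustrate the guarded-write idea. It is not part of DM or etcd.
type kvStore struct {
	mu   sync.Mutex
	data map[string]string
}

func newKVStore() *kvStore {
	return &kvStore{data: make(map[string]string)}
}

// put writes a key unconditionally (for example, a leadership change).
func (s *kvStore) put(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// get returns the value stored under key and whether it exists.
func (s *kvStore) get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.data[key]
	return v, ok
}

// guardedPut applies the write only if leaderKey still holds expectedLeader.
// The check and the write happen under one lock, mimicking one transaction.
// It reports whether the write was applied.
func (s *kvStore) guardedPut(leaderKey, expectedLeader, key, value string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.data[leaderKey]
	if !ok || cur != expectedLeader {
		return false // leadership was lost; reject the stale write
	}
	s.data[key] = value
	return true
}

func main() {
	const leaderKey = "/hypothetical/leader" // assumed key name
	store := newKVStore()

	store.put(leaderKey, "master-1")
	// master-1 is the leader, so its write is applied.
	fmt.Println(store.guardedPut(leaderKey, "master-1", "/task/a", "from master-1"))

	// Leadership moves to master-2 before master-1 notices.
	store.put(leaderKey, "master-2")
	// The late write from master-1 is rejected.
	fmt.Println(store.guardedPut(leaderKey, "master-1", "/task/a", "stale write"))

	v, _ := store.get("/task/a")
	fmt.Println(v) // still "from master-1"
}
```

The point of putting the check inside the transaction, rather than in a separate read before the write, is that leadership can change between the read and the write; only an atomic compare-and-write closes that window.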