github / gh-ost

GitHub's Online Schema-migration Tool for MySQL
MIT License
12.44k stars 1.26k forks source link

Identify when the inspected server is restarted, and terminate #785

Open shlomi-noach opened 5 years ago

shlomi-noach commented 5 years ago

Context: 2nd scenario depicted in https://github.com/github/gh-ost/issues/784

In the 2nd scenario, what happened was that the inspected replica was rebooted. gh-ost kept hanging onto the end of the binary log where the replica was rebooted, and never rotated onto the next binlog. It also apparently retried enough times to see the replica healthy again.

gh-ost should identify when a server is restarted. At this time I'm not expecting it to recover and resume work -- instead I prefer that it terminates with error.

shlomi-noach commented 5 years ago

Addressed by #787 ; cc @github/database-infrastructure