epermana / tungsten-replicator

Automatically exported from code.google.com/p/tungsten-replicator

auto-recovery stops after only one try #1119

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. configure the auto-recovery parameters, e.g.:

 "repl_auto_recovery_delay_interval": "20s",
 "repl_auto_recovery_max_attempts": "3",
 "repl_auto_recovery_reset_interval": "80s"

2. stop the database (mysql)

3. wait longer than twice repl_auto_recovery_delay_interval (more than 40s with the values from step 1)

4. start the database

What is the expected output?

The replicator goes offline when mysql is stopped,
then tries twice to reconnect, and on the third try it goes back online
because mysql is up again.
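
The expected timeline can be sketched as a small simulation. This only models the behaviour the report expects, not what the replicator actually does. The function name `simulate_auto_recovery` is hypothetical, and the sketch assumes the failure happens at t=0, that one attempt is made after each delay interval, and that attempts take no time. `repl_auto_recovery_reset_interval` is not modelled.

```python
def simulate_auto_recovery(db_back_at, delay=20, max_attempts=3):
    """Model the EXPECTED behaviour: retry every `delay` seconds, up to
    `max_attempts` times, and stop as soon as the database is reachable.

    db_back_at: seconds after the failure at which mysql is up again.
    Returns a list of (time_in_seconds, attempt_number, outcome) tuples.
    """
    events = []
    for attempt in range(1, max_attempts + 1):
        t = attempt * delay              # assumption: one attempt per delay interval
        if t >= db_back_at:
            events.append((t, attempt, "online"))
            break                        # recovery succeeded, no further attempts
        events.append((t, attempt, "failed"))
    return events


# mysql restarted 50s after the failure, i.e. after two delay intervals of 20s
for event in simulate_auto_recovery(db_back_at=50):
    print(event)
```

With the settings from step 1 and mysql restarted after 50s, the first two attempts fail and the third one brings the replicator back online, which is the outcome described above.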

What do you see instead?

The replicator tries to reconnect only once. Then
only a manual trepctl online will put the replicator back online.

This is the behaviour described in issue 784
https://code.google.com/p/tungsten-replicator/issues/detail?id=784
and, in a less obvious way, in the manual
http://docs.continuent.com/tungsten-replicator-4.0/operations-autorecovery.html

But it does not make much sense:
If a manual "trepctl online" works, why should an automatic reconnection
not be attempted as well?
The argument of issue 784 applies here as well:
"A lot of users write watchdog scripts to handle these, which seems like a waste of time."
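
For illustration, a watchdog of the kind mentioned in that quote could look roughly like the sketch below. It is a workaround, not a fix. It assumes `trepctl` is on the PATH and that `trepctl status` prints a line of the form `state : <value>`. The helper names (`parse_state`, `run_trepctl`, `watchdog_step`) are hypothetical.

```python
import subprocess


def parse_state(status_output):
    """Return the value of the 'state' line from status output, or None."""
    for line in status_output.splitlines():
        key, sep, value = line.partition(":")   # split on the first colon only
        if sep and key.strip() == "state":
            return value.strip()
    return None


def run_trepctl(*args):
    """Run trepctl with the given arguments and return its stdout."""
    result = subprocess.run(["trepctl", *args], capture_output=True, text=True)
    return result.stdout


def watchdog_step(run=run_trepctl):
    """Check the replicator once and request 'online' if it is in an error state.

    Only OFFLINE:ERROR triggers a retry, so a replicator that was taken
    offline on purpose is left alone. Returns the observed state.
    """
    state = parse_state(run("status"))
    if state is not None and state.startswith("OFFLINE:ERROR"):
        run("online")
    return state
```

Calling `watchdog_step()` periodically (from cron, for example) approximates the repeated attempts that auto-recovery is expected to make on its own.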

What version of the product are you using?

./tools/tpm query version
4.0.0-18

mysql -V
mysql  Ver 14.14 Distrib 5.5.44, for debian-linux-gnu (x86_64) using readline 6.3

On what operating system?

head -1 /etc/os-release
PRETTY_NAME="Debian GNU/Linux 8 (jessie)"

Please provide any additional information below.

Notes:

See a full description, explanation and remarks in Tungsten Replicator Discuss:
https://groups.google.com/d/msg/tungsten-replicator-discuss/7mHu0sXarto/XeHS5wN3AgAJ

Original issue reported on code.google.com by Dominiqu...@unige.ch on 14 Aug 2015 at 11:19