Open GoogleCodeExporter opened 9 years ago
Original comment by linas.vi...@continuent.com
on 20 Mar 2013 at 1:39
This popped up in a customer's deployment. The way -base-seqno behaves on the
slave currently is counter-intuitive and dangerous:
1. Put slave offline.
2. Insert a few transactions on the master.
3. Try to put the slave back online at the very last event, skipping all the
others that were inserted: trepctl online -base-seqno <last_committed_seqno>.
4. Slave will go online successfully, but it will have applied *all* events,
effectively making -base-seqno useless.
Original comment by linas.vi...@continuent.com
on 13 Dec 2013 at 8:39
The option behavior is definitely confusing. Here's the current state of
things in the replicator.
-base-seqno: This option is designed to allow a master to regenerate the log
from a seqno value other than 0. It has no other use and should probably be
regarded as an error in any other case. It would also be helpful to rename it,
as it is different from choosing a restart position.
-from-event: This option allows a master to read from the DBMS log (whether
binlog or Oracle CDC) from a particular position. When you enter this option
on a slave, it will cause the replicator to search forward in the log until it
can find the event ID. It's potentially a very slow operation. Looking at the
code, moreover, I don't see a simple way to force the replicator to start at a
particular seqno when searching forward.
There is no way to force a slave to start applying at a particular seqno using
online options. Moreover, slaves actually have two restart positions. There's
the position of the log and the position of the slave.
1. Log position. The THL is designed to avoid gaps in the log, as these create
ambiguity about whether we are skipping events or the log is corrupt. We don't
want to skip events here, or, if we do, we need to create a filtered event to
fill in the gap. Computing the gap is a little tricky since the extractor that
pulls from the master does not know the current state of the log. Instead it
would have to say something like "I'm resetting the log position" and let the
downstream applier that writes to the log figure out what to do, which includes
handling corner cases where the new seqno position leads to regenerating
different records with earlier seqno values. The log should catch attempts to
add earlier seqnos, but it would still create a nasty error.
2. Slave position. When you ask a slave to start at a particular position,
what you really want to do is set the position in the trep_commit_seqno table
so that the slave reports a higher seqno than it would otherwise. We should
probably add an option to set the slave position explicitly. This would update
the trep_commit_seqno table just as Jeff's script does now. One nice feature
would be to allow this to happen while the slave is offline, which would
potentially be less confusing than adding more options to the online
command.
For this reason, I think the existing tungsten_set_position script is a good
interim solution.
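To make the slave-position idea concrete, here is a rough toy model in Python. It is not replicator code: ToySlave, set_position, and go_online are hypothetical names invented for illustration, and committed_seqno only plays the role of the value stored in trep_commit_seqno. The sketch assumes a slave applies every logged event whose seqno is above its committed position, so moving that position while offline is what causes events to be skipped.

```python
from dataclasses import dataclass, field


@dataclass
class ToySlave:
    """Toy stand-in for a slave replicator (hypothetical, not Tungsten code)."""

    committed_seqno: int = -1  # plays the role of the trep_commit_seqno value
    online: bool = False
    applied: list = field(default_factory=list)

    def set_position(self, seqno: int) -> None:
        # Hypothetical explicit "set slave position" operation.
        # In this sketch it is only allowed while the slave is offline.
        if self.online:
            raise RuntimeError("set_position is only allowed while offline")
        self.committed_seqno = seqno

    def go_online(self, log_seqnos) -> None:
        # Apply every logged event above the committed position, in order.
        self.online = True
        for seqno in sorted(log_seqnos):
            if seqno > self.committed_seqno:
                self.applied.append(seqno)
                self.committed_seqno = seqno


log_seqnos = [0, 1, 2, 3, 4, 5]

# Without repositioning, everything after the committed seqno is applied.
plain = ToySlave(committed_seqno=2)
plain.go_online(log_seqnos)

# Repositioning to 4 while offline leaves only the last event to apply.
moved = ToySlave(committed_seqno=2)
moved.set_position(4)
moved.go_online(log_seqnos)

print(plain.applied)  # [3, 4, 5]
print(moved.applied)  # [5]
```

In this model the restart point is purely a property of the stored commit position, which is why an offline "set position" step is enough and no extra online option is needed.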
It's possible I'm missing something from the code but this is how things work
now. The fact is that the replicator is incomplete here so it would be a good
idea to fill things in a bit especially as regards the slave position. Making
things better should not be too hard.
Original comment by robert.h...@continuent.com
on 14 Dec 2013 at 6:44
The tungsten_set_position script was added in 2.2.0 for just this reason. It
currently only supports MySQL but can inspect the THL event on a remote server
and set the trep_commit_seqno table. It will also accept all necessary values
at the command line for use when setting the initial extraction position.
https://docs.continuent.com/continuent-tungsten-2.0/deployment-replicatorin.html
Original comment by jeff.m...@continuent.com
on 29 Jan 2014 at 3:17
Positioning is causing a lot of problems. We should fix this after data
sources are fully implemented in Tungsten Replicator 3.0. This work should
make it easier to address positioning issues.
Original comment by robert.h...@continuent.com
on 5 May 2014 at 11:16
Will not use third version digit for normal releases anymore. It will only be
incremented for maintenance ones.
Original comment by linas.vi...@continuent.com
on 26 May 2014 at 5:01
Original comment by linas.vi...@continuent.com
on 2 Jun 2014 at 5:53
Original comment by linas.vi...@continuent.com
on 19 Jan 2015 at 2:18
Original issue reported on code.google.com by
linas.vi...@continuent.com
on 3 Aug 2012 at 6:33