bigcy / tungsten-replicator

Automatically exported from code.google.com/p/tungsten-replicator

Create a tungsten-replicator/scripts/tungsten_set_position.sh script to update the trep_commit_seqno table #684

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
1. To which tool/application/daemon will this feature apply?

Tungsten Replicator

2. Describe the feature in general

The script will accept a service name and THL metadata. Using that, it will 
update the trep_commit_seqno table directly. If the schema does not exist, it 
will be created.

3. Describe the feature interface

tungsten-replicator/scripts/set_trep_commit_seqno --service=xxxxx --seqno=#### 
(--host=xxxxx|--epoch=#### --sourceid=xxxxx [--eventid=####:####]) --dry-run

if '--host' is given, the script will read the metadata from the specified host

if '--dry-run' is given, the script will output the SQL statement instead of 
running it.
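
A minimal sketch of the argument rules described above, written in Python purely for illustration (the actual script is planned on the Ruby script interface). The function names `build_parser` and `parse_position_args` are hypothetical, and the rule that `--host` excludes the manual metadata options is an assumption read off the `(--host|--epoch --sourceid [--eventid])` grouping:

```python
import argparse


def build_parser():
    """Parser mirroring the proposed interface; not the real Ruby script."""
    parser = argparse.ArgumentParser(prog="set_trep_commit_seqno")
    parser.add_argument("--service", required=True)
    parser.add_argument("--seqno", type=int, required=True)
    parser.add_argument("--host")
    parser.add_argument("--epoch", type=int)
    parser.add_argument("--sourceid")
    parser.add_argument("--eventid")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def parse_position_args(argv):
    """Enforce: either --host, or --epoch plus --sourceid (assumed rule)."""
    parser = build_parser()
    args = parser.parse_args(argv)
    manual = [args.epoch, args.sourceid, args.eventid]
    if args.host is not None:
        if any(value is not None for value in manual):
            parser.error("--host cannot be combined with --epoch, --sourceid or --eventid")
    elif args.epoch is None or args.sourceid is None:
        parser.error("either --host or both --epoch and --sourceid are required")
    return args
```

With this sketch, `parse_position_args(["--service=alpha", "--seqno=10", "--host=db1"])` is accepted, while omitting both `--host` and `--epoch`/`--sourceid` exits with a usage error.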

4. Give an idea (if applicable) of a possible implementation

This will use the Ruby script interface.

5. Describe pros and cons of this feature.

It will be easier to write documentation for procedures that require updating 
the trep_commit_seqno table.

5a. Why the world will be a better place with this feature.

5b. What hardship will the human race have to endure if this feature is
implemented.

6. Notes

Original issue reported on code.google.com by jeff.m...@continuent.com on 27 Aug 2013 at 9:24

GoogleCodeExporter commented 9 years ago
Hi Jeff,

you might want to consider renaming --dry-run to --sql or --sql-only. It would 
be more in line with what we have in bin/thl.

Original comment by linas.vi...@continuent.com on 28 Aug 2013 at 9:43

GoogleCodeExporter commented 9 years ago
The initial script has been added but '--offline' & '--online' arguments should 
be added.

Original comment by jeff.m...@continuent.com on 2 Sep 2013 at 9:55

GoogleCodeExporter commented 9 years ago
There won't be a 2.1.3.

Original comment by linas.vi...@continuent.com on 17 Sep 2013 at 10:13

GoogleCodeExporter commented 9 years ago
tungsten_set_position.sh --seqno --host [--clear-logs] [--service]
tungsten_set_position.sh --seqno --epoch --event-id --source-id [--clear-logs] 
[--service]

Original comment by jeff.m...@continuent.com on 17 Sep 2013 at 4:42

GoogleCodeExporter commented 9 years ago

Original comment by jeff.m...@continuent.com on 18 Sep 2013 at 1:44

GoogleCodeExporter commented 9 years ago
I found two problems:

(1)
There is a --dry-run option that is accepted, but it does not work as 
expected.
It complains that the replicator should be offline, and when the --offline 
option is added, the tool runs the command instead of showing it.
If --dry-run is used, it should do the same thing as the --sql option.

(2)
If you set a seqno that is higher than the highest seqno on the master and use 
the --sql option, the tool does not complain. It issues an error if you apply 
the change, but the simulation does not tell the user that there is an error.

Questions for the developer, and a good item for documenting:
* Is this tool supposed to be used on a master? 
* What is the likely scenario to use this tool on various roles?

Original comment by g.maxia on 18 Dec 2013 at 6:25

GoogleCodeExporter commented 9 years ago
I've added a --dry-run alias for --sql. We will address the issue of accepting 
invalid arguments in TR 2.2.1.
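
For illustration only, this is how such an alias can be expressed in Python's argparse; the real tool is a Ruby script and its option handling is not shown in this thread. The helper name `wants_sql_only` is hypothetical:

```python
import argparse

parser = argparse.ArgumentParser(prog="tungsten_set_position")
# Both spellings write to the same destination, so the rest of the
# script only ever has to check a single flag.
parser.add_argument("--sql", "--dry-run", dest="sql", action="store_true")


def wants_sql_only(argv):
    """Return True when either --sql or --dry-run was passed."""
    return parser.parse_args(argv).sql
```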

I am unable to reproduce the second issue.

tungsten@cdb1:~  $ thl info
log directory = /opt/continuent/thl/cdb/
log files = 1
logs size = 0.01 MB
min seq# = 0
max seq# = 9
events = 9
oldest file = thl.data.0000000001 (0.01 MB, 2013-12-19 00:20:40)
newest file = thl.data.0000000001 (0.01 MB, 2013-12-19 00:20:40)

tungsten@cdb2:~  $ tungsten_set_position --source=cdb1 --seqno=30 --offline
NOTE  >> Put cdb replication service offline
ERROR >> Unable to read the THL record for seqno 30 from cdb1
tungsten@cdb2:~  $ trepctl online
tungsten@cdb2:~  $ tungsten_set_position --source=cdb1 --seqno=30 --sql
ERROR >> Unable to read the THL record for seqno 30 from cdb1
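
The failure above is consistent with seqno 30 lying outside the `min seq#`/`max seq#` range reported by `thl info` on cdb1. A rough sketch of that range check in Python follows; it parses text shaped like the `thl info` output shown here. The function names are hypothetical, and this is not how tungsten_set_position itself validates the seqno (it reads the THL record from the source host):

```python
import re


def thl_seqno_range(thl_info_text):
    """Return (min_seqno, max_seqno) parsed from `thl info` style output."""
    found = dict(
        re.findall(r"^(min|max) seq# = (\d+)[ \t]*$", thl_info_text, flags=re.MULTILINE)
    )
    return int(found["min"]), int(found["max"])


def seqno_available(thl_info_text, seqno):
    """True when seqno falls inside the range the THL currently holds."""
    low, high = thl_seqno_range(thl_info_text)
    return low <= seqno <= high
```

Fed the output from cdb1 above, `seqno_available(text, 30)` is false, which matches the error in both the `--offline` and `--sql` runs.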

Regarding documentation:
- Yes, this may be used on a master to set the initial extraction position
- This script is meant to replace any use of `trepctl online -base-seqno`, 
`trepctl online -from-eventid`, logging into mysql and modifying the 
trep_commit_seqno table.

The three new scripts will be used to recover failed masters. In this scenario 
db1 failed and db2 was promoted to master. These scripts will work in 
clustering and non-clustering topologies but their relationship is shown here.

If the `datasource recover` command does not work, the user may run the 
`datasource restore` command. If they do not have a recent backup, the 
`tungsten_provision_slave` script may be used. If the dataset is so large that 
a backup/restore is not possible, they must repair the replication position.

The `trepctl status` output will indicate the epoch number of the new master. 
The user may run `tungsten_read_master_events` on db1 to see what events were 
written to it after <epoch-1>. If satisfied, the replication position may be 
reverted by running `tungsten_set_position --source=db2 --seqno=<epoch-1>`.
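
The arithmetic in that last step can be sketched as follows, assuming the epoch number has already been read from `trepctl status` by hand. The function name `revert_position_command` is hypothetical; only the `--source` and `--seqno` options come from this thread:

```python
def revert_position_command(source_host, new_master_epoch):
    """Build the tungsten_set_position call that reverts to <epoch-1>."""
    seqno = new_master_epoch - 1
    if seqno < 0:
        raise ValueError("epoch number must be at least 1 to revert to epoch-1")
    return ["tungsten_set_position", f"--source={source_host}", f"--seqno={seqno}"]
```

For example, with db2 as the new master and an epoch number of 42, this yields `tungsten_set_position --source=db2 --seqno=41`.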

This process will be unified in a future troubleshooting script but this is the 
basis for future automation.

Original comment by jeff.m...@continuent.com on 19 Dec 2013 at 12:31

GoogleCodeExporter commented 9 years ago
The issue #2 reported previously is not reproducible anymore. If the case 
presents itself, I will file a separate issue.

Original comment by g.maxia on 19 Dec 2013 at 6:22

GoogleCodeExporter commented 9 years ago
An entry has been added to the release notes for 2.2.0:

A new command-line tool, tungsten_set_position, has been created. This enables 
the position of either a master or slave to be set with respect to reading 
local or remote events. This provides easier control during the recovery of a 
slave or master in the event of a failure.

Reference documentation is provided here: 
http://docs.continuent.com/tungsten-replicator-2.2/cmdline-tools-tungsten_set_position.html

Original comment by mc.br...@continuent.com on 19 Dec 2013 at 2:21