hugowan / maatkit

Automatically exported from code.google.com/p/maatkit
0 stars 0 forks source link

Make mk-table-sync force statement-based binlog format on 5.1 and greater #95

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
If this is a problem report, what steps will reproduce the problem?
1. If using the --replicate or --synctomaster options, statements to change
the slave are done via replication.
2. If the diff algorithm determines that there are records on slave to be
deleted it generates the proper delete statement and executes it on the master.
3. The statement is not replicated because no changes occur on the master.

What is the expected output? What do you see instead?

NA

What information do you get from mk-<toolname>--version?

mk-table-sync  Ver 1.0.9 Distrib 2325 Changeset 2311

What is your MySQL, Perl, DBI, and DBD::mysql version?

5.1.28-rc-log

What is your operating system/distribution?

linux

If you think it'll be useful, please run the tool with MKDEBUG=1 and
capture the full output in a file, then attach the file.  For example,
MKDEBUG=1 mk-<toolname> [options] > debug.txt.  Is there anything in this
file that you think we should pay special attention to?  Please gzip the
file if it's large.

Please provide any additional information below.

I've been able to work around the issue by patching the code.  The patch is
not really the appropriate fix, but does allow it to work for my specific
situation.  The appropriate fix appears to be a bit more complex due to the
closure that is being used to make the changes in
TableSyncer::sync_table().  I've attached the patch.

Original issue reported on code.google.com by vwch...@gmail.com on 7 Oct 2008 at 2:02

Attachments:

GoogleCodeExporter commented 9 years ago
Is the issue related to row-based replication being used?  Can this be solved 
just by
setting the logging type to statement?  I believe that should work.

Original comment by baron.schwartz on 8 Oct 2008 at 1:24

GoogleCodeExporter commented 9 years ago
I believe that I had the server explicitly set to use statement based 
replication at
the time, but I cannot be positive about that.  I will test again and see.  
That being said, is there any reason not to explicitly run the delete statements
against the slave server rather than have them replicate?

Original comment by vwch...@gmail.com on 8 Oct 2008 at 1:35

GoogleCodeExporter commented 9 years ago
The biggest reason to run all the statements on the master is to avoid race
conditions and messing up the slave's data because of out-of-sync writes.  Even 
if
everything is perfectly synced with locks, which adds a lot of cost and delay, 
the
other statements happening on the master at the same time may interleave and 
cause
problems once they replicate to the slave.  (Assuming syncing is happening on a 
live
server that can't be taken down, which is the usual case at Percona when 
someone asks
us to fix their critical production server.) There are also a bunch of nasty 
cases
where the slave could get slightly delayed and the constant locking and waiting 
could
make the whole process unbearably slow.

Original comment by baron.schwartz on 8 Oct 2008 at 1:52

GoogleCodeExporter commented 9 years ago
Very well, that sounds reasonable.  Perhaps the solution would be for the 
application
to explicitly set the session variable binlog_format = 'STATEMENT' when applying
changes?  I know that they have just modified that as of 5.1.28 to only allow a 
user
with the SUPER priv to be able to set the binlog_format session variable, but 
that
should be ok as the dba is most likely the one to be using this tool.

Original comment by vwch...@gmail.com on 8 Oct 2008 at 2:07

GoogleCodeExporter commented 9 years ago
That's good to know about SUPER.  That's a very odd change, is there a bug 
report on
bugs.mysql.com or something to justify this?

mk-table-checksum already sets the binlog_format to accomplish this.  I guess 
I've
not used mk-table-sync on 5.1 enough to have noticed this.

Original comment by baron.schwartz on 8 Oct 2008 at 3:00

GoogleCodeExporter commented 9 years ago
I misquoted the version that the SUPER privilege is needed.  It appears in 
5.1.29
(not yet released, but I have been doing testing with the bzr code).

http://bugs.mysql.com/bug.php?id=39106
http://dev.mysql.com/doc/refman/5.1/en/news-5-1-29.html

Original comment by vwch...@gmail.com on 8 Oct 2008 at 3:20

GoogleCodeExporter commented 9 years ago
Revisiting this issue: so was the problem/solution as Baron said in comment #1: 
set
statement-based replication instead of row-based?  Or is there still an issue 
here?

Original comment by dan...@percona.com on 27 Aug 2009 at 1:56

GoogleCodeExporter commented 9 years ago
I think that setting statement replication did fix the issue.

Original comment by vwch...@gmail.com on 27 Aug 2009 at 11:24

GoogleCodeExporter commented 9 years ago
Ok.  In that case I'll close this issue for now.  If it's found later that there
really is a bug, please let us know.  Thanks.

Original comment by dan...@percona.com on 28 Aug 2009 at 4:20

GoogleCodeExporter commented 9 years ago
Still, mk-table-sync should set the binlog_format to statement for the duration 
of
the run to ensure that changes take effect, I think?

Original comment by vwch...@gmail.com on 28 Aug 2009 at 6:39

GoogleCodeExporter commented 9 years ago
I agree, mk-table-sync relies on statement-based logging.

Summary was: mk-table-sync: DELETE statements do not work for replication

Original comment by baron.schwartz on 28 Aug 2009 at 9:53

GoogleCodeExporter commented 9 years ago

Original comment by dan...@percona.com on 2 Sep 2009 at 2:26

GoogleCodeExporter commented 9 years ago
Issue 817 has been merged into this issue.

Original comment by baron.schwartz on 24 Jan 2010 at 4:10

GoogleCodeExporter commented 9 years ago

Original comment by baron.schwartz on 31 Jan 2010 at 1:20

GoogleCodeExporter commented 9 years ago

Original comment by baron.schwartz on 1 Feb 2010 at 1:36

GoogleCodeExporter commented 9 years ago

Original comment by baron.schwartz on 1 Feb 2010 at 3:05

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r5824.

Original comment by dan...@percona.com on 19 Feb 2010 at 8:37

GoogleCodeExporter commented 9 years ago
I'm not sure this is or should be considered fixed at this point, as it now has 
required SUPER privileges which is not always desirable.  Yes this tool CAN be 
used to sync replication slaves, but it could also be used to sync data from 
one table to another on a timed interval (ie: production database versus 
statistical database) and can replicate individual tables (which MySQL 
replication can not).

What this has done is prevent an ordinary user (on a shared hosting or managed 
hosting environment) from using this tool to sync one table with another.

As disabling it has little effect if replication is not used, this should be a 
toggled option (with a friendly help message) to not do this is the user is 
running as an ordinary user.  The inclusion of an option to run without 
administrative privs and an if statement wrapping around this is quite 
necessary.

Why exclude the market to only replication?

Original comment by ph...@rogers.com on 23 Mar 2011 at 12:34

GoogleCodeExporter commented 9 years ago
I think you are right, we forgot about the diversity of use cases for this 
tool.  We need to make it change the binlog format and ask for SUPER privilege 
only when replication-based syncing is used.

Original comment by baron.schwartz on 23 Mar 2011 at 7:17

GoogleCodeExporter commented 9 years ago
barron.., on that note, a very related bug (that likely copies the same code)

Access denied; you need (at least one of) the SUPER,REPLICATION CLIENT 
privilege(s) for this operation [for Statement "SHOW SLAVE STATUS"] at line 
6070 while doing ...

The command is simply 
# mk-table-sync --dry-run --execute h=localhost,D=db,u=user,p=pw,t=table 
h=host2,D=db2,u=user,p=pw

In the following:
---
sub get_slave_status {
   my ( $self, $dbh ) = @_;
   if ( !$self->{not_a_slave}->{$dbh} ) {
      my $sth = $self->{sths}->{$dbh}->{SLAVE_STATUS}
            ||= $dbh->prepare('SHOW SLAVE STATUS');
      MKDEBUG && _d($dbh, 'SHOW SLAVE STATUS');
      $sth->execute();
      my ($ss) = @{$sth->fetchall_arrayref({})};

      if ( $ss && %$ss ) {
         $ss = { map { lc($_) => $ss->{$_} } keys %$ss }; # lowercase the keys
         return $ss;
      }

      MKDEBUG && _d('This server returns nothing for SHOW SLAVE STATUS');
      $self->{not_a_slave}->{$dbh}++;
   }
}
---

It should be treated the same way.  If you're doing a simple table sync, it 
should just sync the table.

By commenting out the line for the SHOW SLAVE STATUS and the one for the binlog 
format as previously mentioned, it now runs perfectly on syncing a table 
between two standalone databases...

Original comment by ph...@rogers.com on 4 Apr 2011 at 2:53

GoogleCodeExporter commented 9 years ago
Nothing further was done on this?  I was going to use mk-table-sync to do some 
simple non-replication syncing to a database in a CPanel shared-hosting 
environment, where replication and SUPER privileges are not possible.

I'm not a Perl programmer, but I commented out the lines between the two 
MKDEBUGs in get_slave_status, and the SET @@binlog_format execution line in 
get_cxn, and a simple sync seemed to work.  But it would be nice if this were 
baked in... :)

Original comment by NorthernRavenNWT on 19 Aug 2011 at 10:11

GoogleCodeExporter commented 9 years ago
Work around for getting a table sync issue when super privileges are not 
available and not syncing between a master and a slave.  This code change adds 
an option called "--no-bin-log-format" to tell pt-table-sync that is is okay to 
avoid using "SET @@binlog_format=STATEMENT" .

I added the following code to the pt-table-sync (yes I know that this is a 
maatkit forum) - the link in the code references this forum, so maybe this will 
help someone.

This addresses only one of many possible places where the super privilege is 
needed

In pt-table-sync there is original code looking like this:

sub get_cxn {
...
   # Ensure statement-based replication.                                                                                                 
   # http://code.google.com/p/maatkit/issues/detail?id=95                                                                                
   $sql = '/*!50105 SET @@binlog_format="STATEMENT"*/';
   PTDEBUG && _d($dbh, $sql);
   $dbh->do($sql);

The changed code is:

   # Ensure statement-based replication.                                                                                                 
   # http://code.google.com/p/maatkit/issues/detail?id=95                                                                                
   if ( $o->get('bin-log-format') ) {
       $sql = '/*!50105 SET @@binlog_format="STATEMENT"*/';
       PTDEBUG && _d($dbh, $sql);
       $dbh->do($sql);
   }

which requires that the bin-log-format option be added to the POD section,  So 
I added this:

original:

=item --[no]bin-log                                                             

default: yes                                                                    

Log to the binary log (C<SET SQL_LOG_BIN=1>).                                   

Specifying C<--no-bin-log> will C<SET SQL_LOG_BIN=0>.                           

=item --buffer-in-mysql                                                         

Changed code:

=item --[no]bin-log                                                             

default: yes                                                                    

Log to the binary log (C<SET SQL_LOG_BIN=1>).                                   

Specifying C<--no-bin-log> will C<SET SQL_LOG_BIN=0>.                           

=item --[no]bin-log-format                                                      

default: yes                                                                    

Force binlog_format to STATEMENT (C<SET @@binlog_format=STATEMENT>).            

Specifying C<--no-bin-log> will prevent binlog_format modification.             

=item --buffer-in-mysql                                                         

I'll be sending this change to the percona folks when I figure out their build 
environment and create a test case.  Depending on their workload, merit of the 
change, and quality of the change, they may or may not include it.

Original comment by mark.leh...@gmail.com on 25 Apr 2012 at 5:01