hugowan / maatkit

Automatically exported from code.google.com/p/maatkit
0 stars 0 forks source link

mk-table-checksum should push down table filters as early as possible #99

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Suppose table X has an error, like

DBD::mysql::db selectrow_hashref failed: Unable to open underlying table
which is differently defined or of non-MyISAM type or doesn't exist at
mk-table-checksum line 1550.

And now suppose you find out which table this is, and you say --ignoretbl
foo.bar, then if you run with MKDEBUG=1 you'll see

# MySQLDump:1549 8176 SHOW CREATE TABLE `foo`.`bar`

This needs to be trimmed out before we get to this point.

Original issue reported on code.google.com by baron.schwartz on 16 Oct 2008 at 2:48

GoogleCodeExporter commented 9 years ago
The attached patch prevents mk-table-checksum from doing all the useless "SHOW 
CREATE
TABLE"s.

myhost:~# MKDEBUG=1 ./mk-table-checksum h=localhost --database mydb --table 
mytable >
mk-table-checksum.log
myhost:~# grep 'SHOW CREATE' mk-table-checksum.log | wc -l
1570

myhost:~# #MKDEBUG=1 ./mk-table-checksum_patched h=localhost --database mydb 
--table
mytable > mk-table-checksum.log_patched
myhost:~# grep 'SHOW CREATE' mk-table-checksum.log_patched | wc -l
2

Original comment by m...@pascalhofmann.de on 22 Oct 2008 at 1:16

Attachments:

GoogleCodeExporter commented 9 years ago
(The database which was used had 1569 tables.)

Original comment by m...@pascalhofmann.de on 22 Oct 2008 at 10:16

GoogleCodeExporter commented 9 years ago
This and issue 23 are related. I'm working on these issues.

Thank you for the patch.

Original comment by dan...@percona.com on 31 Oct 2008 at 12:36

GoogleCodeExporter commented 9 years ago
Yeah, this issue is going to be a bit complex. Normally, MySQLFind would not 
cause
useless SHOW CREATE TABLEs. At first it just calls MySQLDumper::get_table_list()
which is essentially just SHOW TABLES. Then it checks if we need to know each 
tables'
storage engine. If yes, then we SHOW CREATE TABLE on every table; if no, then 
we don't.

mk-table-checksum, for example, ignores engines FEDERATED,MRG_MyISAM by default;
therefore, MySQLFind will cause a SHOW CREATE TABLE on every table before it 
later
filters those tables.

In general the work done by find_tables() is: get all tables, get all tables' 
engine
info if needed, filter tables by name, filter tables by engine.

So, let's see... first, we need to re-think _fetch_tbl_list(). This seems more
appropriate to MySQLDump. Also, it does more than its name implies: in addition 
to
returning a list of tables, it may also cause our problematic SHOW CREATE TABLE 
on
all those tables, in order to get their engines. Personally, I don't like when 
sub do
unexpected magick inside which comes back to cause us problems. So, I think 
this sub
needs to be moved to MySQLDump and made to do a singular function: return an 
array of
db.tbl names, pure and simple. We'll still need a _fetch_tbl_list of sorts in
MySQLFind however, since it sometimes gets tables and sometimes get tables' 
statuses.

Second, in MySQLFind::find_tables(), we'll filter first by table name, as we are
already doing. Then, we'll filter by engine and, in so doing, we'll get engines 
only
for the tables that survived the name filter.

Unless anyone has major objections, I'll move the code in this direction, 
probably
Saturday. :-)

Original comment by dan...@percona.com on 31 Oct 2008 at 1:47

GoogleCodeExporter commented 9 years ago
I'm still working on this. I've got a test in place which currently fails. I'm 
also
looking at another bug-like effect inside the code where the
$finder->{tables}->{status} array becomes auto-created (Perl magick) and 
therefore
causes the module to start needlessly getting table statuses. 

Original comment by dan...@percona.com on 4 Nov 2008 at 5:02

GoogleCodeExporter commented 9 years ago
r2475 has the updated MySQLFind. r2476 is mk-table-sync with the updated 
MySQLFind
and it is passing its tests. I updated mk-table-checksum but it's failing many 
of its
tests, but the reason for that is probably not related to the new MySQL find; 
see my
comment 2 on issue 121.

Original comment by dan...@percona.com on 5 Nov 2008 at 4:39

GoogleCodeExporter commented 9 years ago
r2479 has the fully updated and functional mk-table-checksum with the more 
efficient
MySQLFind. This issue should be fixed.

Original comment by dan...@percona.com on 8 Nov 2008 at 4:00

GoogleCodeExporter commented 9 years ago
Before I close this issue I will speak briefly of the changes made to MySQLFind.

First, need_engine is figured out once in new(). This shouldn't change during 
the
life of an instantiation. Furthermore, we need also to know it in find_tables() 
in
order to avoid the unnecessary SHOW CREATE TABLEs which are now only executed 
for
engine-related criteria.

Views are now filtered out sooner in _fetch_table_list() rather than after the 
main
work for find_tables().

_fetch_table_list() had 2 subtle bugs:
1) The sub was supposed to save the current db at entry then restore it when 
done,
but it never did this because it returned early in its if-else tree before the 
call
to restore the db at the very end of the sub.
2) The more important bug began in find_tables() with:
foreach my $crit ( @{$self->{tables}->{status}} ) {
Innocuous yes? No sir. By itself, that only caused Perl to automagically create 
an
empty anonymous array as the key to status (also automagically creating status 
if it
did not exist in the first place). For the first call to find_tables(), the bug 
did
not manifest, but only in the 2nd and subsequent calls because back in
_fetch_table_list() we had:
my $need_status = $self->{tables}->{status};
if ( $need_status || ($need_engine && !$self->{useddl}) ) {
Because Perl automagically created that empty anon array, $need_status 
subsequently
began to incorrectly evaluate as true, thereby causing us to incorrectly pass 
this if
test each and every time, for every table thereafter. Usually this did not 
cause a
problem, but it was a bit of black magick that could have come back to haunt us
later. It has been fixed.

Finally, the core issue was fixed: tables are found and filtered first. Then, 
SHOW
CREATE TABLE is gotten for the remaining tables only if there are engine-based
filters, and then only if the table's engine had not been gotten earlier (which
sometimes happens). The end result: SHOW CREATE TABLE is ran only if absolutely
necessary.

Original comment by dan...@percona.com on 11 Nov 2008 at 3:33

GoogleCodeExporter commented 9 years ago
See the r2475 diff for all the changes.

Original comment by dan...@percona.com on 11 Nov 2008 at 3:35