jiangty-addepar / common-schema

Automatically exported from code.google.com/p/common-schema
0 stars 0 forks source link

Here are some stored procedures and a config table for pinning tables/indexes in the buffer pool (useful as an event) #44

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
use common_schema;

CREATE TABLE IF NOT EXISTS `pin_buffer_pool_config` (
  `schema_name` varchar(50) DEFAULT NULL,
  `table_name` varchar(50) DEFAULT NULL,
  `index_name` varchar(255) DEFAULT 'PRIMARY' comment 'null for all indexes',
  `where_clause` varchar(255) DEFAULT NULL comment 'do not include the WHERE keyword'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

delimiter ;;

DROP PROCEDURE IF EXISTS pin_buffer_pool;

-- load all the columns for the given index
-- into the buffer pool
CREATE PROCEDURE pin_buffer_pool()
BEGIN
DECLARE v_done boolean default false;
DECLARE v_stmt TEXT default null;
DECLARE v_got_lock tinyint default 0;
DECLARE v_cursor CURSOR FOR
SELECT CONCAT('SELECT COUNT(CONCAT(', 
              GROUP_CONCAT(column_name ORDER BY seq_in_index), ')) INTO @discard FROM `', 
              s.table_schema, '`.`', s.table_name,
              '` FORCE INDEX(`', s.index_name, '`)', 
              IF(where_clause IS NOT NULL, CONCAT(' WHERE ', where_clause), '')
         ) AS stmt 
   FROM information_schema.statistics s 
   JOIN test.pin_buffer_pool_config pbpc
     ON pbpc.table_name = s.table_name 
    AND pbpc.schema_name = s.table_schema 
    -- when the index_name is null, it means warm all indexes for the table
    AND s.index_name = ifnull(pbpc.index_name, s.index_name) 
  GROUP BY s.index_name;

  DECLARE CONTINUE HANDLER FOR
  SQLSTATE '02000'
    SET v_done = TRUE;

  -- DON'T HOLD A LONG TRANSACTION, START A NEW SNAPSHOT FOR EACH READ
  SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

  -- abort if the last pin is still running
  SELECT GET_LOCK('BUFFER_POOL_PIN', 0) INTO v_got_lock;
  IF v_got_lock = 1 THEN

    OPEN v_cursor;
    cursorLoop: LOOP
      FETCH v_cursor INTO v_stmt;

      IF v_done THEN
        CLOSE v_cursor;
        LEAVE cursorLoop;
      END IF;

      set @v_stmt = v_stmt;
      prepare v_bp_pin from @v_stmt;
      execute v_bp_pin;
      deallocate prepare v_bp_pin;

    END LOOP;

    SELECT RELEASE_LOCK('BUFFER_POOL_PIN') INTO @discard;

  END IF;

END;;

DROP PROCEDURE IF EXISTS pin_buffer_pool_loop;;

CREATE PROCEDURE pin_buffer_pool_loop()
BEGIN
  -- This procedure can be scheduled to start
  -- every second with no harm done.  It will
  -- simply exit if more than one copy tries
  -- to run.  This means that an event can
  -- be used to ensure the warming function is
  -- always on and that it is looping faster
  -- than innodb_old_blocks_time (default 1000 in 5.6)
  SELECT GET_LOCK('BUFFER_POOL_PIN_LOOP_LOCK', 0) INTO @got_lock;
  IF @got_lock = 1 THEN
    LOOP
      CALL pin_buffer_pool();
      select sleep(.25) into @discard;
    END LOOP;
    SELECT RELEASE_LOCK('BUFFER_POOL_PIN_LOOP_LOCK') INTO @discard;
  END IF;
END;;

delimiter ;

Original issue reported on code.google.com by greenlion@gmail.com on 7 May 2013 at 4:23

GoogleCodeExporter commented 9 years ago
replace test\.  with common_schema\.  I think there is only one place (in the 
JOIN) that this is broken. 

Original comment by greenlion@gmail.com on 7 May 2013 at 4:24

GoogleCodeExporter commented 9 years ago
Thank you.
One problematic issue is that the `pin_buffer_pool_config` table is stateful, 
whereas common_schema is stateless; that is, when you install a newer version 
of common_schema, all data is being rebuilt, and you shouldn't care.
I suggest providing name of pin-table to the routine.

Original comment by shlomi.n...@gmail.com on 7 May 2013 at 5:41

GoogleCodeExporter commented 9 years ago

Original comment by shlomi.n...@gmail.com on 7 May 2013 at 5:42

GoogleCodeExporter commented 9 years ago
Further note is to iterate the statistics table on a per table basis, or else 
this is a major impact on table descriptors cache. The join from the 
pin_buffer_pool_config does not make for information_schema optimizations here.

Original comment by shlomi.n...@gmail.com on 7 May 2013 at 7:47

GoogleCodeExporter commented 9 years ago
Yet additional note is that eval() is a perfect candidate for working out the 
queries as generated by cursor.

Original comment by shlomi.n...@gmail.com on 7 May 2013 at 7:49

GoogleCodeExporter commented 9 years ago
I wasn't sure how to use the common schema iterator functions with dynamic sql, 
ie:
foreach($foo, $bar : select * from $my_table) {

}

Original comment by greenlion@gmail.com on 7 May 2013 at 8:52

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Oh, duh, I can just create script and user the table name directly in the 
script.  I'm a moron sometimes :)

Original comment by greenlion@gmail.com on 7 May 2013 at 5:50

GoogleCodeExporter commented 9 years ago
And yet sometimes you are very bright :)

Original comment by shlomi.n...@gmail.com on 7 May 2013 at 6:05

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
use common_schema;

delimiter ;;

DROP PROCEDURE IF EXISTS pin_buffer_pool;

CREATE PROCEDURE pin_buffer_pool(IN v_config_schema tinytext charset utf8, IN 
v_config_table tinytext charset utf8)
BEGIN

SET @_query_script_input_col1 := CONCAT('`' , replace(v_config_schema, 
'`','``'), '`.`', replace(v_config_table,'`', '``') , '`');

set @v_exec_sql = "";

set @script := CONCAT('
input $config_table;
SET @v_exec_sql := "";
foreach($db, $tbl, $idx, $whereclause: select schema_name, table_name, 
index_name, where_clause from :$config_table) {
        foreach($idx_name, $idx_columns: SELECT index_name, group_concat(column_name order by seq_in_index) from information_schema.statistics where table_schema = $db and table_name = $tbl and index_name = ifnull($idx, index_name) group by index_name) {

                if($idx IS NULL OR $idx = $idx_name) {
                        SET @v_exec_sql := CONCAT(@v_exec_sql, "SELECT COUNT(CONCAT(", $idx_columns, ")) INTO @discard from ", $db, ".", $tbl, " FORCE INDEX(`", $idx_name, "`) " );
                        IF($whereclause IS NOT NULL) {
                                SET @v_exec_sql := CONCAT(@v_exec_sql, " WHERE ", $whereclause, ";");
                        } else {   
                                SET @v_exec_sql := CONCAT(@v_exec_sql, ";");
                        }
                }
        }

}
');
call common_schema.run(@script);
call common_schema.exec(@v_exec_sql);

END;;

DROP PROCEDURE IF EXISTS pin_buffer_pool_loop;;

CREATE PROCEDURE pin_buffer_pool_loop(IN v_config_schema tinytext charset utf8, 
IN v_config_table tinytext charset utf8)
BEGIN
  -- This procedure can be scheduled to start
  -- every second with no harm done.  It will
  -- simply exit if more than one copy tries
  -- to run.  This means that an event can
  -- be used to ensure the warming function is
  -- always on and that it is looping faster
  -- than innodb_old_blocks_time (default 1000 in 5.6)
  SELECT GET_LOCK('BUFFER_POOL_PIN_LOOP_LOCK', 0) INTO @got_lock;
  IF @got_lock = 1 THEN
    LOOP
      CALL pin_buffer_pool(v_config_schema, v_config_table);
      select sleep(.25) into @discard;
    END LOOP;
    SELECT RELEASE_LOCK('BUFFER_POOL_PIN_LOOP_LOCK') INTO @discard;
  END IF;
END;;

delimiter ;

Original comment by greenlion@gmail.com on 7 May 2013 at 7:51

GoogleCodeExporter commented 9 years ago
I did have some challenges when making the script.  The error messages are a 
little unforgiving and it was really hard to track down some of the minor 
syntactical mistakes that I was making.  For example, I forgot a := and had 
"var $something = " at the top of my script.  This caused an error further down 
about a variable name being too long which I'll now recognize as forgetting a 
:= :)

Original comment by greenlion@gmail.com on 7 May 2013 at 8:05

GoogleCodeExporter commented 9 years ago
And I forgot a { (or had an extra one) and the message about negative depth was 
confusing.

Original comment by greenlion@gmail.com on 7 May 2013 at 8:06

GoogleCodeExporter commented 9 years ago
Using QueryScript here is cool. I think a mere call to eval() routine would do 
just fine as well.

Original comment by shlomi.n...@gmail.com on 9 May 2013 at 6:39

GoogleCodeExporter commented 9 years ago
I'm not sure how to do a single call to eval without joining to the config 
table which you said is bad for the I_S.  I must be missing something.

I pretty much just wanted to learn QS, at least to start learning it as I'm 
sure I just scratched the surface (I'm going to blog about it on MPB though 
because it is awesome).  The second stored proc is way faster than the 
queryscript.  I assume the eval() will be too?

Original comment by greenlion@gmail.com on 9 May 2013 at 7:30

GoogleCodeExporter commented 9 years ago
You are correct; my note on using eval() only works if you join onto the 
STATISTICS table.

We DON'T want that, but here's how it would look like:

CREATE PROCEDURE pin_buffer_pool()
BEGIN
DECLARE v_got_lock tinyint default 0;
SET @query:= "SELECT CONCAT('SELECT COUNT(CONCAT(', 
              GROUP_CONCAT(column_name ORDER BY seq_in_index), ')) INTO @discard FROM `', 
              s.table_schema, '`.`', s.table_name,
              '` FORCE INDEX(`', s.index_name, '`)', 
              IF(where_clause IS NOT NULL, CONCAT(' WHERE ', where_clause), '')
         ) AS stmt 
   FROM information_schema.statistics s 
   JOIN test.pin_buffer_pool_config pbpc
     ON pbpc.table_name = s.table_name 
    AND pbpc.schema_name = s.table_schema 
    AND s.index_name = ifnull(pbpc.index_name, s.index_name) 
  GROUP BY s.index_name";

  -- DON'T HOLD A LONG TRANSACTION, START A NEW SNAPSHOT FOR EACH READ
  SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

  -- abort if the last pin is still running
  SELECT GET_LOCK('BUFFER_POOL_PIN', 0) INTO v_got_lock;
  IF v_got_lock = 1 THEN
    call eval(@query);
    SELECT RELEASE_LOCK('BUFFER_POOL_PIN') INTO @discard;

  END IF;

END;;

Original comment by shlomi.n...@gmail.com on 9 May 2013 at 9:16

GoogleCodeExporter commented 9 years ago
Your use of QueryScript is just fine. The reason it is slower is, of course, 
because of the need to parse your text, "compile it", create a bunch of 
temporary tables, interpret (each foreach statement in itself uses a temporary 
table; no big deal for this kind of an operation).

A couple shortcuts:

- instead of 
CONCAT('`' , replace(v_config_schema, '`','``'), '`.`', 
replace(v_config_table,'`', '``') , '`');

you can:
CONCAT(mysql_qualify(v_config_schema), '.', mysql_qualify(v_config_table));

See
http://common-schema.googlecode.com/svn/trunk/common_schema/doc/html/mysql_quali
fy.html

- "select sleep(.25) into @discard;" can rewrite to "DO sleep(0.25);". Nitpick, 
I know.

So you're running the script every .25 seconds; I assume this is because you're 
expecting the content on the pin table to change; otherwise this is a waste of 
script interpretation. I'm just curious: if you're offering this code that 
means you happen to use it. Do you actually need to pin down tables every 0.25 
second in order for them to stay pinned?

I'm happy someone other than myself is finding use to QueryScript! I'm like 
doing this every day and am very happy with it.

Original comment by shlomi.n...@gmail.com on 9 May 2013 at 9:23

GoogleCodeExporter commented 9 years ago
Well, while we're at it:

- exec() accepts TEXT, so no longer than 65536; shouldn't be much of a problem 
unless with tables with hundreds of columns; in such case may want to avoid 
CONCATentating all queries, and exec them one by one.

- The use of $input is basically for more automated calls onto QueryScript. In 
your case, you could have simply:

SET @fully_qualified_table_name := CONCAT(mysql_qualify(v_config_schema), '.', 
mysql_qualify(v_config_table));

and within the QueryScript code, replace:
input $config_table;
With
var $config_table := @fully_qualified_table_name;

More readable IMHO.

OK, enough lecturing! Thanks

Original comment by shlomi.n...@gmail.com on 9 May 2013 at 9:56

GoogleCodeExporter commented 9 years ago
If you take a look at my notes on the star schema benchmark in 5.6 you'll 
notice I ran into issues with innodb_old_blocks_time=1000 doing low concurrency.

Basically bringing in the rows for the index on the fact table, plus scanning 
the fact table by itself pushes out the pages for the dimension tables and they 
keep getting pulled into the buffer pool again and again and again.   

This is to take some CPU hit and make sure that the dimension tables always 
stay in the buffer pool by reading them more frequently than 
innodb_old_blocks_time which defaults to 1000.

Original comment by justin.s...@percona.com on 12 May 2013 at 9:28

GoogleCodeExporter commented 9 years ago
Some of the particulars, including using the input variable were to familiarize 
myself with queryscript.  Thanks for the pointers.  I'll fix it up and make a 
blog post about it comparing the performance using it versus not using it.

Original comment by justin.s...@percona.com on 12 May 2013 at 10:02

GoogleCodeExporter commented 9 years ago
Performance of queries, not performance of QueryScript itself.  I'm not too 
worried about that.  

Original comment by justin.s...@percona.com on 12 May 2013 at 10:04