Tuning Advisor Adjustments

gordan-bobic commented 3 months ago

innodb_redo_log_capacity/innodb_log_file_size The graph seems to be wrong. It should be the same as MySQL / MySQL InnoDB Metrics / InnoDB Checkpoint Age
innodb_read_io_threads, innodb_write_io_threads There could always be the odd write pending, so increasing threads may be premature. Increase the threshold for recommending an increase to (current io_threads / 2), i.e. if we have 4 threads, don't recommend an increase unless we see at least 2 pending I/O s on either reads or writes respectively. Otherwise we will always end up recommending increasing it up to the maximum (64). Additionally - we should never recommend more than half of the number of CPU threads we detect, e.g. on a 16 CPU thread server we should never recommend more than 8 threads each for innodb_read_io_threads or innodb_write_io_threads.

innodb_buffer_pool_size Something is wrong with the graph. In my test case the graph seems to be showing 1.5MB, but querying the information_schema directly, I get a much larger number:

MariaDB [information_schema]> select sum(index_length)/1024/1024 from tables where engine = 'InnoDB';
+-----------------------------+
| sum(index_length)/1024/1024 |
+-----------------------------+
|                 69.09375000 |
+-----------------------------+

oblitorum commented 3 months ago

innodb_buffer_pool_size Something is wrong with the graph. In my test case the graph seems to be showing 1.5MB, but querying the information_schema directly, I get a much larger number:

Can you try excluding schema mysql, performance_schema, information_schema , mysqld_exporter ingores those schemas by default (it's inherited from upstream), unless you include them explicitly with collect.info_schema.tables.databases = mysql,performance_schema,information_schema,...

gordan-bobic commented 3 months ago

That doesn't seem to be it:

MariaDB [information_schema]> select sum(index_length)/1024/1024 from tables where engine = 'InnoDB' and table_schema not in ('mysql', 'performance_schema', 'information_schema');
+-----------------------------+
| sum(index_length)/1024/1024 |
+-----------------------------+
|                 69.04687500 |
+-----------------------------+

oblitorum commented 2 months ago

Something is wrong with the graph. In my test case the graph seems to be showing 1.5MB, but querying the information_schema directly, I get a much larger number: MariaDB [information_schema]> select sum(index_length)/1024/1024 from tables where engine = 'InnoDB';
MariaDB [information_schema]> select sum(index_length)/1024/1024 from tables where engine = 'InnoDB';
+-----------------------------+
| sum(index_length)/1024/1024 |
+-----------------------------+
|                 69.09375000 |
+-----------------------------+
Another possible cause I can think of is that there are same TABLE_SCHEMA+TABLE_NAME rows with different catalogs. e.g.

TABLE_CATALOG	TABLE_SCHEMA	TABLE_NAME	...
def	test	sbtest2	...
catalog_2	test	sbtest2	...

And mysqld_exporter populates these rows to prometheus without regard to the catalog column, so prometheus treats them as duplicate rows and keeps only one of them, hence there is a size difference.

Could it be the case on that monitored server?

gordan-bobic commented 2 months ago

I don't think so, unless sysbench creates catalogs by default. The only data / database is the sysbench generated test set.

oblitorum commented 2 months ago

Then the only possible thing I can think of is that there was something wrong that caused mysqld_exporter to populate only part of them. e.g. I named a database 'test (with a single quote) on purpose and used sysbench to generate data, this caused some errors from mysqld_exporter:

ts=2024-08-08T07:37:46.123Z caller=tls_config.go:310 level=info msg="TLS is enabled." http2=true address=[::]:42002
ts=2024-08-08T07:42:09.740Z caller=exporter.go:172 level=error msg="Error from scraper" scraper=info_schema.tables target=/run/mysqld/mysqld.sock err="Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'test'' at line 14"
ts=2024-08-08T07:43:09.735Z caller=exporter.go:172 level=error msg="Error from scraper" scraper=info_schema.tables target=/run/mysqld/mysqld.sock err="Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'test'' at line 14"
ts=2024-08-08T07:44:09.737Z caller=exporter.go:172 level=error msg="Error from scraper" scraper=info_schema.tables target=/run/mysqld/mysqld.sock err="Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'test'' at line 14"

Because the code (upstream) uses single quotes around the database name -> https://github.com/prometheus/mysqld_exporter/blob/main/collector/info_schema_tables.go#L31-L44 ,

        SELECT
            TABLE_SCHEMA,
            TABLE_NAME,
            TABLE_TYPE,
            ifnull(ENGINE, 'NONE') as ENGINE,
            ifnull(VERSION, '0') as VERSION,
            ifnull(ROW_FORMAT, 'NONE') as ROW_FORMAT,
            ifnull(TABLE_ROWS, '0') as TABLE_ROWS,
            ifnull(DATA_LENGTH, '0') as DATA_LENGTH,
            ifnull(INDEX_LENGTH, '0') as INDEX_LENGTH,
            ifnull(DATA_FREE, '0') as DATA_FREE,
            ifnull(CREATE_OPTIONS, 'NONE') as CREATE_OPTIONS
          FROM information_schema.tables
          WHERE TABLE_SCHEMA = '%s'

And this does produce the situation where the index_length size from graph is smaller than querying the size directly, because those index_length rows are only partially populated:

MariaDB [information_schema]> select sum(index_length)/1024/1024 from tables where engine = 'InnoDB' and table_schema not in ('mysql', 'performance_schema', 'information_schema');
+-----------------------------+
| sum(index_length)/1024/1024 |
+-----------------------------+
|                 72.57812500 |
+-----------------------------+
1 row in set (0.002 sec)

But I reckon you wouldn't use an unregular database name when you used sysbench to generate data, right? So it could be another case on your side, which I haven't found out yet. Do you still have that server running? Can you see any errors in the mysqld_exporter log file?

gordan-bobic commented 2 months ago

No weird database names that would require quoting. Yes, I have that test VM, I'll dump the database from it and we can try to reproduce it somewhere else.

oblitorum commented 2 months ago

OK, cool

gordan-bobic commented 2 months ago

In a twist I'm sure you saw coming - I just restarted the whole thing for another check, "just to make sure" - and the graph is now showing the index size correctly, with MariaDB 11.4. I have no idea what was wrong before, but I guess it can't be fixed if it can't be reproduced.

Can you please change the labels from "Index Length Size" to "Index Size"?

For table_definition_cache, can you please adjust the equation so that if count(tables) is <= 400, advise 400 (default)? Beyond that, use the current equation (round up to next power of 2).

oblitorum commented 2 months ago

OK, no problem

oblitorum commented 2 months ago

For table_definition_cache proposed value, I thought I already limited the minimal value to 400. Was it suggesting a value below 400 on your side?

gordan-bobic commented 2 months ago

No, it was suggesting 512 when the number of tables is 260 with the current value of 400. In that case it should be suggesting sticking with 400.

oblitorum commented 2 months ago

Oh, I see what you mean, will adjust it.

shatteredsilicon / ssm-submodules

Tuning Advisor Adjustments #289