shatteredsilicon / ssm-submodules

GNU Affero General Public License v3.0
1 stars 2 forks source link

Add measurement / dashboard / graphs for InnoDB I/O Thread Queue Length #105

Closed gordan-bobic closed 1 year ago

gordan-bobic commented 1 year ago

In the output of

SHOW ENGINE InnoDB STATUS\G

there is a line that looks like this:

Pending normal aio reads: [2, 3, 1, 2] , aio writes: [1, 0, 1, 0] ,

This is useful for tunning the following settings:

innodb_read_io_threads
innodb_write_io_threads

If the pending reads or writes spend a significant amount of time above 0 and the disk I/O is not fully saturated, it means we need more I/O threads.

There is currently no graph for this in SSM.

Add a graph called "InnoDB I/O Thread Utilisation" on the InnoDB Metrics. One line per I/O thread (both read and write threads should be displayed), stacked on top of each other, representing the pending aio reads and pending aio writes.

On the right Y axis, we should have the same data as on the graph of "Disk I/O Utilization" as exists on the "Disk Performance" dashboard. This is because if the InnoDB Pending I/O is significantly above 0 but the disk I/O saturation is not maxed out, it means we need more I/O threads, and it is useful to see this on a single graph.

oblitorum commented 1 year ago

Does this looks OK?

image

Read/Write threads are named with index appended at the end

gordan-bobic commented 1 year ago

Yes, that looks good. :-)

gordan-bobic commented 1 year ago

Just to make sure - the above output:

Pending normal aio reads: [2, 3, 1, 2] , aio writes: [1, 0, 1, 0] ,

only looks like that when innodb_use_native_aio is enabled. Unfortunately, this should be disabled when using ZFS.

Without native aio enabled, the output seems to appear as follows:

--------
FILE I/O
--------
Pending flushes (fsync) log: 0; buffer pool: 0
507161 OS file reads, 258048 OS file writes, 194491 OS fsyncs
1 pending reads, 0 pending writes
3.33 reads/s, 9830 avg bytes/read, 900.37 writes/s, 34.32 fsyncs/s

I also notice that it doesn't aways show up in SHOW ENGINE INNODB STATUS\G output. Not much we can do about that, but be aware that the exporter will not find it on most scraping passes.

I think if it returns no output it means there are no pending reads or writes.

The above pasting is from MariaDB 10.5. Interestingly, the original format is returned from 10.3 even with aio disabled.

Just wanted to make sure that all of the main cases is covered because the output that has to be parsed does differ depending on configuration and version.

oblitorum commented 1 year ago

Thanks for clarifying this, you were right, these cases are not covered, I'll add some patches to cover these cases.

gordan-bobic commented 1 year ago

I think my interpretation about the pending reads when aio stats isn't listed was incomplete. You'll need to verify whether on newer versions aio queue length for threads is listed at all (even with 0s) with innodb_use_native_aio disabled. On 10.6, it doesn't seem to be. On 10.3 and 10.4, it is shown regardless.

When aio is disabled, this also seems relevant (from 10.3, also matches 10.4):

# mysql -e 'show engine innodb status\G' | grep -i 'i/o thread'
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (read thread)
I/O thread 4 state: waiting for i/o request (read thread)
I/O thread 5 state: waiting for i/o request (read thread)
I/O thread 6 state: waiting for i/o request (write thread)
I/O thread 7 state: waiting for i/o request (write thread)
I/O thread 8 state: waiting for i/o request (write thread)
I/O thread 9 state: waiting for i/o request (write thread)

On a 10.4 server I just looked at (native aio disabled)

# mysql -e 'show engine innodb status\G' | grep -i pend 
Pending normal aio reads: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] , aio writes: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] ,
Pending flushes (fsync) log: 0; buffer pool: 0
0 pending log flushes, 0 pending chkp writes
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0

Interestingly, I never get more than 8 threads listed for pending reads/writes, even when I have innodb_read_io_threads and innodb_write_io_threads turned all the way up to 64.

I'm struggling to generate enough load to make the numbers tick up from 0.

Will provide more examples when I can.

oblitorum commented 1 year ago

OK, I was doing some benchmark tests to generate enough load on mariadb 10.4, and continually running SHOW ENGINE ... to monitor the pending reads/writes output , and then I saw this

$ mysql -e "show engine innodb status \G" | grep -i pending
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [79, 8, 3, 24] ,
Pending flushes (fsync) log: 1; buffer pool: 0
0 pending log flushes, 1 pending chkp writes
...

[...]

$ mysql -e "show engine innodb status \G" | grep -i pending
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
Pending flushes (fsync) log: 0; buffer pool: 0
0 pending reads, 1 pending writes
1 pending log flushes, 0 pending chkp writes
...

Pending normal aio reads ... and xxx pending reads ... could appear at the same time! I think we were on the wrong direction, they are actually separated logs. I think Pending normal aio reads ... represents those I/O operations requested through AIO, and xxx pending reads ... represents those I/O operations requested through InnoDB background threads.

I think what we should do here is to sum them up, but I'm not very sure.

And they removed those AIO logs from mariadb 10.5+, with these changes:

https://github.com/MariaDB/server/commit/d8373fea5f099b694f0ab14e0d083388915de51c#diff-9ad78a90c176f897bb2dc367c152ec687ff55cc92dbe7a0c364845af536e23e1L4163-L4171

https://github.com/MariaDB/server/commit/dd07cfcecd4aabb0aeae9c4f5087f82b4080c1bd

That's why those AIO logs were not shown when you were testing it, which confused us. So for mariadb 10.5+, there is only xxx pending reads, xxx pending writes this kind of output. And I feel like something has changed internally, and this state actually includes 2 kinds of I/O operations.

gordan-bobic commented 1 year ago

Excellent bit of detective work to find where it got removed. Kudos.

So for aio, it's pretty obvious what we should do (not that for higher numbers of threads, the array will be bigger, 4 is the default, if I remember correctly).

Those non-aio pending reads/writes should probably be on a separate graph.

I am not sure if we can combine these two onto the same graph. If they are mutually exclusive, I guess we should show whichever of the two we have. But are there cases where we get both, and one set is not all 0? If that can happen, then these are different things and we should keep them separate.

The final bit, non-threaded display of pending reads, writes, log flushes, and checkpoint writes should be separate entries, probably on a 3rd separate graph.

The main downside I see is that we can end up with some graphs that are effectively dead on <= 10.4 and some graphs that are dead on >= 10.5. But I guess I can look at whether that happens while we are in the pre-release phase when I can deploy it on a few installations and see what I get back on different versions.

oblitorum commented 1 year ago

But are there cases where we get both, and one set is not all 0? If that can happen, then these are different things and we should keep them separate.

I just tested a few rounds, and never got both.

The final bit, non-threaded display of pending reads, writes, log flushes, and checkpoint writes should be separate entries, probably on a 3rd separate graph.

You mean add another dashboard that shows pending reads, pending writes, log flushes, checkpoint writes as separate lines?

gordan-bobic commented 1 year ago

I just tested a few rounds, and never got both.

OK, great. In that case we can use the same graph. As you suggested sum(type1+type2), so whichever we get, gets on the graph.

You mean add another dashboard that shows pending reads, pending writes, log flushes, checkpoint writes as separate lines?

Yes.

oblitorum commented 1 year ago

The final bit, non-threaded display of pending reads, writes, log flushes, and checkpoint writes should be separate entries, probably on a 3rd separate graph.

There is a problem with fetching pending log flushes and checkpoint writes -> MariaDB removed these 2 counters from the innodb status output on version 10.8+

pending log flushes was removed in this commit -> https://github.com/MariaDB/server/commit/685d958e38b825ad9829be311f26729cccf37c46#diff-3ba3b7d831068d7f11b598c75a5b79afac7d6e3947b3f6e4d530de67964a3612L1310-L1313 In the commit message of this commit, there are lines:

The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed:

os_log_fsyncs os_log_pending_fsyncs log_pending_log_flushes log_pending_checkpoint_writes

pending checkpoint writes was removed in this commit -> https://github.com/MariaDB/server/commit/934b2d605ec15322f2a5052ebb90e116bce8aaaa#diff-3ba3b7d831068d7f11b598c75a5b79afac7d6e3947b3f6e4d530de67964a3612L1039-L1042

Seems that they removed them because those counter had lost their meaning after some implementation changed. Should we just show them as 0 on mariadb 10.8+?

gordan-bobic commented 1 year ago

I guess we should just show them as "no data" if the data is not actually available. This is a little concerning because it seems there is no longer anything to base inferring of the optimal settings for

innodb_read_io_threads
innodb_write_io_threads

In the absence of any data, on the tuning adviser dashboard we should recommend the greater of 4 (default) and 1/4 of available CPU threads, up to a maximum of 64, as a reasonable heuristic.

oblitorum commented 1 year ago

OK, copied that.

And the 2 dashboards are going to be like this

image

image

gordan-bobic commented 1 year ago

Looks great. :-) I am impressed you managed to generate enough load to get it up to 70+ pending writes, that's some serious I/O load. Do we get separate read and write pending numbers? If so, can we split them up and show them stacked, since read and write threads are adjustable independently from each other.

oblitorum commented 1 year ago

I am impressed you managed to generate enough load to get it up to 70+ pending writes, that's some serious I/O load

Yeah, I used a tool call sysbench to generate that enough load

Do we get separate read and write pending numbers? If so, can we split them up and show them stacked, since read and write threads are adjustable independently from each other.

For the pending aio reads/writes output Pending normal aio reads: [2, 3, 1, 2] , aio writes: [1, 0, 1, 0], yes we can get that like what we did before. But for xxx pending reads, xxx pending writes, seems there is no thread info for that, how about we just show it as number/index 0?

gordan-bobic commented 1 year ago

Right, so can we have total queued/pending reads and writes separated rather than combined? Not stacked like I originally said.

oblitorum commented 1 year ago

By separated, you mean one line represents the total pending reads, and one line represents the total pending writes?

Wasn't this already separated as 2 lines?

Or you mean separated as in 2 different dashboards?

gordan-bobic commented 1 year ago

Oh, right. I couldn't see if reads and writes were separate lines in the screenshot. If it already is, great, please don't change..

oblitorum commented 1 year ago

OK, PRs are ready