Closed KJ7LNW closed 9 months ago
Playing with it some more, its also a good indicator of how /sys/block/sda/queue/nr_requests
should be tuned.
We have --disk-avgqu
, --disk-avgrq
, and --disk-wait
. Do these have any of the information you're looking for?
dool --disk --io --disk-avgqu --disk-avgrq --disk-svctm --disk-tps --disk-util --disk-wait 15
There is a lot of data there, and I know what about 2% of it is :)
Both dool_disk_avgrq.py and dool_disk_avgqu.py use /proc/diskstats
but "in-flight" is stored in /sys/block/<dev>/stat
in column 9.
A new plugin would read the inflight info per-disk (from /sys/), as opposed to the way that /proc/diskstats
lists all disks with their status in one file.
Both of those plugins you linked do read from /sys/block/$DEV/stat
, they just use /proc/diskstats
as a starting point to know which disks to look for. Look on lines 38 and 37 of each plugin respectively.
This is my first attempt at writing a dool
plugin, but I think I figured out most of the pieces. Place the attached plugin in your ~/.dool/
directory and try running:
dool --disk-inflight
It's almost impossible for me to test this because none of my disks ever had any in-flight data to read from. Hence the call to random.randrange()
to generate test data. See if that plugin picks up the data you want and let us know.
File: dool_disk_inflight.py.gz
We have a gazillion disks, this is all I get:
# dool --disk-inflight
Terminal width too small, trimming output.
so I specified -D
but I don't think disk-inflight
is using -D
:
# dool --disk-inflight -D bcache0,nvme0n1,nvme1n1,sdb
Terminal width too small, trimming output.
It might work if disk-inflight uses -D
like avgq does:
]# dool --disk-avgqu -D bcache0,nvme0n1,nvme1n1,sdb
bcac┄nvme┄nvme┄sdb┄
avgq avgq avgq avgq
0.18┊0.68┊1.52┊2.11
413┊0.43┊1.00┊ 138
I just added support for -D
and checked it in to the next
branch. Try putting this file in your plug-in dir:
https://github.com/scottchiefbaker/dool/blob/next/plugins/dool_disk_inflight.py
This now filters appropriately: dool --disk-inflight sda,sdb
Looks like its working!
Questions:
--disk-util
, though I like the major heading above the minor heading in --disk-inflight
instead of the other way around. Maybe the existing --disk-* options should be made to look like --disk-inflight
?More compact is always welcome; we have quite a large dool going at all times to watch server load. This is what we run:
nice --20 dool --bytes -N enp193s0f0np0,enp193s0f1np1 -p --top-bio -D bcache0,nvme0n1,nvme1n1,sdb --disk-util --disk-inflight -S total -drcynmgs 1
Thanks, by the way. I really appreciate you writing up a plugin so quickly.
Personally I like the headers the way they are, but I think it makes sense to flip them for consistency. I can make that change pretty easily.
It's super easy to make the columns narrower, but it's going to make the headings really hard to read. Change the self.width = 7
line to whatever number you want and it will shrink the column accordingly. 4
seems to be common for the other disk plugins so we could go with that. It just makes the headers impossible to distinguish.
What do you think?
After some more thinking I was able to come up with a name shortener that I think makes sense. Dool supports a lot of disk types so here is a sample mapping to device names that are four characters:
Raw | Shorter |
---|---|
sda1 | sda1 |
hda14 | hd14 |
vda99 | vd99 |
hdb | hdb |
nvme0n1 | nv01 |
nvme1n1 | nv11 |
md123 | m123 |
md124 | m124 |
md125 | m125 |
mmcblk7p50 | m750 |
VxVM4 | VxV4 |
dm-5 | dm-5 |
I updated the plugin with these changes on the next
branch. If you can test with these new changes and smaller four character column widths that would be much appreciated.
The change seems to have broken "wrap prevention" where it puts a ">" on the right side indicating the window is too small to prevent wrapping. It still puts the ">" but it wraps anyway.
dool --bytes -N enp193s0f0np0,enp193s0f1np1 -p --top-bio -D bcache0,nvme0n1,nvme1n1,sda --disk-util --disk-inflight -S total -drcynmgs 1
I'm still learning the ins and outs of the plugin system. Try that change I just landed and see if it solves your issue.
That looks right, thanks!
I like the new bdev naming convention, too.
About the new naming conventions, not all plugins use the same names. Notably, --disk-util
shows bcac┄nvme┄nvme┄sdb
whereas --disk-inflight
shows bca0┄nv01┄nv11┄sdb┄
.
Some of the other plugins use longer names because they have room (like --disk
), which is fine since they have the room. Maybe it should change, maybe not, but I thought I'd point it out. At least of the ones we use, these are the ones that appear to be different. I've not checked the other --disk-*
plugins:
--disk
--disk-util
I'm not necessarily suggesting that you need to change anything, but I thought I'd let you know since you wanted feedback on the new naming convention change. If you were to make a change, it would be nice to see --disk-util
use the new naming style because these "nvme" names are not distinguishable: bcac┄nvme┄nvme┄sdb
. Perhaps audit the other --disk-*
plugins, too.
The new --disk-inflight
names are nice and readable: bca0┄nv01┄nv11┄sdb
Here's a sample from the -next build:
# dool --bytes -N enp193s0f0np0,enp193s0f1np1 -p --top-bio -D bcache0,nvme0n1,nvme1n1,sdb --disk-util --disk-inflight -S total -drcynmgs 1
SUMMARY
Add in-flight IOs for block devices:
The inflight column was invaluable in solving this hung disk issue. Would be nice to see it in
dool
, perhaps as--disk-inflight
. You can see them live as follows:Note that the inflight data is in
/sys/block/<dev>/stat
but is not available in/proc/diskstats
.ISSUE TYPE
DOOL VERSION
1.3.x
OS / ENVIRONMENT
Oracle Linux 9