scylladb / scylla-monitoring

Simple monitoring of Scylla with Grafana
https://scylladb.github.io/scylla-monitoring/
Apache License 2.0

RAID0 partition metrics and the metrics of the individual NVMe disks that comprise it don't match #2276

Open vladzcloudius opened 4 months ago

vladzcloudius commented 4 months ago

Installation details
Panel Name: Disk Writes/Reads
Dashboard Name: OS Metrics
Scylla-Monitoring Version: 4.7.1
Scylla-Version: 2024.1.3-0.20240401.64115ae91a55
Kernel version on all nodes: 5.15.0-1058-gcp

Description
The throughput (bytes or OPS) reported for the RAID0 volume (md0 in the screenshots below) is supposed to equal the sum of the corresponding values of the physical disks that comprise it. However, it is far from that; in some cases, as in the screenshots below, the value reported for md0 is even lower than the sum. In the example below, md0 is a RAID0 volume assembled from 4 NVMe disks: nvme0n1 through nvme0n4. A direct check against node_exporter's raw counters is sketched after the screenshots.

Here is a screenshot showing md0 and only nvme0n1 from all nodes (the picture is the same for all other disks):

[screenshot: Disk Writes/Reads panel, md0 vs nvme0n1 across all nodes]

Here you can see the values from all disks on a single node clearly showing the problem:

[screenshot: Disk Writes/Reads panel, md0 and all member disks on a single node]
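
For reference, one way to check whether the mismatch is already present in the raw counters the dashboard consumes is to query node_exporter directly. A minimal sketch, assuming node_exporter runs on its default port 9100 on the node and using the device names from this report:

# Dump node_exporter's cumulative read-byte counters for md0 and its
# member disks; take two snapshots a few seconds apart and diff them
# to get per-device rates comparable to the dashboard panel.
curl -s http://localhost:9100/metrics \
  | grep -E '^node_disk_read_bytes_total\{device="(md0|nvme0n[1-4])"\}'

If these counters already disagree with iostat, the exporter is at fault; if they agree, the dashboard query is the suspect.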

I ran iostat on one of the nodes to check whether this might be a kernel issue, but it is not: iostat shows values that add up as expected:

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz  aqu-sz  %util
md0            323.00  99448.00     0.00   0.00    2.12   307.89    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.68   9.60
nvme0n1         86.00  24836.00     0.00   0.00    2.50   288.79    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.21   6.80
nvme0n2         75.00  23640.00     0.00   0.00    2.33   315.20    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.17   4.80
nvme0n3         80.00  24832.00     0.00   0.00    2.40   310.40    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.19   6.00
nvme0n4         82.00  26140.00     0.00   0.00    2.35   318.78    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.19   6.80
sda              0.00      0.00     0.00   0.00    0.00     0.00  128.00    736.00     3.00   2.29    0.57     5.75    0.00      0.00     0.00   0.00    0.00     0.00    0.07   1.60

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.03    0.00    1.60    0.00    0.00   96.37

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz  aqu-sz  %util
md0            430.00 149624.00     0.00   0.00    2.17   347.96    1.00      4.00     0.00   0.00    0.00     4.00    0.00      0.00     0.00   0.00    0.00     0.00    0.93   8.80
nvme0n1         92.00  33124.00     0.00   0.00    2.70   360.04    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.25   7.60
nvme0n2         96.00  33804.00     0.00   0.00    2.41   352.12    1.00      4.00     0.00   0.00    0.00     4.00    0.00      0.00     0.00   0.00    0.00     0.00    0.23   8.40
nvme0n3         92.00  33256.00     0.00   0.00    2.41   361.48    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.22   6.80
nvme0n4        100.00  33056.00     0.00   0.00    2.19   330.56    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.22   6.80

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.37    0.00    1.31    0.00    0.00   96.32

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz  aqu-sz  %util
md0            290.00  98304.00     0.00   0.00    3.52   338.98    4.00     56.00     0.00   0.00    0.00    14.00    0.00      0.00     0.00   0.00    0.00     0.00    1.02   6.80
nvme0n1         88.00  29924.00     0.00   0.00    3.08   340.05    1.00     32.00     0.00   0.00    0.00    32.00    0.00      0.00     0.00   0.00    0.00     0.00    0.27   5.60
nvme0n2         85.00  27560.00     0.00   0.00    2.79   324.24    2.00     16.00     0.00   0.00    0.50     8.00    0.00      0.00     0.00   0.00    0.00     0.00    0.24   6.00
nvme0n3         75.00  28412.00     0.00   0.00    2.77   378.83    1.00      8.00     0.00   0.00    0.00     8.00    0.00      0.00     0.00   0.00    0.00     0.00    0.21   5.60
nvme0n4         92.00  28792.00     0.00   0.00    2.63   312.96    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.24   5.20
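
As a lighter-weight cross-check of the same kernel counters iostat reads, here is a sketch that sums the cumulative sectors-read column of /proc/diskstats over the member disks and compares it with md0 (field 6 is sectors read, in 512-byte units regardless of the device's native sector size):

# Compare cumulative bytes read by md0 with the sum over its members,
# straight from the kernel counters that both iostat and node_exporter
# are derived from. Device names are the ones from this report.
awk '$3 == "md0"          { md = $6 * 512 }
     $3 ~ /^nvme0n[1-4]$/ { members += $6 * 512 }
     END { printf "md0=%.0f members_sum=%.0f\n", md, members }' /proc/diskstats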

We saw similar behavior on multiple clusters.

vladzcloudius commented 4 months ago

cc @tarzanek @vreniers @mkeeneyj

amnonh commented 4 months ago

@vladzcloudius if I get it right, this is a node_exporter issue, right?

vladzcloudius commented 4 months ago

> @vladzcloudius if I get it right, this is a node_exporter issue, right?

Could be.
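
One way to narrow that down is to snapshot the kernel and exporter counters back to back. A sketch, again assuming the default port 9100 and the device names from this report:

# Compare the kernel's view of cumulative bytes read with node_exporter's.
# The counters keep advancing between the two reads, so small differences
# are expected; a systematic gap on md0 only would point at the exporter.
for dev in md0 nvme0n1 nvme0n2 nvme0n3 nvme0n4; do
  # Kernel view: sectors read (field 6 of /proc/diskstats) converted to bytes.
  kernel=$(awk -v d="$dev" '$3 == d { printf "%.0f\n", $6 * 512 }' /proc/diskstats)
  # Exporter view: the matching node_disk_read_bytes_total sample, also in bytes.
  exported=$(curl -s http://localhost:9100/metrics \
    | awk -v d="$dev" 'index($0, "node_disk_read_bytes_total{device=\"" d "\"}") == 1 { print $2 }')
  printf '%-8s kernel=%s exporter=%s\n' "$dev" "$kernel" "$exported"
done

If the two columns agree, the remaining suspect is the dashboard query rather than node_exporter.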

amnonh commented 4 months ago

@vladzcloudius could it be: https://github.com/prometheus/node_exporter/issues/2310