RemixVSL / iomemory-vsl4

Updated Fusion-io iomemory VSL4 Linux (version 4.3.7) driver for recent kernels.

iostat shows weird numbers #31

Closed Magister closed 2 years ago

Magister commented 3 years ago

Bug description

Running iostat on the iomemory device shows implausible counters. For example, here are a few consecutive runs:

root@pve:/usr/local/src/iomemory-vsl4# iostat -d /dev/fioa
Linux 5.4.106-1-pve (pve)       14.05.21        _x86_64_        (24 CPU)

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
fioa          9683204258164,84 2420801064541,21 2420801064541,21 868082074056920076 868082074056920076

root@pve:/usr/local/src/iomemory-vsl4# iostat -d /dev/fioa
Linux 5.4.106-1-pve (pve)       14.05.21        _x86_64_        (24 CPU)

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
fioa          19366345868794,82 4841586467198,70 4841586467198,70 1736164148113840152 1736164148113840152

root@pve:/usr/local/src/iomemory-vsl4# iostat -d /dev/fioa
Linux 5.4.106-1-pve (pve)       14.05.21        _x86_64_        (24 CPU)

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
fioa              0,00         0,00         0,00        128          0

How to reproduce

Just make some reads/writes to the iomemory device and run iostat /dev/fioa
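A minimal reproduction, assuming the device is attached as /dev/fioa (reads only, so it is non-destructive; direct I/O bypasses the page cache so the counters actually move):

```shell
# generate some traffic on the ioMemory block device
dd if=/dev/fioa of=/dev/null bs=1M count=64 iflag=direct
# then sample the counters twice; the totals should stay plausible between runs
iostat -d /dev/fioa
sleep 1
iostat -d /dev/fioa
```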

Environment information

Information about the system the module is used on

  1. Linux kernel compiled against (uname -a): Linux pve 5.4.106-1-pve #1 SMP PVE 5.4.106-1 (Fri, 19 Mar 2021 11:08:47 +0100) x86_64 GNU/Linux
  2. The C compiler version used (gcc --version): gcc (Debian 8.3.0-6) 8.3.0
  3. distribution, and version (cat /etc/os-release): Proxmox VE 6.4
  4. Tag or Branch of iomemory-vsl4 that is being compiled: tag: v5.12.0
  5. FIO device used, if applicable
    • fio-status:
      
      # fio-status

Found 2 VSL driver packages:
   4.3.7 build 1205 Driver: loaded
   3.2.16 build 1731 Driver: not loaded

Found 1 ioMemory device in this system

Adapter: ioMono (driver 4.3.7)
    ioMemory SX350-1600, Product Number:SDFADAMOS-1T60-SF1, SN:1521G0367, FIO SN:1521G0367
    External Power Threshold Override: 74.75W
    PCIe Power limit threshold: 74.75W
    Connected ioMemory modules:
      fct0: 84:00.0, Product Number:SDFADAMOS-1T60-SF1, SN:1521G0367

fct0 Attached
    ioMemory Adapter Controller, Product Number:SDFADAMOS-1T60-SF1, SN:1521G0367
    PCI:84:00.0
    Firmware v8.9.9, rev 20180423 Public
    1600.00 GBytes device size
    Internal temperature: 45.28 degC, max 50.69 degC
    Reserve space status: Healthy; Reserves: 100.00%, warn at 10.00%
    Contained Virtual Partitions:
      fioa: ID:0, UUID:c7c241a4-9f79-4ae1-9900-da43e0692f1a

fioa State: Online, Type: block device, Device: /dev/fioa
    ID:0, UUID:c7c241a4-9f79-4ae1-9900-da43e0692f1a
    1600.00 GBytes device size

   * lspci -b -nn

84:00.0 Mass storage controller [0180]: SanDisk ioMemory HHHL [1aed:3002]

snuf commented 3 years ago

@Magister please look at the release notes for v5.12.1 and upgrade.

snuf commented 3 years ago

@Magister did you try updating, and did it work out?

Magister commented 3 years ago

@snuf just tried it, much better now, thanks! However, it seems the stats are not complete; for example, %util is always zero for some reason.

root@pve:~# iostat -dx /dev/fioa
Linux 5.4.114-1-pve (pve)       15.05.21        _x86_64_        (12 CPU)

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
fioa             0,47   18,25     44,38  18345,69     0,00     0,00   0,00   0,00    0,00    0,00   0,00    93,83  1005,22   0,00   0,00

root@pve:~# iostat -dx /dev/fioa
Linux 5.4.114-1-pve (pve)       15.05.21        _x86_64_        (12 CPU)

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
fioa             0,47   19,68     44,19  19763,37     0,00     0,00   0,00   0,00    0,00    0,00   0,00    93,83  1004,46   0,00   0,00

root@pve:~# iostat -dx /dev/fioa
Linux 5.4.114-1-pve (pve)       15.05.21        _x86_64_        (12 CPU)

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
fioa             0,47   19,63     44,09  19719,35     0,00     0,00   0,00   0,00    0,00    0,00   0,00    93,83  1004,46   0,00   0,00

Also, I tried the use_workqueue=3 param hoping to get more stats, but with it the device does not attach to the system (probably a topic for another issue?).

snuf commented 3 years ago

@Magister yeah, the stats are not complete: in_flight is no longer "easily" available to manipulate without throwing a tantrum (read: borking the kernel), hence it's missing for now. I've looked at a couple of ways to solve it, but nothing conclusive yet; at least there are no more garbage stats for the other fields. The use_workqueue=3 problem is definitely a separate thing, I would say; I'd have to go back and see what happened there and what the relevance would be.
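For context on why the missing in_flight accounting pins %util at zero: iostat derives %util from the io_ticks counter (field 10 of /proc/diskstats, milliseconds the device had I/O in flight), which the kernel only advances when the driver does in-flight accounting. A minimal sketch of the arithmetic, with hypothetical counter values:

```python
# How iostat computes %util: the delta of io_ticks (time the device had I/O
# in flight, in ms) across the sampling interval. If the driver never updates
# the kernel's in-flight accounting, io_ticks stays flat and %util reads 0.00
# no matter how busy the device is.

def util_percent(io_ticks_prev_ms: int, io_ticks_curr_ms: int, interval_ms: int) -> float:
    return 100.0 * (io_ticks_curr_ms - io_ticks_prev_ms) / interval_ms

# Driver with working accounting: 800 ms busy in a 1000 ms window -> 80% util.
print(util_percent(12_000, 12_800, 1_000))  # 80.0

# Driver that never touches io_ticks (the missing in_flight path here):
print(util_percent(5_000, 5_000, 1_000))    # 0.0
```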

snuf commented 3 years ago

@Magister use_workqueue=3 was dropped mainly because it was ancient and, judging from the original source code, largely in place to support ESXi. Because of that, it's not going to be brought back. We should probably weed out all references to it from the codebase and remove it completely. Was there a specific reason you were looking at it?

Magister commented 3 years ago

@snuf yes, it looks like with use_workqueue=3 there would be full stats. See https://support.hpe.com/hpesc/public/docDisplay?docId=kc0130950en_us&docLocale=en_US

snuf commented 2 years ago

We're not reintroducing use_workqueue=3; some fixes went in for some of the stats, but not all.