RemixVSL / iomemory-vsl

Updated Fusion-io iomemory VSL Linux (version 3.2.16) driver for recent kernels.

[BUG] ioDrive2 fails to attach on Linux 6.8.4 #128

Closed mx-shift closed 4 months ago

mx-shift commented 4 months ago

Bug description

Proxmox recently updated their Linux kernel to 6.8.4. While current iomemory-vsl master builds fine, it fails during attach and leaves the ioDrive2 in minimal driver mode. Reverting to 6.5.13 gets it working again.

How to reproduce

Load iomemory-vsl under 6.8.4. See the following in dmesg:

[  401.893577] <6>fioinf ioDrive driver 6.5.13-1-0a12f15-3.2.16.1731       loading...
[  401.894421] <6>fioinf ioDrive 0000:02:00.0: mapping controller on BAR 5
[  401.894504] <6>fioinf ioDrive 0000:02:00.0: MSI enabled
[  401.894506] <6>fioinf ioDrive 0000:02:00.0: using MSI interrupts
[  401.924683] <6>fioinf ioDrive 0000:02:00.0.0: Starting master controller
[  401.991177] <6>fioinf ioDrive 0000:02:00.0.0: PMP Address: 1 1 1
[  402.167221] <6>fioinf ioDrive 0000:02:00.0.0: SMP Controller Firmware APP  version 1.0.34 0
[  402.167230] <6>fioinf ioDrive 0000:02:00.0.0: SMP Controller Firmware BOOT version 0.0.9 1
[  404.279254] <6>fioinf ioDrive 0000:02:00.0.0: Required PCIE bandwidth 2.000 GBytes per sec
[  404.279259] <6>fioinf ioDrive 0000:02:00.0.0: Board serial number is 1420D0037
[  404.279260] <6>fioinf ioDrive 0000:02:00.0.0: Adapter serial number is 1420D0037
[  404.279262] <6>fioinf ioDrive 0000:02:00.0.0: Default capacity        1205.000 GBytes
[  404.279263] <6>fioinf ioDrive 0000:02:00.0.0: Default sector size     512 bytes
[  404.279264] <6>fioinf ioDrive 0000:02:00.0.0: Rated endurance         17.00 PBytes
[  404.279266] <6>fioinf ioDrive 0000:02:00.0.0: 100C temp range hardware found
[  404.279267] <6>fioinf ioDrive 0000:02:00.0.0: Maximum capacity        1294.000 GBytes
[  406.149345] <6>fioinf ioDrive 0000:02:00.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
[  406.149356] <6>fioinf ioDrive 0000:02:00.0.0: Platform version 16
[  406.149360] <6>fioinf ioDrive 0000:02:00.0.0: Firmware VCS version 116786 [0x1c832]
[  406.149367] <6>fioinf ioDrive 0000:02:00.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
[  406.152171] <6>fioinf ioDrive 0000:02:00.0.0: Powercut flush: Enabled
[  406.491263] <6>fioinf ioDrive 0000:02:00.0.0: PCIe power monitor enabled (master). Limit set to 24.750 watts.
[  406.491270] <6>fioinf ioDrive 0000:02:00.0.0: Thermal monitoring: Enabled
[  406.491272] <6>fioinf ioDrive 0000:02:00.0.0: Hardware temperature alarm set for 100C.
[  406.650305] <6>fioinf ioDrive 0000:02:00.0: Found device fct0 (HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0) on pipeline 0
[  406.650513] <3>fioerr HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0: failed to map append request
[  406.650516] <3>fioerr HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0: request page program 0000000058fe8dd3 failed -22
[  407.554365] <6>fioinf ioDrive 0000:02:00.0.0: stuck flush request on startup detected, retry iteration 1 of 3...
[  407.554372] <6>fioinf ioDrive 0000:02:00.0.0: Starting master controller
[  407.621326] <6>fioinf ioDrive 0000:02:00.0.0: PMP Address: 1 1 1
[  407.797339] <6>fioinf ioDrive 0000:02:00.0.0: SMP Controller Firmware APP  version 1.0.34 0
[  407.797347] <6>fioinf ioDrive 0000:02:00.0.0: SMP Controller Firmware BOOT version 0.0.9 1
[  409.909398] <6>fioinf ioDrive 0000:02:00.0.0: Required PCIE bandwidth 2.000 GBytes per sec
[  409.909407] <6>fioinf ioDrive 0000:02:00.0.0: Board serial number is 1420D0037
[  409.909410] <6>fioinf ioDrive 0000:02:00.0.0: Adapter serial number is 1420D0037
[  409.909415] <6>fioinf ioDrive 0000:02:00.0.0: Default capacity        1205.000 GBytes
[  409.909418] <6>fioinf ioDrive 0000:02:00.0.0: Default sector size     512 bytes
[  409.909421] <6>fioinf ioDrive 0000:02:00.0.0: Rated endurance         17.00 PBytes
[  409.909424] <6>fioinf ioDrive 0000:02:00.0.0: 100C temp range hardware found
[  409.909427] <6>fioinf ioDrive 0000:02:00.0.0: Maximum capacity        1294.000 GBytes
[  412.021482] <6>fioinf ioDrive 0000:02:00.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
[  412.021493] <6>fioinf ioDrive 0000:02:00.0.0: Platform version 16
[  412.021496] <6>fioinf ioDrive 0000:02:00.0.0: Firmware VCS version 116786 [0x1c832]
[  412.021503] <6>fioinf ioDrive 0000:02:00.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
[  412.024258] <6>fioinf ioDrive 0000:02:00.0.0: Powercut flush: Enabled
[  412.279549] <3>fioerr ioDrive 0000:02:00.0.0: could not find canonical value across 24 pads
[  412.579673] <3>fioerr ioDrive 0000:02:00.0.0: MINIMAL MODE DRIVER: hardware failure.
[  412.711446] <6>fioinf ioDrive 0000:02:00.0: Found device fct0 (HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0) on pipeline 0
[  412.711629] <6>fioinf fct0: stuck flush request got better on retry.
[  412.711631] <6>fioinf HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0: probed fct0
[  412.711633] <6>fioinf HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0: Attaching explicitly disabled
[  412.711636] <3>fioerr HP 1205GB MLC PCIe ioDrive2 for ProLiant Servers 0000:02:00.0: auto attach failed with error EINVAL: Invalid argument
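When triaging a log like the one above, it can help to pull out only the `fioerr` lines and check whether the driver fell back to minimal mode. The following is a minimal sketch, not part of the driver or its tooling. The helper names `fio_errors` and `in_minimal_mode` are made up for illustration, and the regular expression assumes the line layout seen in this thread (`[timestamp] <level>fioerr message`).

```python
import re
from typing import List, Tuple

# Matches dmesg lines in the layout shown in this thread, for example:
# [  412.579673] <3>fioerr ioDrive 0000:02:00.0.0: MINIMAL MODE DRIVER: hardware failure.
# The "<N>" log level prefix is optional because some lines in the logs lack it.
FIO_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s*(?:<(?P<level>\d)>)?(?P<tag>fio\w+)\s+(?P<msg>.*)$"
)


def fio_errors(dmesg_text: str) -> List[Tuple[float, str]]:
    """Return (timestamp, message) for every fioerr line in dmesg output."""
    errors = []
    for line in dmesg_text.splitlines():
        match = FIO_LINE.match(line)
        if match and match.group("tag") == "fioerr":
            errors.append((float(match.group("ts")), match.group("msg")))
    return errors


def in_minimal_mode(dmesg_text: str) -> bool:
    """True if any fioerr line reports the minimal-mode fallback."""
    return any("MINIMAL MODE DRIVER" in msg for _, msg in fio_errors(dmesg_text))
```

Feeding it the output of `sudo dmesg` (for example via `subprocess.run`) would list the two "failed to map append request" / "request page program" errors first, which is where the failure in this report begins.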

Environment information

Information about the system the module is used on

  1. Linux kernel compiled against (uname -a): 6.8.4-2-pve
  2. The C compiler version used (gcc --version): gcc (Debian 12.2.0-14) 12.2.0
  3. distribution, and version (cat /etc/os-release): Proxmox 8.2 (Debian 12)
  4. Tag or Branch of iomemory-vsl that is being compiled: 0a12f1507186a8088697588a8f9462babbf8279e
  5. FIO device used, if applicable
    • fio-status: ioDrive2 Adapter Controller, Product Number:00D8407, SN:1349D016A
    • lspci -b -nn: 0a:00.0 Mass storage controller [0180]: SanDisk ioDrive2 [1aed:2001] (rev 04)
snuf commented 4 months ago

@mx-shift the driver has been tested on Arch with kernel 6.8.1 and works as it should:

[vagrant@fio2 ~]$ uname -a
Linux fio2 6.8.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 16 Mar 2024 17:15:35 +0000 x86_64 GNU/Linux
[vagrant@fio2 ~]$ sudo dmesg | grep fio
[    1.164400] systemd[1]: Hostname set to <fio2>.
[  233.615062] <6>fioinf VSL configuration hash: 8f82ea05bdf1195cb400fb48e4ef09fc49b3c1aa
[  233.615100] <6>fioinf 
[  233.615101] <6>fioinf Copyright (c) 2006-2014 Fusion-io, Inc. (acquired by SanDisk Corp. 2014)
[  233.615102] <6>fioinf Copyright (c) 2014-2016 SanDisk Corp. and/or all its affiliates. (acquired by Western Digital Corp. 2016)
[  233.615103] <6>fioinf Copyright (c) 2016-2018 Western Digital Technologies, Inc. All rights reserved.
[  233.615104] <6>fioinf For Terms and Conditions see the License file included
[  233.615105] <6>fioinf with this driver package.
[  233.615106] <6>fioinf 
[  233.615107] <6>fioinf ioDrive driver 6.8.1-arch1-0a12f15-3.2.16.1731    loading...
[  233.660365] <6>fioinf ioDrive 0000:00:04.0: mapping controller on BAR 5
[  233.674883] <6>fioinf ioDrive 0000:00:04.0: MSI enabled
[  233.675301] <6>fioinf ioDrive 0000:00:04.0: using MSI interrupts
[  233.705758] <6>fioinf ioDrive 0000:00:04.0.0: Starting master controller
[  234.755443] <6>fioinf ioDrive 0000:00:04.0.0: Adapter serial number is 440178
[  235.807267] <6>fioinf ioDrive 0000:00:04.0.0: Board serial number is 443248
[  235.807272] <6>fioinf ioDrive 0000:00:04.0.0: Default capacity        320.000 GBytes
[  235.807274] <6>fioinf ioDrive 0000:00:04.0.0: Default sector size     512 bytes
[  235.807275] <6>fioinf ioDrive 0000:00:04.0.0: Rated endurance         5.00 PBytes
[  235.807276] <6>fioinf ioDrive 0000:00:04.0.0: 85C temp range hardware found
[  235.812869] <6>fioinf ioDrive 0000:00:04.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
[  235.812872] <6>fioinf ioDrive 0000:00:04.0.0: Platform version 10 
[  235.812873] <6>fioinf ioDrive 0000:00:04.0.0: Firmware VCS version 116786 [0x1c832]
[  235.812886] <6>fioinf ioDrive 0000:00:04.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
[  235.815306] <6>fioinf ioDrive 0000:00:04.0.0: Powercut flush: Enabled
[  235.919582] <6>fioinf ioDrive 0000:00:04.0.0: PCIe power monitor enabled (master). Limit set to 24.750 watts.
[  235.919589] <6>fioinf ioDrive 0000:00:04.0.0: Thermal monitoring: Enabled
[  235.919592] <6>fioinf ioDrive 0000:00:04.0.0: Hardware temperature alarm set for 85C.
[  235.925594] <6>fioinf ioDrive 0000:00:04.0: Found device fct0 (Fusion-io ioDrive Duo 640GB 0000:00:04.0) on pipeline 0
[  236.785698] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: probed fct0
[  236.871165] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: sector_size=4096
[  236.871170] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: setting channel range data to [2 .. 2047]
[  236.905814] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Found metadata in EBs 1416-1416, loading...
[  236.936095] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: setting recovered append point 1416+198180864
[  236.939577] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Creating device of size 320000000000 bytes with 78125000 sectors of 4096 bytes (39081617 mapped).
[  236.945157] fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Creating block device fioa: major: 252 minor: 0 sector size: 4096...
[  236.946079]  fioa: fioa1
[  236.946220] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Attach succeeded.
[vagrant@fio2 bin]$ sudo ./fio-status -a

Found 1 ioMemory device in this system with 1 ioDrive Duo
Driver version: 3.2.16 build 1731

Adapter: Dual Adapter
    Fusion-io ioDrive Duo 640GB, Product Number:FS3-204-320-CS, SN:440178, FIO SN:440178
    ioDrive Duo HL, PN:00190000108
    External Power: NOT connected
    PCIe Bus voltage: avg 12.06V min 12.01V max 12.12V
    PCIe Bus current: avg 0.87A max 1.67A
    PCIe Bus power: avg 10.55W max 20.12W
    PCIe Power limit threshold: 24.75W
    PCIe slot available power: unavailable
    Connected ioMemory modules:
      fct0: SN:443248

fct0    Attached
    ioDIMM3 320G MLC, SN:443248
    ioDIMM3 320G MLC, PN:00276700903
    Located in slot 0 Upper of ioDrive Duo HL SN:440178
    Powerloss protection: protected
    PCI:00:04.0, Slot Number:3
    Vendor:1aed, Device:1005, Sub vendor:1aed, Sub device:1010
    Firmware v7.1.17, rev 116786 Public
    320.00 GBytes device size
    Format: v500, 78125000 sectors of 4096 bytes
    PCIe slot available power: 25.00W
    PCIe negotiated link: 4 lanes at 2.5 Gt/sec each, 1000.00 MBytes/sec total
    Internal temperature: 49.22 degC, max 49.22 degC
    Internal voltage: avg 1.02V, max 1.03V
    Aux voltage: avg 2.48V, max 2.48V
    Reserve space status: Healthy; Reserves: 100.00%, warn at 10.00%
    Active media: 100.00%
    Rated PBW: 5.00 PB, 99.35% remaining
    Lifetime data volumes:
       Physical bytes written: 32,654,478,444,208
       Physical bytes read   : 36,373,766,874,376
    RAM usage:
       Current: 38,098,432 bytes
       Peak   : 38,098,432 bytes
    Contained VSUs:
      fioa: ID:0, UUID:67dde0f8-26ce-48fb-bca1-a2e1bcb213ea

fioa    State: Online, Type: block device
    ID:0, UUID:67dde0f8-26ce-48fb-bca1-a2e1bcb213ea
    320.00 GBytes device size
    Format: 78125000 sectors of 4096 bytes

[vagrant@fio2 bin]$ 

Will have a look at proxmox to see if there is anything to reproduce with the card I have.

snuf commented 4 months ago

@mx-shift it seems Proxmox with 6.8.4-2 is also fine, so I can't reproduce it on my end (driver load log below). Can you reproduce the behavior reliably, i.e. does it happen every time you boot with 6.8?

Linux fio2 6.8.4-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.4-2 (2024-04-10T17:36Z) x86_64 GNU/Linux
vagrant@fio2:~/iomemory-vsl/root/usr/src/iomemory-vsl-3.2.16$ sudo dmesg | grep fio
[    1.744460] systemd[1]: Hostname set to <fio2>.
[  127.464299] <6>fioinf VSL configuration hash: 8f82ea05bdf1195cb400fb48e4ef09fc49b3c1aa
[  127.464910] <6>fioinf 
[  127.465088] <6>fioinf Copyright (c) 2006-2014 Fusion-io, Inc. (acquired by SanDisk Corp. 2014)
[  127.465709] <6>fioinf Copyright (c) 2014-2016 SanDisk Corp. and/or all its affiliates. (acquired by Western Digital Corp. 2016)
[  127.466512] <6>fioinf Copyright (c) 2016-2018 Western Digital Technologies, Inc. All rights reserved.
[  127.467172] <6>fioinf For Terms and Conditions see the License file included
[  127.467669] <6>fioinf with this driver package.
[  127.468003] <6>fioinf 
[  127.468181] <6>fioinf ioDrive driver 6.8.4-2-0a12f15-3.2.16.1731        loading...
[  127.509923] <6>fioinf ioDrive 0000:00:04.0: mapping controller on BAR 5
[  127.529562] <6>fioinf ioDrive 0000:00:04.0: MSI enabled
[  127.530098] <6>fioinf ioDrive 0000:00:04.0: using MSI interrupts
[  127.561224] <6>fioinf ioDrive 0000:00:04.0.0: Starting master controller
[  128.621685] <6>fioinf ioDrive 0000:00:04.0.0: Adapter serial number is 440178
[  129.678504] <6>fioinf ioDrive 0000:00:04.0.0: Board serial number is 443248
[  129.679034] <6>fioinf ioDrive 0000:00:04.0.0: Default capacity        320.000 GBytes
[  129.679579] <6>fioinf ioDrive 0000:00:04.0.0: Default sector size     512 bytes
[  129.680105] <6>fioinf ioDrive 0000:00:04.0.0: Rated endurance         5.00 PBytes
[  129.680643] <6>fioinf ioDrive 0000:00:04.0.0: 85C temp range hardware found
[  129.686804] <6>fioinf ioDrive 0000:00:04.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
[  129.687684] <6>fioinf ioDrive 0000:00:04.0.0: Platform version 10 
[  129.688123] <6>fioinf ioDrive 0000:00:04.0.0: Firmware VCS version 116786 [0x1c832]
[  129.688684] <6>fioinf ioDrive 0000:00:04.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
[  129.691917] <6>fioinf ioDrive 0000:00:04.0.0: Powercut flush: Enabled
[  129.797107] <6>fioinf ioDrive 0000:00:04.0.0: PCIe power monitor enabled (master). Limit set to 24.750 watts.
[  129.797875] <6>fioinf ioDrive 0000:00:04.0.0: Thermal monitoring: Enabled
[  129.798359] <6>fioinf ioDrive 0000:00:04.0.0: Hardware temperature alarm set for 85C.
[  129.804923] <6>fioinf ioDrive 0000:00:04.0: Found device fct0 (Fusion-io ioDrive Duo 640GB 0000:00:04.0) on pipeline 0
[  130.617162] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: probed fct0
[  130.702411] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: sector_size=4096
[  130.702994] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: setting channel range data to [2 .. 2047]
[  130.738848] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Found metadata in EBs 1794-1794, loading...
[  130.769281] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: setting recovered append point 1794+198180864
[  130.773678] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Creating device of size 320000000000 bytes with 78125000 sectors of 4096 bytes (39081617 mapped).
[  130.776623] fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Creating block device fioa: major: 251 minor: 0 sector size: 4096...
[  130.779017]  fioa: fioa1
[  130.779795] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:00:04.0: Attach succeeded.
bulgaru commented 4 months ago

Hey!

This is what I get with Proxmox when attempting the kernel update:

Setting up proxmox-kernel-6.8.4-2-pve-signed (6.8.4-2) ...
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/dkms 6.8.4-2-pve /boot/vmlinuz-6.8.4-2-pve
dkms: running auto installation service for kernel 6.8.4-2-pve.
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/etc/dkms/framework.conf)
Sign command: /lib/modules/6.8.4-2-pve/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/source/dkms.conf)

Building module:
Cleaning build area...
'make' DKMS_KERNEL_VERSION=6.8.4-2-pve.....(bad exit status: 2)
Error! Bad return status for module build on kernel: 6.8.4-2-pve (x86_64)
Consult /var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/build/make.log for more information.
Error! One or more modules failed to install during autoinstall.
Refer to previous errors for more information.
dkms: autoinstall for kernel: 6.8.4-2-pve failed!
run-parts: /etc/kernel/postinst.d/dkms exited with return code 11
Failed to process /etc/kernel/postinst.d at /var/lib/dpkg/info/proxmox-kernel-6.8.4-2-pve-signed.postinst line 20.
dpkg: error processing package proxmox-kernel-6.8.4-2-pve-signed (--configure):
 installed proxmox-kernel-6.8.4-2-pve-signed package post-installation script subprocess returned error exit status 2
dpkg: dependency problems prevent configuration of proxmox-kernel-6.8:
 proxmox-kernel-6.8 depends on proxmox-kernel-6.8.4-2-pve-signed | proxmox-kernel-6.8.4-2-pve; however:
  Package proxmox-kernel-6.8.4-2-pve-signed is not configured yet.
  Package proxmox-kernel-6.8.4-2-pve is not installed.
  Package proxmox-kernel-6.8.4-2-pve-signed which provides proxmox-kernel-6.8.4-2-pve is not configured yet.

dpkg: error processing package proxmox-kernel-6.8 (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-default-kernel:
 proxmox-default-kernel depends on proxmox-kernel-6.8; however:
  Package proxmox-kernel-6.8 is not configured yet.

dpkg: error processing package proxmox-default-kernel (--configure):
 dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of proxmox-ve:
 proxmox-ve depends on proxmox-default-kernel; however:
  Package proxmox-default-kernel is not configured yet.

dpkg: error processing package proxmox-ve (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 proxmox-kernel-6.8.4-2-pve-signed
 proxmox-kernel-6.8
 proxmox-default-kernel
 proxmox-ve
E: Sub-process /usr/bin/dpkg returned an error code (1)
mx-shift commented 4 months ago

Yes, it seems to reproduce on every boot on two of my servers. I haven't had a moment to look at where "[ 17.888249] <3>fioerr IBM 1.20TB High IOPS MLC Mono Adapter 0000:0a:00.0: failed to map append request" comes from in the source code. I don't know if that is an ioDrive2-specific path or something about this being a 2-socket system.

Rick


mx-shift commented 4 months ago

@bulgaru you need to update iomemory-vsl for it to build at all. There were some recent changes that fixed compilation errors.
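One way to spot a stale checkout is to compare the short commit at the end of the DKMS module version (here `6.5.11-8-815f890`, from the path `/var/lib/dkms/iomemory-vsl/6.5.11-8-815f890/`) with the commit you expect to be building. The sketch below assumes the DKMS version always ends in `-<short commit>`, which matches the strings in this thread but is not confirmed by the project. The function names are hypothetical.

```python
def dkms_commit(module_version: str) -> str:
    """Return the trailing short commit of a version such as '6.5.11-8-815f890'.

    Assumption: the version string ends in '-<short commit>'.
    """
    return module_version.rsplit("-", 1)[-1]


def is_current(module_version: str, expected_commit: str) -> bool:
    """True if the short commit in the DKMS version is a prefix of expected_commit."""
    short = dkms_commit(module_version)
    return bool(short) and expected_commit.startswith(short)
```

With the values from this issue, `is_current("6.5.11-8-815f890", "0a12f1507186a8088697588a8f9462babbf8279e")` is false, which is consistent with the build failing until the source is updated.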

bulgaru commented 4 months ago

@mx-shift, your comment came as a blessing! Fixed the issue, and it is running smoothly on Linux 6.8.4-2-pve. Thank you very much!

PanosPetrou commented 4 months ago

Great work guys, I can confirm the driver now works with Fedora 40, Kernel 6.8.7.

snuf commented 4 months ago

@bulgaru @PanosPetrou are you on multi socket systems or not? Can you also include the dmesg info for loading the driver so I can see what devices you have to compare?

@mx-shift it seems like the problem in your case starts at kfio_sgl_dma_map. When spelunking in the lib I can see that it is called before the initial error. There have been some fixes/changes in 6.8 for iommu and dma; it will take more investigation to see whether they apply here.

bulgaru commented 4 months ago

Hey, @snuf

All's great, running ioDrive2. Tested with Dell R620 & Dell R630.

PanosPetrou commented 4 months ago

@snuf single socket, ordinary home pc, using an MSI A88X-G41 PC Mate motherboard and a Kaveri CPU. I attached the output of dmesg -T, fio-status -a and neofetch. dmesg.txt fio-status.txt neofetch

mx-shift commented 4 months ago

For reference, I've reproduced this on the following systems:

Supermicro X10SRi-F
Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
02:00.0 Mass storage controller [0180]: SanDisk ioDrive2 [1aed:2001] (rev 04)

Cisco UCS C240 M4
2x Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
0a:00.0 Mass storage controller [0180]: SanDisk ioDrive2 [1aed:2001] (rev 04)


snuf commented 4 months ago

@mx-shift after getting my dual socket Supermicro out of storage, dusting it off, and jumping through the Proxmox install and setup hoops, I was able to reproduce your issue by removing iommu=pt from the grub command line. Are you sure you have set iommu=pt in your /etc/default/grub and run update-grub afterwards? If not, please set it and let me know if that resolves the issue. You can verify whether it has been set with:

root@pve:/# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-6.8.4-2-pve root=/dev/mapper/pve-root ro quiet iommu=pt

The tested host motherboard is X9DRi-LN4+/X9DR3-LN4+ with the Fusion-io ioDrive2 1.205TB in slot 2 for CPU 2.
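The `/proc/cmdline` check above can also be scripted, for instance as a pre-flight test before loading the module. This is a minimal sketch, not something shipped with the driver. `has_kernel_param` and `read_cmdline` are hypothetical helper names, and the parser assumes parameters are whitespace-separated `key` or `key=value` tokens without quoting.

```python
from typing import Optional


def has_kernel_param(cmdline: str, name: str, value: Optional[str] = None) -> bool:
    """Return True if `name` (or `name=value` when value is given) is a whole token.

    Matching whole tokens avoids false positives such as 'intel_iommu=on'
    when looking for 'iommu=pt'.
    """
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key != name:
            continue
        if value is None or (sep and val == value):
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return handle.read().strip()
```

On a live host, `has_kernel_param(read_cmdline(), "iommu", "pt")` reports whether the running kernel was booted with the option. A value set in `/etc/default/grub` only shows up there after the grub configuration has been regenerated and the machine rebooted.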

mx-shift commented 4 months ago

Indeed, that was it. I can't remember if I ever added it. I may have just been lucky this whole time. Either that or something in the proxmox upgrade removed it. Either way, thank you.