RemixVSL / iomemory-vsl

Updated Fusion-io iomemory VSL Linux (version 3.2.16) driver for recent kernels.

[BUG] Upgrade from Ubuntu 18.04 to 20.04 with Kernel 5.10.108 and DKMS #102

Closed (thesneakernet closed this issue 2 years ago)

thesneakernet commented 2 years ago

Bug description

The driver is installed with DKMS on Ubuntu 20.04 with kernel 5.10.108, but fio-status -a reports the device as fct0 instead of the expected FIOA.

Found 1 ioMemory device in this system
Driver version: 3.2.16 build 1731

Adapter: Single Controller Adapter
    Fusion-io ioScale 3.20TB, Product Number:F11-002-3T20-CS-0001, SN:1352D3913, FIO SN:1352D3913
    ioDrive2 Adapter Controller, PN:PA005064001
    External Power: NOT connected
    PCIe Power limit threshold: 24.75W
    PCIe slot available power: unavailable
    PCIe negotiated link: 4 lanes at 5.0 Gt/sec each, 2000.00 MBytes/sec total
    Connected ioMemory modules:
      fct0: Product Number:F11-002-3T20-CS-0001, SN:1352D3913

fct0 Status unknown: Driver is in MINIMAL MODE: Device has a hardware failure
    ioDrive2 Adapter Controller, Product Number:F11-002-3T20-CS-0001, SN:1352D3913
    !! ---> There are active errors or warnings on this device! Read below for details.
    ioDrive2 Adapter Controller, PN:PA005064001
    SMP(AVR) Versions: App Version: 1.0.9.0, Boot Version: 1.0.5.1
    Located in slot 0 Center of ioDrive2 Adapter Controller SN:1352D3913
    Powerloss protection: not available
    PCI:0b:00.0, Slot Number:8
    Vendor:1aed, Device:2001, Sub vendor:1aed, Sub device:2001
    Firmware v7.1.13, rev 109322 Public
    Geometry and capacity information not available.
    Format: not low-level formatted
    PCIe slot available power: unavailable
    PCIe negotiated link: 4 lanes at 5.0 Gt/sec each, 2000.00 MBytes/sec total
    Internal temperature: 63.00 degC, max 63.00 degC
    Internal voltage: avg 1.02V, max 1.02V
    Aux voltage: avg 2.50V, max 2.50V
    Rated PBW: 20.00 PB
    Lifetime data volumes:
      Physical bytes written: 0
      Physical bytes read : 0
    RAM usage:
      Current: 0 bytes
      Peak : 0 bytes

    ACTIVE WARNINGS:
        The ioMemory is currently running in a minimal state.

How to reproduce

What are the steps to reproduce the reported issue.

git clone https://github.com/snuf/iomemory-vsl.git
cd iomemory-vsl
git checkout v5.12.1
make dkms
make dkms succeeds and the module is installed in the kernel. fio-status -a shows the driver installed, and it persists across reboots.
On Ubuntu 18.04 the same device mounts and works correctly, reading and writing with no data corruption.

** poof, broken token **

Possible solution

Is a solution known? Type any plausible suggestions here; if none, leave clear.

Environment information

Information about the system the module is used on

  1. Linux kernel compiled against (uname -a) - 5.10.108-0510108-generic #202203230958 SMP Wed Mar 23 11:26:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  2. The C compiler version used (gcc --version) gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
  3. distribution, and version (cat /etc/os-release)
     NAME="Ubuntu"
     VERSION="20.04.4 LTS (Focal Fossa)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 20.04.4 LTS"
     VERSION_ID="20.04"
     HOME_URL="https://www.ubuntu.com/"
     SUPPORT_URL="https://help.ubuntu.com/"
     BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
     PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
     VERSION_CODENAME=focal
     UBUNTU_CODENAME=focal
  4. Tag or Branch of iomemory-vsl that is being compiled - v5.12.1
  5. FIO device used, if applicable
    • fio-status
    • lspci -b -nn
    • Found 1 ioMemory device in this system
      Driver version: 3.2.16 build 1731

Adapter: Single Controller Adapter
    Fusion-io ioScale 3.20TB, Product Number:F11-002-3T20-CS-0001, SN:1352D3913, FIO SN:1352D3913
    External Power: NOT connected
    PCIe Power limit threshold: 24.75W
    Connected ioMemory modules:
      fct0: Product Number:F11-002-3T20-CS-0001, SN:1352D3913

fct0 Status unknown: Driver is in MINIMAL MODE: Device has a hardware failure
    ioDrive2 Adapter Controller, Product Number:F11-002-3T20-CS-0001, SN:1352D3913
    !! ---> There are active errors or warnings on this device! Read below for details.
    Located in slot 0 Center of ioDrive2 Adapter Controller SN:1352D3913
    PCI:0b:00.0, Slot Number:8
    Firmware v7.1.13, rev 109322 Public
    Geometry and capacity information not available.
    Internal temperature: 63.00 degC, max 63.00 degC

    ACTIVE WARNINGS:
        The ioMemory is currently running in a minimal state.
snuf commented 2 years ago

Hi @thesneakernet, thanks for your bug report. Can you please provide the logs of the driver loading from your kern.log/dmesg? It seems like there is an error with the drive, and the log would probably show us what it is.

snuf commented 2 years ago

@thesneakernet it also seems like your firmware is not up to date: fio-status reports Firmware v7.1.13, rev 109322 Public. Please see if you can update the firmware. The firmware and driver are a pair.
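
One way to check the installed firmware version before and after an update is to pull it out of the fio-status output. The sketch below assumes the version appears in the form shown earlier in this thread (Firmware v7.1.13, rev 109322 Public); fio_firmware_version is a hypothetical helper name, not part of the driver or its utilities.

```shell
# Hypothetical helper: print the firmware version found in fio-status
# text read from stdin, e.g. "7.1.13" from
# "Firmware v7.1.13, rev 109322 Public".
# Assumes the "Firmware v<version>, rev <number>" layout seen in this thread.
fio_firmware_version() {
    sed -n 's/.*Firmware v\([0-9][0-9.]*\), rev.*/\1/p' | head -n 1
}

# Typical use on a system with the driver utilities installed:
#   fio-status -a | fio_firmware_version
```

Comparing the printed value before and after flashing is a quick way to confirm that the update actually took effect.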

thesneakernet commented 2 years ago

fio.txt

thesneakernet commented 2 years ago

Hi Snuf, Thanks for your fast responses! I really appreciate it. I've included my DMESG and I filtered it down to just FIO lines. How do I update the firmware? I can't find any documentation on how to do that.
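
The filtering described above can be sketched roughly as follows. This assumes the driver's kernel messages carry the fioinf / fioerr tags seen in the log posted in this thread; fio_lines and fio_errors are hypothetical helper names chosen for illustration.

```shell
# Hypothetical helpers that filter kernel log text read from stdin.

# Keep every line the driver logged (informational and error).
fio_lines() {
    grep -E 'fio(inf|err)'
}

# Keep only the driver's error-level lines.
fio_errors() {
    grep 'fioerr'
}

# Typical use on a live system (dmesg may require root):
#   dmesg | fio_lines > fio.txt
#   dmesg | fio_errors
```

Looking at the error-level lines alone makes it easier to spot the first failure in a long driver load sequence.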

thesneakernet commented 2 years ago

I figured out how to update the firmware. Doing that now.

thesneakernet commented 2 years ago

Firmware updated. DMESG below:

[ 8.847931] kernel: <6>fioinf VSL configuration hash: 8f82ea05bdf1195cb400fb48e4ef09fc49b3c1aa
[ 8.848424] kernel: <6>fioinf
[ 8.848426] kernel: <6>fioinf Copyright (c) 2006-2014 Fusion-io, Inc. (acquired by SanDisk Corp. 2014)
[ 8.848427] kernel: <6>fioinf Copyright (c) 2014-2016 SanDisk Corp. and/or all its affiliates. (acquired by Western Digital Corp. 2016)
[ 8.848432] kernel: <6>fioinf Copyright (c) 2016-2018 Western Digital Technologies, Inc. All rights reserved.
[ 8.848433] kernel: <6>fioinf For Terms and Conditions see the License file included
[ 8.848434] kernel: <6>fioinf with this driver package.
[ 8.848435] kernel: <6>fioinf
[ 8.848436] kernel: <6>fioinf ioDrive driver ecc66f1 loading...
[ 8.856579] kernel: <6>fioinf ioDrive 0000:0b:00.0: mapping controller on BAR 5
[ 8.856745] kernel: <6>fioinf ioDrive 0000:0b:00.0: MSI enabled
[ 8.856749] kernel: <6>fioinf ioDrive 0000:0b:00.0: using MSI interrupts
[ 8.887685] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Starting master controller
[ 8.985406] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: PMP Address: 1 1 1
[ 9.096322] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: SMP Controller Firmware APP version 1.0.21 0
[ 9.096327] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: SMP Controller Firmware BOOT version 1.0.6 1
[ 11.020354] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Required PCIE bandwidth 2.000 GBytes per sec
[ 11.020359] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Board serial number is 1352D3913
[ 11.020361] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Adapter serial number is 1352D3913
[ 11.020364] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Default capacity 3200.000 GBytes
[ 11.020366] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Default sector size 4096 bytes
[ 11.020368] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Rated endurance 20.00 PBytes
[ 11.020370] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: 100C temp range hardware found
[ 11.020372] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Maximum capacity 3200.000 GBytes
[ 12.620590] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
[ 12.620597] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Platform version 20
[ 12.620599] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Firmware VCS version 116786 [0x1c832]
[ 12.620606] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
[ 12.623943] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Powercut flush: Enabled
[ 12.864143] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: PCIe power monitor enabled (master). Limit set to 24.750 watts.
[ 12.864147] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Thermal monitoring: Enabled
[ 12.864149] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Hardware temperature alarm set for 100C.
[ 13.024097] kernel: <6>fioinf ioDrive 0000:0b:00.0: Found device fct0 (Fusion-io ioScale 3.20TB 0000:0b:00.0) on pipeline 0
[ 13.024256] kernel: <3>fioerr Fusion-io ioScale 3.20TB 0000:0b:00.0: failed to map append request
[ 13.024258] kernel: <3>fioerr Fusion-io ioScale 3.20TB 0000:0b:00.0: request page program 000000001f87d2ab failed -22
[ 13.680076] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: stuck flush request on startup detected, retry iteration 1 of 3...
[ 13.680079] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Starting master controller
[ 13.764021] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: PMP Address: 1 1 1
[ 13.912050] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: SMP Controller Firmware APP version 1.0.21 0
[ 13.912053] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: SMP Controller Firmware BOOT version 1.0.6 1
[ 14.628241] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Required PCIE bandwidth 2.000 GBytes per sec
[ 14.628245] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Board serial number is 1352D3913
[ 14.628247] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Adapter serial number is 1352D3913
[ 14.628249] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Default capacity 3200.000 GBytes
[ 14.628250] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Default sector size 4096 bytes
[ 14.628251] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Rated endurance 20.00 PBytes
[ 14.628252] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: 100C temp range hardware found
[ 14.628253] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Maximum capacity 3200.000 GBytes
[ 15.264552] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
[ 15.264559] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Platform version 20
[ 15.264561] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Firmware VCS version 116786 [0x1c832]
[ 15.264568] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
[ 15.267939] kernel: <6>fioinf ioDrive 0000:0b:00.0.0: Powercut flush: Enabled
[ 15.437915] kernel: <3>fioerr ioDrive 0000:0b:00.0.0: could not find canonical value across 30 pads
[ 15.735586] kernel: <3>fioerr ioDrive 0000:0b:00.0.0: MINIMAL MODE DRIVER: hardware failure.
[ 15.908060] kernel: <6>fioinf ioDrive 0000:0b:00.0: Found device fct0 (Fusion-io ioScale 3.20TB 0000:0b:00.0) on pipeline 0
[ 15.908295] kernel: <6>fioinf fct0: stuck flush request got better on retry.
[ 15.908297] kernel: <6>fioinf Fusion-io ioScale 3.20TB 0000:0b:00.0: probed fct0
[ 15.908298] kernel: <6>fioinf Fusion-io ioScale 3.20TB 0000:0b:00.0: Attaching explicitly disabled
[ 15.908300] kernel: <3>fioerr Fusion-io ioScale 3.20TB 0000:0b:00.0: auto attach failed with error EINVAL: Invalid argument

snuf commented 2 years ago

The fix is described in the README: https://github.com/RemixVSL/iomemory-vsl#important-note-for-newer-linux-kernels Closing this.