RemixVSL / iomemory-vsl4

Updated Fusion-io iomemory VSL4 Linux (version 4.3.7) driver for recent kernels.

60-persistent-fio.rules doesnt work on proxmox #47

Closed by kino0924 1 year ago

kino0924 commented 2 years ago

Bug description

I am trying to add 60-persistent-fio.rules to udev (see https://github.com/RemixVSL/iomemory-vsl/issues/74).

How to reproduce

What are the steps to reproduce the reported issue.

git clone https://github.com/snuf/iomemory-vsl4.git
cd iomemory-vsl4
make dkms

"reboot"

cp iomemory-vsl4/tools/udev/rules.d/60-persistent-fio.rules /etc/udev/rules.d/

"reboot"

After the second reboot the system takes forever to boot, and syslog contains this:

Aug 25 03:31:32 pve systemd-udevd[574]: fioa1: Worker [588] processing SEQNUM=4954 is taking a long time
Aug 25 03:32:40 pve systemd-udevd[574]: fioa1: Worker [588] processing SEQNUM=4954 killed
Aug 25 03:32:40 pve systemd-udevd[574]: Worker [588] terminated by signal 9 (KILL)
Aug 25 03:32:40 pve systemd-udevd[574]: fioa1: Worker [588] failed
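
To see how long the worker was stuck between the warning and the kill, the syslog lines above can be paired up by device and SEQNUM. The following is only a rough sketch: the helper name `stalled_events` and the `year` default are assumptions made for illustration (syslog timestamps carry no year), and the regular expression is written against the four lines quoted above, not against every message systemd-udevd can emit.

```python
import re
from datetime import datetime

LOG = """\
Aug 25 03:31:32 pve systemd-udevd[574]: fioa1: Worker [588] processing SEQNUM=4954 is taking a long time
Aug 25 03:32:40 pve systemd-udevd[574]: fioa1: Worker [588] processing SEQNUM=4954 killed
Aug 25 03:32:40 pve systemd-udevd[574]: Worker [588] terminated by signal 9 (KILL)
Aug 25 03:32:40 pve systemd-udevd[574]: fioa1: Worker [588] failed
"""

# Matches only the "is taking a long time" and "killed" messages for a device.
PATTERN = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ systemd-udevd\[\d+\]: "
    r"(?P<dev>\S+): Worker \[(?P<worker>\d+)\] processing SEQNUM=(?P<seq>\d+) "
    r"(?P<state>is taking a long time|killed)\s*$"
)


def stalled_events(text, year=2022):
    """Map (device, seqnum) to the timestamps of its warning and kill messages.

    `year` is an assumption: classic syslog timestamps do not include one.
    """
    events = {}
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m is None:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        state = "warned" if m["state"].startswith("is taking") else "killed"
        events.setdefault((m["dev"], int(m["seq"])), {})[state] = ts
    return events


for (dev, seq), times in stalled_events(LOG).items():
    if "warned" in times and "killed" in times:
        delta = (times["killed"] - times["warned"]).total_seconds()
        print(f"{dev} SEQNUM={seq}: killed {delta:.0f}s after the first warning")
```

For the lines above this reports that the fioa1 event was killed 68 seconds after the first warning, so the rule processing for that partition appears to hang rather than fail quickly.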

Also, regardless of 60-persistent-fio.rules, I see this error in dmesg:

[   11.889181] ================================================================================
[   11.889244] UBSAN: array-index-out-of-bounds in /var/lib/dkms/iomemory-vsl4/5.15.39-4-fe767c6/build/kcpu.c:257:29
[   11.889274] index 1 is out of range for type 'uint32_t [*]'
[   11.889292] CPU: 0 PID: 319 Comm: kworker/0:5 Tainted: P        W IO      5.15.39-4-pve #1
[   11.889294] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.11.1 04/24/2019
[   11.889296] Workqueue: events work_for_cpu_fn
[   11.889306] Call Trace:
[   11.889308]  <TASK>
[   11.889311]  dump_stack_lvl+0x4a/0x63
[   11.889320]  dump_stack+0x10/0x16
[   11.889322]  ubsan_epilogue+0x9/0x49
[   11.889324]  __ubsan_handle_out_of_bounds.cold+0x44/0x49
[   11.889327]  ? __alloc_pages+0x17b/0x320
[   11.889332]  kfio_map_cpus_to_read_queues+0x1b2/0x630 [iomemory_vsl4]
[   11.889386]  ? dma_direct_alloc+0xee/0x2c0
[   11.889391]  ifio_16f86.0f44428653ea0b4766f7b44db386e2dd47d.4.3.7.1205+0xb8/0xd0 [iomemory_vsl4]
[   11.889427]  ifio_25005.a659bcb9c7a6b0c8868f9462e81739ff7b7.4.3.7.1205+0xff/0x6e0 [iomemory_vsl4]
[   11.889460]  ifio_75996.b4e60ba8ffd03bc396fde18541f7e3f09f3.4.3.7.1205+0xa37/0x1550 [iomemory_vsl4]
[   11.889489]  ? kfio_vmalloc+0xe/0x20 [iomemory_vsl4]
[   11.889510]  ? vmalloc+0x21/0x30
[   11.889516]  ifio_d8156.024cd992cd73ed3fc5751d93a4015c5a808.4.3.7.1205+0x5f8/0x730 [iomemory_vsl4]
[   11.889544]  ? acpi_register_gsi_ioapic+0x94/0x180
[   11.889550]  ? __request_region+0x68/0xa0
[   11.889554]  ? __pci_request_region+0x17a/0x340
[   11.889559]  iodrive_pci_attach+0x2a/0x330 [iomemory_vsl4]
[   11.889586]  ? __pci_request_selected_regions+0x40/0x80
[   11.889590]  iodrive_pci_probe+0x70/0x160 [iomemory_vsl4]
[   11.889608]  local_pci_probe+0x48/0x90
[   11.889611]  work_for_cpu_fn+0x17/0x30
[   11.889615]  process_one_work+0x228/0x3d0
[   11.889617]  worker_thread+0x223/0x420
[   11.889619]  ? process_one_work+0x3d0/0x3d0
[   11.889621]  kthread+0x127/0x150
[   11.889624]  ? set_kthread_struct+0x50/0x50
[   11.889628]  ret_from_fork+0x1f/0x30
[   11.889633]  </TASK>
[   11.889634] ================================================================================
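
When triaging a report like the one above, the useful parts are usually the source location on the UBSAN line and the first reliable stack frame inside the driver module. The sketch below pulls those out of a saved dmesg excerpt. The function name `summarize_ubsan` and the `SAMPLE` excerpt (a few lines copied from the trace above) are illustrative assumptions; the parsing relies only on the line shapes visible in this report.

```python
import re

SAMPLE = """\
[   11.889244] UBSAN: array-index-out-of-bounds in /var/lib/dkms/iomemory-vsl4/5.15.39-4-fe767c6/build/kcpu.c:257:29
[   11.889274] index 1 is out of range for type 'uint32_t [*]'
[   11.889324]  __ubsan_handle_out_of_bounds.cold+0x44/0x49
[   11.889327]  ? __alloc_pages+0x17b/0x320
[   11.889332]  kfio_map_cpus_to_read_queues+0x1b2/0x630 [iomemory_vsl4]
[   11.889386]  ? dma_direct_alloc+0xee/0x2c0
"""

# "UBSAN: <kind> in <path>:<line>:<column>"
HEADER = re.compile(
    r"UBSAN: (?P<kind>\S+) in (?P<path>\S+):(?P<line>\d+):(?P<col>\d+)"
)
# "[ts]  func+0xOFF/0xSIZE [module]"; a leading "? " marks an unreliable frame.
FRAME = re.compile(
    r"^\[\s*[\d.]+\]\s+(?P<guess>\? )?(?P<func>\S+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?\s*$"
)


def summarize_ubsan(text, module="iomemory_vsl4"):
    """Return the UBSAN kind, source location, and first reliable frame in `module`."""
    header = HEADER.search(text)
    if header is None:
        return None
    summary = {
        "kind": header["kind"],
        "file": header["path"].rsplit("/", 1)[-1],
        "line": int(header["line"]),
        "column": int(header["col"]),
        "function": None,
    }
    for line in text.splitlines():
        frame = FRAME.match(line)
        # Skip "?" frames and frames that belong to other modules or the core kernel.
        if frame and not frame["guess"] and frame["module"] == module:
            summary["function"] = frame["func"]
            break
    return summary


print(summarize_ubsan(SAMPLE))
```

Applied to the trace above, this points at kcpu.c line 257 and at kfio_map_cpus_to_read_queues as the first driver frame, which is the information a maintainer would need to compare against the current code on main.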


Environment information

Information about the system the module is used on

  1. Linux kernel compiled against (uname -a):
     Linux pve 5.15.39-4-pve #1 SMP PVE 5.15.39-4 (Mon, 08 Aug 2022 15:11:15 +0200) x86_64 GNU/Linux

  2. The C compiler version used (gcc --version):
     gcc (Debian 10.2.1-6) 10.2.1 20210110

  3. distribution, and version (cat /etc/os-release):
     PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
     NAME="Debian GNU/Linux"
     VERSION_ID="11"
     VERSION="11 (bullseye)"
     VERSION_CODENAME=bullseye
     ID=debian
     HOME_URL="https://www.debian.org/"
     SUPPORT_URL="https://www.debian.org/support"
     BUG_REPORT_URL="https://bugs.debian.org/"

  4. Tag or Branch of iomemory-vsl4 that is being compiled:
     main

  5. FIO device used, if applicable

    • fio-status

      Found 1 VSL driver package:
         4.3.7 build 1205 Driver: loaded

      Found 1 ioMemory device in this system

      Adapter: ioMono (driver 4.3.7)
          HPE 1.6TB Read Intensive-2 HHHL PCIe Workload Accelerator, Product Number:831735-B21, SN:3UN740K031
          PCIe Power limit threshold: 24.75W
          Connected ioMemory modules:
            fct0: 17:00.0, Product Number:831735-B21, SN:3UN740K031

      fct0    Attached
          ioMemory Adapter Controller, Product Number:831735-B21, SN:1521G0109
          PCI:17:00.0, Slot Number:4
          Firmware v8.9.9, rev 20180621 Public
          1600.00 GBytes device size
          Internal temperature: 50.69 degC, max 56.11 degC
          Reserve space status: Healthy; Reserves: 100.00%, warn at 10.00%
          Contained Virtual Partitions:
            fioa: ID:0, UUID:c58ae67e-63e8-4dce-adfe-03b0c8e9a586

      fioa    State: Online, Type: block device, Device: /dev/fioa
          ID:0, UUID:c58ae67e-63e8-4dce-adfe-03b0c8e9a586
          1600.00 GBytes device size

snuf commented 1 year ago

I've not seen this before. UBSAN points at something going wrong with the thread allocation. The main question is whether the UBSAN report appears before udev hangs and kills the fioa1 worker. That said, the piece of code UBSAN stumbles on has changed on main, so it might be worth pulling the latest main and checking whether that fixes it.

snuf commented 1 year ago

The fix for the thread issue is in main.