RemixVSL / iomemory-vsl

Updated Fusion-io iomemory VSL Linux (version 3.2.16) driver for recent kernels.

[BUG] Kernel memory overwrite attempt detected to SLUB object 'fusion_user_ll_request' #115

Open fake-name opened 1 year ago

fake-name commented 1 year ago

Bug description

As in the title. I'm trying to recover an ioDrive2 with a missing LEB map.

How to reproduce

Built from the current HEAD: f3a056d56e2ff3353cf5fb9c0e2a48f0fb0aebd7 using DKMS

root@flashittysan:~# fio-sure-erase -ty /dev/fct1
WARNING: sanitizing will destroy any existing data on the device!
Erasing blocks: [                    ] (  0%) -
Message from syslogd@flashittysan at Jan 21 18:21:59 ...
 kernel:[   76.037241] usercopy: Kernel memory overwrite attempt detected to SLUB object 'fusion_user_ll_request' (offset 0, size 3960)!
Erasing blocks: [                    ] (  0%) /
[   76.037241] usercopy: Kernel memory overwrite attempt detected to SLUB object 'fusion_user_ll_request' (offset 0, size 3960)!
[   76.037266] ------------[ cut here ]------------
[   76.037268] kernel BUG at mm/usercopy.c:99!
[   76.037276] invalid opcode: 0000 [#1] SMP PTI
[   76.037281] CPU: 0 PID: 1098 Comm: fio-sure-erase Tainted: P           O      5.15.83-1-pve #1
[   76.037287] Hardware name: To be filled by O.E.M. BTC79X5/Intel X79, BIOS 4.6.5 05/06/2021
[   76.037291] RIP: 0010:usercopy_abort+0x78/0x7a
[   76.037302] Code: e0 e1 9b 51 48 0f 45 d6 49 c7 c3 90 cc de 9b 4c 89 d1 57 48 c7 c6 83 43 dd 9b 48 c7 c7 30 cc de 9b 49 0f 45 f3 e8 ca 86 fe ff <0f> 0b 4c 89 e1 49 89 d8 44 89 ea 31 f6 48 29 c1 48 c7 c7 d2 cc de
[   76.037309] RSP: 0018:ffffb1bb8372ba70 EFLAGS: 00010246
[   76.037315] RAX: 0000000000000071 RBX: 0000000000000000 RCX: 0000000000000000
[   76.037320] RDX: 0000000000000000 RSI: ffff8b4677c20580 RDI: ffff8b4677c20580
[   76.037324] RBP: ffffb1bb8372ba88 R08: 0000000000000003 R09: 0000000000000001
[   76.037328] R10: 000000000000000a R11: 79706f6372657375 R12: 0000000000000f78
[   76.037332] R13: ffff8b454402b000 R14: 0000000000000000 R15: 0000000000000f78
[   76.037336] FS:  00007f24eb71f700(0000) GS:ffff8b4677c00000(0000) knlGS:0000000000000000
[   76.037342] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   76.037346] CR2: 00007f24eb71dd90 CR3: 000000010439e002 CR4: 00000000000606f0
[   76.037351] Call Trace:
[   76.037355]  <TASK>
[   76.037359]  __check_heap_object+0x174/0x1b0
[   76.037368]  __check_object_size+0x14f/0x160
[   76.037376]  kfio_copy_from_user+0x26/0x50 [iomemory_vsl]
[   76.037429]  ifio_7e3a3.ed40884c3016d13c6a23dedee8f561fbc42.3.2.16.1731+0x172/0xa70 [iomemory_vsl]
[   76.037466]  ? fusion_control_ioctl+0x22bb/0x35e0 [iomemory_vsl]
[   76.037506]  fusion_control_ioctl+0x23aa/0x35e0 [iomemory_vsl]
[   76.037544]  ? kernel_init_free_pages.part.0+0x4a/0x70
[   76.037550]  ? get_page_from_freelist+0xc31/0x11d0
[   76.037556]  ? page_counter_cancel+0x2e/0x80
[   76.037562]  ? obj_cgroup_charge_pages+0xf0/0x190
[   76.037566]  ? page_counter_uncharge+0x22/0x40
[   76.037571]  ? drain_stock+0x6d/0xb0
[   76.037577]  ? refill_stock+0xa2/0xb0
[   76.037583]  ? __mod_memcg_lruvec_state+0x63/0xe0
[   76.037589]  ? __alloc_pages+0x17b/0x320
[   76.037594]  ? __mod_lruvec_state+0x37/0x50
[   76.037600]  ? __mod_lruvec_page_state+0x6b/0xb0
[   76.037604]  ? lru_cache_add_inactive_or_unevictable+0x2e/0xe0
[   76.037611]  ? page_add_new_anon_rmap+0x69/0x100
[   76.037617]  ? set_pte+0x9/0x20
[   76.037623]  ? __handle_mm_fault+0x1314/0x15b0
[   76.037629]  ? __fget_files+0x86/0xc0
[   76.037635]  fusion_control_ioctl_internal+0x26/0x40 [iomemory_vsl]
[   76.037668]  __x64_sys_ioctl+0x95/0xd0
[   76.037674]  do_syscall_64+0x5c/0xc0
[   76.037680]  ? exit_to_user_mode_prepare+0x37/0x1b0
[   76.037686]  ? irqentry_exit_to_user_mode+0x9/0x20
[   76.037692]  ? irqentry_exit+0x1d/0x30
[   76.037698]  ? exc_page_fault+0x89/0x170
[   76.037703]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   76.037711] RIP: 0033:0x7f24eb8175f7
[   76.037717] Code: 00 00 00 48 8b 05 99 c8 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 69 c8 0d 00 f7 d8 64 89 01 48
[   76.037724] RSP: 002b:00007f24eb71dd88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   76.037731] RAX: ffffffffffffffda RBX: 00005604b5a020d0 RCX: 00007f24eb8175f7
[   76.037735] RDX: 00007f24eb71dd90 RSI: ffffffffcf786802 RDI: 0000000000000003
[   76.037739] RBP: 0000000000000000 R08: 00007f24e4002430 R09: 00007f24e4000080
[   76.037743] R10: 00007f24eb71e8f3 R11: 0000000000000246 R12: 00005604b5a01868
[   76.037747] R13: 00005604b5a016d0 R14: 0000000000000000 R15: 0000000000000000
[   76.037752]  </TASK>
[   76.037755] Modules linked in: ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter nf_tables bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi aesni_intel crypto_simd radeon cryptd rapl intel_cstate snd_hda_codec input_leds drm_ttm_helper pcspkr ttm snd_hda_core snd_hwdep drm_kms_helper cec snd_pcm rc_core i2c_algo_bit snd_timer fb_sys_fops syscopyarea snd sysfillrect soundcore sysimgblt iomemory_vsl(O) ioatdma mac_hid dca zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost vhost_iotlb tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi drm scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic xor zstd_compress raid6_pq simplefb dm_thin_pool
[   76.037839]  dm_persistent_data dm_bio_prison dm_bufio libcrc32c hid_generic usbkbd usbhid hid gpio_ich crc32_pclmul r8169 ahci i2c_i801 realtek libahci lpc_ich i2c_smbus ehci_pci ehci_hcd
[   76.037884] ---[ end trace 0767685c984612c7 ]---
[   76.142582] RIP: 0010:usercopy_abort+0x78/0x7a
[   76.142593] Code: e0 e1 9b 51 48 0f 45 d6 49 c7 c3 90 cc de 9b 4c 89 d1 57 48 c7 c6 83 43 dd 9b 48 c7 c7 30 cc de 9b 49 0f 45 f3 e8 ca 86 fe ff <0f> 0b 4c 89 e1 49 89 d8 44 89 ea 31 f6 48 29 c1 48 c7 c7 d2 cc de
[   76.142602] RSP: 0018:ffffb1bb8372ba70 EFLAGS: 00010246
[   76.142607] RAX: 0000000000000071 RBX: 0000000000000000 RCX: 0000000000000000
[   76.142612] RDX: 0000000000000000 RSI: ffff8b4677c20580 RDI: ffff8b4677c20580
[   76.142616] RBP: ffffb1bb8372ba88 R08: 0000000000000003 R09: 0000000000000001
[   76.142620] R10: 000000000000000a R11: 79706f6372657375 R12: 0000000000000f78
[   76.142624] R13: ffff8b454402b000 R14: 0000000000000000 R15: 0000000000000f78
[   76.142629] FS:  00007f24eb71f700(0000) GS:ffff8b4677c00000(0000) knlGS:0000000000000000
[   76.142634] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   76.142639] CR2: 00007f24eb71dd90 CR3: 000000010439e002 CR4: 00000000000606f0

The erase then seems to either take a LONG time, or not make any progress (it's still at 0% after about an hour).

Possible solution

No idea

Environment information

root@flashittysan:~# uname -a
Linux flashittysan 5.15.83-1-pve #1 SMP PVE 5.15.83-1 (2022-12-15T00:00Z) x86_64 GNU/Linux
root@flashittysan:~# gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

root@flashittysan:~# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
root@flashittysan:~# fio-status

Found 2 ioMemory devices in this system
Driver version: 3.2.16 build 1731

Adapter: Single Controller Adapter
        Fusion-io ioDrive2 1.205TB, Product Number:F00-001-1T20-CS-0001, SN:1213D1984, FIO SN:1213D1984
        External Power: NOT connected
        PCIe Power limit threshold: 24.75W
        Connected ioMemory modules:
          fct0: Product Number:F00-001-1T20-CS-0001, SN:1213D1984

fct0    Attached
        ioDrive2 Adapter Controller, Product Number:F00-001-1T20-CS-0001, SN:1213D1984
        Located in slot 0 Center of ioDrive2 Adapter Controller SN:1213D1984
        PCI:01:00.0, Slot Number:1
        Firmware v7.1.17, rev 116786 Public
        1205.00 GBytes device size
        Internal temperature: 35.93 degC, max 37.40 degC
        Reserve space status: Healthy; Reserves: 100.00%, warn at 10.00%
        Contained VSUs:
          fioa: ID:0, UUID:5cc51c17-ad5e-4501-ba0a-77664c7ccdc6

fioa    State: Online, Type: block device
        ID:0, UUID:5cc51c17-ad5e-4501-ba0a-77664c7ccdc6
        1205.00 GBytes device size

Adapter: Single Controller Adapter
        Fusion-io ioDrive2 1.205TB, Product Number:F00-001-1T20-CS-0001, SN:1213D1652, FIO SN:1213D1652
        External Power: NOT connected
        PCIe Power limit threshold: 24.75W
        Connected ioMemory modules:
          fct1: Product Number:F00-001-1T20-CS-0001, SN:1213D1652

fct1    Detached
        ioDrive2 Adapter Controller, Product Number:F00-001-1T20-CS-0001, SN:1213D1652
        Located in slot 0 Center of ioDrive2 Adapter Controller SN:1213D1652
        PCI:04:00.0, Slot Number:7
        Firmware v7.1.17, rev 116786 Public
        Geometry and capacity information not available.
        Internal temperature: 34.94 degC, max 37.40 degC
root@flashittysan:~# lspci -b -nn
00:00.0 Host bridge [0600]: Intel Corporation Xeon E5/Core i7 DMI2 [8086:3c00] (rev 07)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 1a [8086:3c02] (rev 07)
00:02.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2a [8086:3c04] (rev 07)
00:02.2 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2c [8086:3c06] (rev 07)
00:03.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3a in PCI Express Mode [8086:3c08] (rev 07)
00:03.2 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3c [8086:3c0a] (rev 07)
00:04.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 0 [8086:3c20] (rev 07)
00:04.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 1 [8086:3c21] (rev 07)
00:04.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 2 [8086:3c22] (rev 07)
00:04.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 3 [8086:3c23] (rev 07)
00:04.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 4 [8086:3c24] (rev 07)
00:04.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 5 [8086:3c25] (rev 07)
00:04.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 6 [8086:3c26] (rev 07)
00:04.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DMA Channel 7 [8086:3c27] (rev 07)
00:05.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Address Map, VTd_Misc, System Management [8086:3c28] (rev 07)
00:05.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Control Status and Global Errors [8086:3c2a] (rev 07)
00:05.4 PIC [0800]: Intel Corporation Xeon E5/Core i7 I/O APIC [8086:3c2c] (rev 07)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 05)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b5)
00:1c.4 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 [8086:1c18] (rev b5)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 05)
00:1f.0 ISA bridge [0601]: Intel Corporation H61 Express Chipset LPC Controller [8086:1c5c] (rev 05)
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family 6 port Desktop SATA AHCI Controller [8086:1c02] (rev 05)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 05)
01:00.0 Mass storage controller [0180]: SanDisk ioDrive2 [1aed:2001] (rev 04)
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RV610 [Radeon HD 2400 PRO/XT] [1002:94c1]
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] RV610 HDMI Audio [Radeon HD 2350 PRO / 2400 PRO/XT / HD 3410] [1002:aa10]
04:00.0 Mass storage controller [0180]: SanDisk ioDrive2 [1aed:2001] (rev 04)
07:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL810xE PCI Express Fast Ethernet controller [10ec:8136] (rev 05)
ff:08.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 QPI Link 0 [8086:3c80] (rev 07)
ff:08.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 QPI Link Reut 0 [8086:3c83] (rev 07)
ff:08.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 QPI Link Reut 0 [8086:3c84] (rev 07)
ff:09.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 QPI Link 1 [8086:3c90] (rev 07)
ff:09.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 QPI Link Reut 1 [8086:3c93] (rev 07)
ff:09.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 QPI Link Reut 1 [8086:3c94] (rev 07)
ff:0a.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Power Control Unit 0 [8086:3cc0] (rev 07)
ff:0a.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Power Control Unit 1 [8086:3cc1] (rev 07)
ff:0a.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Power Control Unit 2 [8086:3cc2] (rev 07)
ff:0a.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Power Control Unit 3 [8086:3cd0] (rev 07)
ff:0b.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Interrupt Control Registers [8086:3ce0] (rev 07)
ff:0b.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Semaphore and Scratchpad Configuration Registers [8086:3ce3] (rev 07)
ff:0c.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Unicast Register 0 [8086:3ce8] (rev 07)
ff:0c.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Unicast Register 0 [8086:3ce8] (rev 07)
ff:0c.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Unicast Register 0 [8086:3ce8] (rev 07)
ff:0c.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller System Address Decoder 0 [8086:3cf4] (rev 07)
ff:0c.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 System Address Decoder [8086:3cf6] (rev 07)
ff:0d.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Unicast Register 0 [8086:3ce8] (rev 07)
ff:0d.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Unicast Register 0 [8086:3ce8] (rev 07)
ff:0d.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Unicast Register 0 [8086:3ce8] (rev 07)
ff:0d.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller System Address Decoder 1 [8086:3cf5] (rev 07)
ff:0e.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Processor Home Agent [8086:3ca0] (rev 07)
ff:0e.1 Performance counters [1101]: Intel Corporation Xeon E5/Core i7 Processor Home Agent Performance Monitoring [8086:3c46] (rev 07)
ff:0f.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Registers [8086:3ca8] (rev 07)
ff:0f.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller RAS Registers [8086:3c71] (rev 07)
ff:0f.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 0 [8086:3caa] (rev 07)
ff:0f.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 1 [8086:3cab] (rev 07)
ff:0f.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 2 [8086:3cac] (rev 07)
ff:0f.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 3 [8086:3cad] (rev 07)
ff:0f.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Target Address Decoder 4 [8086:3cae] (rev 07)
ff:10.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 0 [8086:3cb0] (rev 07)
ff:10.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 1 [8086:3cb1] (rev 07)
ff:10.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 0 [8086:3cb2] (rev 07)
ff:10.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 1 [8086:3cb3] (rev 07)
ff:10.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 2 [8086:3cb4] (rev 07)
ff:10.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller Channel 0-3 Thermal Control 3 [8086:3cb5] (rev 07)
ff:10.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 2 [8086:3cb6] (rev 07)
ff:10.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Integrated Memory Controller ERROR Registers 3 [8086:3cb7] (rev 07)
ff:11.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 DDRIO [8086:3cb8] (rev 07)
ff:13.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 R2PCIe [8086:3ce4] (rev 07)
ff:13.1 Performance counters [1101]: Intel Corporation Xeon E5/Core i7 Ring to PCI Express Performance Monitor [8086:3c43] (rev 07)
ff:13.4 Performance counters [1101]: Intel Corporation Xeon E5/Core i7 QuickPath Interconnect Agent Ring Registers [8086:3ce6] (rev 07)
ff:13.5 Performance counters [1101]: Intel Corporation Xeon E5/Core i7 Ring to QuickPath Interconnect Link 0 Performance Monitor [8086:3c44] (rev 07)
ff:13.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7 Ring to QuickPath Interconnect Link 1 Performance Monitor [8086:3c45] (rev 07)

Please excuse the host name, I'm trying to redneck a flash-SAN out of (mostly) used bitcoin miner parts (it's cheap!). The name amuses me. FWIW, it's a cheap way to get 5 PCIe 8x slots, particularly when it's $43 with the CPU on ebay.

Yes, there are 2 ioDrives in this system. I'm dealing with /dev/fct1 here.

The host OS is actually Proxmox 7.3.

bplein commented 1 year ago

This is due to the userspace tools from Fusion-io (fio-sure-erase, fio-format, etc.) which don't know about modern kernels. Since the userspace tools are not open/visible source, we have no way of updating them.

The workaround (and I admit it's a pain) is to boot to a lower kernel, run the tools, then boot back into your OS.

This is where the whole concept of keeping these cards alive is going to hit a brick wall.

My suggestion for folks who are just trying to run some sort of rig to do X, is go back to an older kernel. Unless you need the latest kernel for your project, why go to it? 5.15 is actually EOL before 5.10 as 5.10 is considered more of a long term release.

fake-name commented 1 year ago

Ah, good to know.

This is kind of a silly project, so I'll just switch to Debian 10 for the OS.

> My suggestion for folks who are just trying to run some sort of rig to do X, is go back to an older kernel. Unless you need the latest kernel for your project, why go to it? 5.15 is actually EOL before 5.10 as 5.10 is considered more of a long term release.

Apparently the Proxmox people are shipping 5.15 on top of Debian 11, which caught me out a bit. I believe Debian 11 is supposed to be based off 5.10, but I didn't really consider the platform as much as I probably should have.

sakurachan00 commented 1 year ago

> Apparently the Proxmox people are shipping 5.15 on top of Debian 11, which caught me out a bit. I believe Debian 11 is supposed to be based off 5.10, but I didn't really consider the platform as much as I probably should have.

And here I am on PVE kernel 6.1, because my hardware is EPYC and the performance is there for EPYC. I don't have my fio drive in it, however; just adding my 2 cents. It's an opt-in kernel, so you might want to poke the bear if you really want to see whether your memory issue goes away.

bplein commented 1 year ago

The userspace tools are doing things the newer kernels don't like. I doubt that the kernel devs are going to go backwards in security. The expectation is that apps get rewritten. But the userspace tools are not ours to rewrite.

bdurrow commented 10 months ago

FYI, I got the same "usercopy: Kernel memory overwrite attempt detected to SLUB object 'fusion_user_ll_request'" error on Debian "Bullseye" with kernel 5.10.0-26-amd64. I'm wondering how far back I need to go to get a working fio-sure-erase. I had previously built an image using "Bookworm", which has a 6.x kernel; after finding this issue I tried a 5.10 kernel, but still no dice.
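To confirm that a failure is this same rejection and not something else, the kernel log line can be picked apart. The following is only a sketch: `parse_usercopy` is a hypothetical helper name, and the pattern is written against the exact message wording quoted in this issue, so other kernels may word it differently.

```shell
#!/bin/sh
# parse_usercopy (hypothetical helper): read kernel log lines on stdin and
# print the slab cache name, offset and size for every hardened-usercopy
# "overwrite attempt" line. Lines that do not match are dropped.
parse_usercopy() {
    sed -n "s/.*usercopy: Kernel memory overwrite attempt detected to SLUB object '\([^']*\)' (offset \([0-9]*\), size \([0-9]*\))!.*/cache=\1 offset=\2 size=\3/p"
}

# Example using the line reported at the top of this issue.
printf '%s\n' "[   76.037241] usercopy: Kernel memory overwrite attempt detected to SLUB object 'fusion_user_ll_request' (offset 0, size 3960)!" | parse_usercopy
```

On a live system the same function can be fed from the kernel log, for example `dmesg | parse_usercopy` (reading the log may require root).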

snuf commented 10 months ago

@bdurrow if you want the "safest" option use CentOS 7 combined with the more or less vanilla driver source branch. For vsl that would be fio-3.2.16.1732 and for vsl4 fio-original-4.3.7.1205.

It's not just a matter of kernel version: it depends on whether the CONFIG_HARDENED_USERCOPY option was set when the kernel was compiled. This seems to vary per distro and version. Kernels that predate the option, and distribution kernels built with it turned off, work.
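A quick way to check a given kernel before trying the tools is to look at its build config. This is a minimal sketch: `hardened_usercopy_state` is a hypothetical helper name, and the `/boot/config-$(uname -r)` path is an assumption that holds on Debian-style systems; some kernels expose the config as `/proc/config.gz` instead, or not at all.

```shell
#!/bin/sh
# hardened_usercopy_state (hypothetical helper): report whether a kernel
# config file enables CONFIG_HARDENED_USERCOPY.
# Prints "enabled", "disabled", or "unknown" (file missing or unreadable).
hardened_usercopy_state() {
    config_file="$1"
    if [ ! -r "$config_file" ]; then
        echo "unknown"
        return 0
    fi
    # Enabled options appear as "CONFIG_FOO=y"; disabled ones appear as
    # "# CONFIG_FOO is not set" or are absent from the file altogether.
    if grep -q '^CONFIG_HARDENED_USERCOPY=y$' "$config_file"; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Check the running kernel (path is an assumption, see above).
hardened_usercopy_state "/boot/config-$(uname -r)"
```

If this prints "enabled", the closed-source tools are likely to trip the same check reported in this issue; "unknown" only means the config file was not found at that path.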

bdurrow commented 10 months ago

@snuf, thank you. You probably just saved me another day of frustration, maybe two. My goal is to roll a bootable ISO so that I can use the tools. Do you know of one that already exists? If not, I'll see about hosting my work so others can use it in the future. Would you like to link to it if I do that?

snuf commented 10 months ago

@bdurrow we were kicking around the idea of creating one for a while, but it's not manifested itself yet. Probably due to life, the universe and everything. I have most of the bits and pieces to go into it I guess, but keep overthinking it probably.

If you have something and a link, I think that would help some folks :)