agrajag9 closed this issue 7 months ago.
This may be a hardware error, as setting tag depth to 1 also seems to make the problem go away.
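For anyone wanting to try that on FreeBSD, camcontrol(8) can set the tag depth per device at runtime; a minimal sketch, where the device name da0 is an assumption and not taken from this report:

camcontrol tags da0 -N 1    # limit the device to a single outstanding tagged command
camcontrol tags da0 -v      # verify the current openings/tag settings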
That panic seems to be in the SAS controller driver, given the mpr_* functions in the backtrace, so I would suggest you file it against FreeBSD.
Already did that. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270089
Great, then I'd suggest you close this, because it's not a bug in OpenZFS, and it's not something changes in OpenZFS are going to fix.
No, please read the report. I stated this exists in Linux as well as FreeBSD, and so it is not likely to be the FreeBSD mpr driver.
And what does the crash on Linux look like?
dmesg: https://gist.github.com/agrajag9/486beef3f993acd975adc1ca0c17ddd9
It imported and exported just fine on the first try; a second import then triggers the error. Linux does not fall all the way over like FreeBSD does, eventually recovering and continuing after about 30 seconds.
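In shorthand, the reproduction is just the following (pool name ptmc, as in the full session below); the second import is what trips the warning:

zpool import ptmc
zpool export ptmc
zpool import ptmc    # mpt3sas warning fires around here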
root@debian:~# zpool status
no pools available
root@debian:~# zpool import
pool: ptmc
id: 8660854589035025673
state: ONLINE
status: Some supported features are not enabled on the pool.
(Note that they may be intentionally disabled if the
'compatibility' property is set.)
action: The pool can be imported using its name or numeric identifier, though
some features will not be available without an explicit 'zpool upgrade'.
config:
        ptmc        ONLINE
          mirror-0  ONLINE
            sda     ONLINE
            sdb     ONLINE
root@debian:~# zpool import ptmc
root@debian:~# zpool status
pool: ptmc
state: ONLINE
status: Some supported and requested features are not enabled on the pool.
The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
scan: resilvered 3.19G in 04:14:11 with 0 errors on Sat Sep 16 18:48:24 2023
config:
        NAME        STATE     READ WRITE CKSUM
        ptmc        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
errors: No known data errors
root@debian:~# zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
ptmc   200G  94.9G   105G        -       30G    49%    47%  1.00x    ONLINE  -
root@debian:~# zpool export ptmc
root@debian:~# zpool import ptmc
[ 57.159696] ------------[ cut here ]------------
[ 57.164311] WARNING: CPU: 12 PID: 1794 at drivers/scsi/mpt3sas/mpt3sas_base.]
...
[ 57.505333] el0t_64_sync+0x18c/0x190
[ 57.508984] ---[ end trace 0000000000000000 ]---
[ 87.826353] sd 0:0:0:0: attempting task abort!scmd(0x00000000a6532941), outstanding for 30668 ms & timeout 30000 ms
[ 87.836803] sd 0:0:0:0: [sdb] tag#6360 CDB: Read(10) 28 00 04 00 09 10 00 00 02 00
Right, so something is not right in either this single disk or in the controller, and the Linux and FreeBSD drivers are responding to that error differently. It's probably fair to say a panic is not ideal here, but regardless, it's vanishingly unlikely that OpenZFS is at fault: we don't generate raw SCSI commands; rather, we ask the OS block layer to make requests to the disk on our behalf, and the code that does that on Linux and FreeBSD is very different and wholly unrelated.
To me, that still looks like a hardware issue: something is confusing your controller into logging an impossible case, which FreeBSD turns into a panic.
I'm not saying OpenZFS is flawless, but this really looks like the hardware is the problem, not the software.
I suspect you could trigger a similar problem under load with, say, an mdadm array.
ZFS doesn't do anything special or evil to the underlying disk controllers; it uses the same interfaces as any other filesystem on the OS, so if the controller is barfing, you can probably reproduce it without ZFS, too.
Except that I can only reproduce this in the ZFS context. EXT4 on the same setup works fine.
So ZFS is doing something extra.
root@debian:/mnt# df -h -T /mnt
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda2 ext4 32G 8.0K 30G 1% /mnt
root@debian:/mnt# fio --name=TEST --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=8 --size=2g --iodepth=64 --runtime=60 --time_based --end_fsync=1 2>&1 | tee /tmp/fio.log
https://gist.github.com/agrajag9/122f556fc1d3cb6d35046e269825fe0e
I suggested mdadm specifically to replicate hitting multiple disks at once, because the crash message looks like the controller getting confused about its queue: messages it doesn't expect showing up there, or vice versa.
Well then. It's reproducible with mdadm mirrors also.
root@debian:~# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb1
mdadm: /dev/sda2 appears to contain an ext2fs file system
size=33554432K mtime=Sun Apr 7 10:28:58 2024
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
[27694.045483] md/raid1:md1: not clean -- starting background reconstruction
[27694.052294] md/raid1:md1: active with 2 out of 2 mirrors
[27694.057658] md1: detected capacity change from 0 to 67041280
mdadm: array /dev/md1 started.
[27694.065398] md: resync of RAID array md1
root@debian:~# mkfs.ext4 /dev/md1
mke2fs 1.47.0 (5-Feb-2023)
Discarding device blocks: done
Creating filesystem with 8380160 4k blocks and 2097152 inodes
Filesystem UUID: 05a1c291-f521-4d82-95fb-a6d586d50de2
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
root@debian:~# mount /dev/md1 /mnt
root@debian:/mnt# fio --name=TEST --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=8 --size=2g --iodepth=64 --runtime=60 --time_based --end_fsync=1
TEST: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=64
fio-3.33
Starting 8 processes
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
TEST: Laying out IO file (1 file / 2048MiB)
[27735.671557] ------------[ cut here ]------------
[27735.676182] WARNING: CPU: 15 PID: 0 at drivers/scsi/mpt3sas/mpt3sas_base.c:1]
[27735.686804] Modules linked in: raid1 md_mod binfmt_misc nls_ascii nls_cp437 n
[27735.752722] CPU: 15 PID: 0 Comm: swapper/15 Tainted: P W OE 6.11
[27735.762452] Hardware name: SolidRun Ltd. SolidRun CEX7 Platform, BIOS EDK II3
[27735.770530] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[27735.777480] pc : _get_st_from_smid+0x54/0x70 [mpt3sas]
[27735.782620] lr : _base_get_cb_idx+0x94/0xc0 [mpt3sas]
[27735.787668] sp : ffff80000807be70
[27735.790971] x29: ffff80000807be70 x28: 0000000000000001 x27: 0000000000000000
[27735.798097] x26: 0000000000000006 x25: ffff31bdd2554850 x24: ffffd0f85c221bc8
[27735.805223] x23: 000000000000000f x22: 0000000000000141 x21: ffffffffffffffff
[27735.812348] x20: ffff31bbed4df600 x19: ffff319c388c5408 x18: 0000000000000000
[27735.819474] x17: ffff60c679150000 x16: ffffd0f8b4419460 x15: 0000000000000000
[27735.826599] x14: 0000000000000002 x13: 0000000000000000 x12: 0000000000001176
[27735.833725] x11: 0000000000000040 x10: ffffd0f8b5b5fc88 x9 : ffffd0f85c1df474
[27735.840851] x8 : ffff319c38955c00 x7 : 0000000001fc2488 x6 : 000000a1a83d9644
[27735.847977] x5 : 00ffffffffffffff x4 : 0000000000001b94 x3 : 0000000000001b91
[27735.855102] x2 : ffff31bdd2554850 x1 : 0000000000000000 x0 : ffff31bdd2554850
[27735.862227] Call trace:
[27735.864662] _get_st_from_smid+0x54/0x70 [mpt3sas]
[27735.869455] _base_process_reply_queue+0xb4/0x1084 [mpt3sas]
[27735.875111] _base_interrupt+0x3c/0x50 [mpt3sas]
[27735.879725] __handle_irq_event_percpu+0x60/0x1e0
[27735.884423] handle_irq_event+0x50/0x120
[27735.888334] handle_fasteoi_irq+0xc4/0x210
[27735.892425] generic_handle_domain_irq+0x34/0x50
[27735.897034] gic_handle_irq+0x58/0x13c
[27735.900773] call_on_irq_stack+0x24/0x50
[27735.904686] do_interrupt_handler+0x88/0x90
[27735.908859] el1_interrupt+0x34/0x54
[27735.912427] el1h_64_irq_handler+0x18/0x2c
[27735.916512] el1h_64_irq+0x64/0x68
[27735.919902] cpuidle_enter_state+0xbc/0x440
[27735.924077] cpuidle_enter+0x40/0x60
[27735.927643] do_idle+0x234/0x2c0
[27735.930860] cpu_startup_entry+0x40/0x4c
[27735.934772] secondary_start_kernel+0x130/0x154
[27735.939293] __secondary_switched+0xb0/0xb4
[27735.943467] ---[ end trace 0000000000000000 ]---
...
System information
Type | Version/Name
FreeBSD | 15.0-CURRENT
Kernel Version | main-n268827-75464941dc17
Architecture | arm64
OpenZFS Version | zfs-2.2.99-365-FreeBSD_g8f2f6cd2a, zfs-kmod-2.2.99-365-FreeBSD_g8f2f6cd2a
Describe the problem you're observing
When attempting to import zpools from some Samsung SSDs attached to an Avago 9400-8i on a 16xA72-based platform, the system panics.
When the drives are attached directly to the SATA bus of the mainboard everything works fine.
The same drives and HBA work fine on multiple amd64 systems.
This is reproducible on both Debian and FreeBSD.
This is reproducible with multiple HBAs: 9211-8i, 9300-8i, and 9400-8i.
The panic goes away when setting
dev.mpr.0.debug_level=0x7ff
in /boot/loader.conf.
Sometimes the panic happens mid-boot shortly after the mpr module inits, but sometimes the system makes it to the full shell just fine.
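As a concrete sketch, that workaround as a loader.conf entry (loader.conf values are quoted), with the equivalent runtime sysctl for a system that is already up:

# /boot/loader.conf
dev.mpr.0.debug_level="0x7ff"

# or at runtime
sysctl dev.mpr.0.debug_level=0x7ff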
Describe how to reproduce the problem
Use weird hardware: https://bsd-hardware.info/?probe=7c5ed3c2fe