openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Use-after-free read at kfence? (spa_read_history/arc_read) #15068

Status: Open. dioni21 opened 1 year ago

dioni21 commented 1 year ago

System information

Distribution Name: Fedora
Distribution Version: fc38
Kernel Version: 6.3.12-200.fc38.x86_64
Architecture: x86_64
OpenZFS Version: 2.1.99-1 (master branch, commit ca960ce56ce1bfe207e4d80ba6e5ab67ea41b32f, AFAIR)

Describe the problem you're observing

dmesg reports use-after-free errors, some processes freeze, and the system becomes unstable.

Describe how to reproduce the problem

Don't know. I just use this host daily. It's my desktop/home server.

Here's the last dmesg:

[Fri Jul 14 09:22:52 2023] BUG: KFENCE: use-after-free read in spa_read_history_add+0xea/0x200 [zfs]

[Fri Jul 14 09:22:52 2023] Use-after-free read at 0x00000000da8c07ab (in kfence-#222):
[Fri Jul 14 09:22:52 2023]  spa_read_history_add+0xea/0x200 [zfs]
[Fri Jul 14 09:22:52 2023]  arc_read+0xbe9/0x16d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_issue_final_prefetch+0xcc/0x120 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch_indirect_done+0x251/0x270 [zfs]
[Fri Jul 14 09:22:52 2023]  arc_read+0x10a7/0x16d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch_impl+0x589/0x830 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch+0x13/0x20 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_prefetch+0x1c9/0x210 [zfs]
[Fri Jul 14 09:22:52 2023]  zap_prefetch_uint64+0xd6/0x1b0 [zfs]
[Fri Jul 14 09:22:52 2023]  ddt_prefetch+0xb3/0xf0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_dirty+0x300/0x9d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_write_uio_dnode+0x9e/0x190 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_write_uio_dbuf+0x4e/0x70 [zfs]
[Fri Jul 14 09:22:52 2023]  zfs_write+0x4ea/0xc70 [zfs]
[Fri Jul 14 09:22:52 2023]  zpl_iter_write+0x113/0x190 [zfs]
[Fri Jul 14 09:22:52 2023]  vfs_write+0x236/0x3f0
[Fri Jul 14 09:22:52 2023]  ksys_write+0x6f/0xf0
[Fri Jul 14 09:22:52 2023]  do_syscall_64+0x5d/0x90
[Fri Jul 14 09:22:52 2023]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

[Fri Jul 14 09:22:52 2023] kfence-#222: 0x00000000061205b6-0x0000000000141f18, size=96, cache=kmalloc-96

[Fri Jul 14 09:22:52 2023] allocated by task 1191 on cpu 2 at 117217.504428s:
[Fri Jul 14 09:22:52 2023]  spl_kmem_zalloc+0x10e/0x120 [spl]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch_impl+0x459/0x830 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch+0x13/0x20 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_prefetch+0x1c9/0x210 [zfs]
[Fri Jul 14 09:22:52 2023]  zap_prefetch_uint64+0xd6/0x1b0 [zfs]
[Fri Jul 14 09:22:52 2023]  ddt_prefetch+0xb3/0xf0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_dirty+0x300/0x9d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_write_uio_dnode+0x9e/0x190 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_write_uio_dbuf+0x4e/0x70 [zfs]
[Fri Jul 14 09:22:52 2023]  zfs_write+0x4ea/0xc70 [zfs]
[Fri Jul 14 09:22:52 2023]  zpl_iter_write+0x113/0x190 [zfs]
[Fri Jul 14 09:22:52 2023]  vfs_write+0x236/0x3f0
[Fri Jul 14 09:22:52 2023]  ksys_write+0x6f/0xf0
[Fri Jul 14 09:22:52 2023]  do_syscall_64+0x5d/0x90
[Fri Jul 14 09:22:52 2023]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

[Fri Jul 14 09:22:52 2023] freed by task 1191 on cpu 2 at 117217.504476s:
[Fri Jul 14 09:22:52 2023]  arc_read+0x10a7/0x16d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_issue_final_prefetch+0xcc/0x120 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch_indirect_done+0x251/0x270 [zfs]
[Fri Jul 14 09:22:52 2023]  arc_read+0x10a7/0x16d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch_impl+0x589/0x830 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_prefetch+0x13/0x20 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_prefetch+0x1c9/0x210 [zfs]
[Fri Jul 14 09:22:52 2023]  zap_prefetch_uint64+0xd6/0x1b0 [zfs]
[Fri Jul 14 09:22:52 2023]  ddt_prefetch+0xb3/0xf0 [zfs]
[Fri Jul 14 09:22:52 2023]  dbuf_dirty+0x300/0x9d0 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_write_uio_dnode+0x9e/0x190 [zfs]
[Fri Jul 14 09:22:52 2023]  dmu_write_uio_dbuf+0x4e/0x70 [zfs]
[Fri Jul 14 09:22:52 2023]  zfs_write+0x4ea/0xc70 [zfs]
[Fri Jul 14 09:22:52 2023]  zpl_iter_write+0x113/0x190 [zfs]
[Fri Jul 14 09:22:52 2023]  vfs_write+0x236/0x3f0
[Fri Jul 14 09:22:52 2023]  ksys_write+0x6f/0xf0
[Fri Jul 14 09:22:52 2023]  do_syscall_64+0x5d/0x90
[Fri Jul 14 09:22:52 2023]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

[Fri Jul 14 09:22:52 2023] CPU: 2 PID: 1191 Comm: mozStorage #1 Tainted: P    B D W  OE      6.3.12-200.fc38.x86_64 #1
[Fri Jul 14 09:22:52 2023] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970A-D3P, BIOS FD 02/26/2016
[Fri Jul 14 09:22:52 2023] ==================================================================
dioni21 commented 1 year ago

Forgot to add compilation parameters:

./configure --enable-silent-rules --enable-dependency-tracking --config-cache --enable-linux-builtin --disable-nls --with-config=all --enable-asan --enable-ubsan --enable-debuginfo --enable-debug --enable-debug-kmem --enable-debug-kmem-tracking CFLAGS=-Wno-stringop-overflow

Also, temporarily:

sed -i '/ DEBUG_CFLAGS="-Werror"/s/^/#/' config/zfs-build.m4

I have not been able to compile with -Werror since a few recent Fedora upgrades, and have not yet had time to fix the warnings and send PRs. Sorry... :-(

ThalesBarretto commented 3 months ago

I would also add to this bug report that I am facing the same issue on Arch.

It happened with both the LTS and rolling-release kernels; this report covers only the LTS one.

Here is the dmesg error:

[ 2758.058252] ==================================================================
[ 2758.058271] BUG: KFENCE: use-after-free read in spa_read_history_add+0xe8/0x200 [zfs]
[ 2758.058467] Use-after-free read at 0x0000000060207b0d (in kfence-#160):
[ 2758.061284] CPU: 2 PID: 6577 Comm: .NET TP Worker Tainted: P           OE      6.6.32-1-lts #1 ee405f31cc2370c66e95bb51982e71a894d4c0fd
[ 2758.061301] Hardware name: ASUS All Series/VANGUARD B85, BIOS 2202 04/01/2015
[ 2758.061310] ==================================================================

zfs version

zfs-2.2.4-1
zfs-kmod-2.2.4-

kernel:

Linux pc 6.6.32-1-lts #1 SMP PREEMPT_DYNAMIC Sat, 25 May 2024 20:20:51 +0000 x86_64 GNU/Linux
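For anyone else trying to hit this: KFENCE is sampling-based, so whether (and how often) it catches the bug depends on the kernel configuration. A quick sanity check, assuming the paths documented in the kernel's dev-tools/kfence docs (sysfs module parameter and debugfs stats; availability varies by kernel config, so verify on your system):

```shell
# Probe for KFENCE availability; paths per Documentation/dev-tools/kfence.rst.
# A sample_interval of 0 means KFENCE is disabled.
for f in /sys/module/kfence/parameters/sample_interval \
         /sys/kernel/debug/kfence/stats; do
    echo "== $f"
    [ -r "$f" ] && cat "$f" || echo "not readable (check CONFIG_KFENCE / debugfs)"
done
```

Lowering sample_interval should make detections like the one above more frequent, at some performance cost.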

I can help, but would like to see some interest from upstream in debugging and fixing it.

@dioni21 could you update on this issue please?