olidal closed this issue 1 year ago.
I have been investigating this issue a bit and came up with a quick and maybe not-so-dirty fix.
First, looking at the code, I was not able to track the code path that produces the trace above, i.e. I didn't find how and why metaslab_alloc_dva ended up calling range_tree_remove. I am not familiar with ZFS module debugging, so I probably missed something, e.g. maybe a context switch?
Anyway, looking more closely at the error, it happens when an ASSERT detects an attempt to free a zero-sized segment. The question is then: how harmful is this error, and does it warrant an ASSERT that blocks the whole system?
My impression is that the caller (which I was not able to identify) skipped a zero-size test and didn't prevent the useless call. So I believe it is harmless to just return without blocking everything.
As a temporary mitigation, I tested the following fix, which replaces the original ASSERT with a call to zfs_panic_recover, and activated the zfs_recover option, so that ZFS operations are not fully interrupted as they are with the ASSERT.
/*
 * Turn this into a recoverable error.
 * Requesting a null-size free is certainly wrong, but there is a chance
 * it only happens because a test that would have prevented this call
 * was skipped somewhere.
 */
/* VERIFY3U(size, !=, 0); */
if (size == 0) {
	zfs_panic_recover("zfs: freeing 0-sized segment "
	    "(offset=%llu size=%llu)",
	    (longlong_t)start, (longlong_t)size);
	return;
}
As expected, when the situation occurs, I now get the following warning messages instead of the ASSERT-induced PANIC:
[14135.170214] WARNING: zfs: freeing 0-sized segment (offset=17179906048 size=0)
[14135.170231] WARNING: zfs: freeing 0-sized segment (offset=68719722496 size=0)
I think this fix may be acceptable because the severity of the error does not seem to deserve as strong a response as a PANIC. It still allows tracking the issue, since it continues to produce a trace in the logs.
This fix is still in operation (and will be as long as no better one is found); do not hesitate to request action for investigating further.
Seeing the same or similar panic on ZoL 0.8.3 on the CentOS 8 linux 4.18.0-147.5.1.el8_1.x86_64 kernel. It happens during a zfs receive of a raw encrypted incremental snapshot, and it locks up all zfs receive processes in a D state, with task z_wr_iss:5051 blocked for more than 120 seconds.
[6197068.751558] VERIFY3(size != 0) failed (0 != 0)
[6197068.751888] PANIC at range_tree.c:368:range_tree_remove_impl()
[6197068.752488] Showing stack for process 5053
[6197068.753098] CPU: 5 PID: 5053 Comm: z_wr_iss Kdump: loaded Tainted: P OE --------- - - 4.18.0-147.5.1.el8_1.x86_64 #1
[6197068.753709] Hardware name: Hewlett-Packard HP Z230 Tower Workstation/1905, BIOS L51 v01.52 07/20/2015
[6197068.754294] Call Trace:
[6197068.754899] dump_stack+0x5c/0x80
[6197068.755494] spl_panic+0xd3/0xfb [spl]
[6197068.756084] ? apic_timer_interrupt+0xa/0x20
[6197068.756727] ? range_tree_seg_compare+0x9/0x30 [zfs]
[6197068.757318] ? secpolicy_basic_link+0x10/0x10 [zfs]
[6197068.757849] ? avl_find+0x58/0x90 [zavl]
[6197068.758449] range_tree_remove_impl+0x2c4/0x370 [zfs]
[6197068.759022] ? zio_checksum_template_init+0x120/0x120 [zfs]
[6197068.759592] ? abd_fletcher_4_native+0x8c/0xb0 [zfs]
[6197068.760134] ? _cond_resched+0x15/0x30
[6197068.760706] ? __kmalloc_node+0x1d8/0x2b0
[6197068.761289] ? metaslab_df_alloc+0x127/0x1c0 [zfs]
[6197068.761851] metaslab_group_alloc_normal.isra.19+0x348/0x970 [zfs]
[6197068.762414] metaslab_alloc_dva+0x17b/0x730 [zfs]
[6197068.762962] metaslab_alloc+0xb7/0x240 [zfs]
[6197068.763507] zio_dva_allocate+0xcd/0x760 [zfs]
[6197068.764024] ? _cond_resched+0x15/0x30
[6197068.764561] ? mutex_lock+0xe/0x30
[6197068.765120] ? metaslab_class_throttle_reserve+0xc6/0xf0 [zfs]
[6197068.765636] ? tsd_hash_search.isra.3+0x42/0x90 [spl]
[6197068.766251] ? tsd_get_by_thread+0x2a/0x40 [spl]
[6197068.766799] zio_execute+0x90/0xf0 [zfs]
[6197068.767296] taskq_thread+0x2e5/0x530 [spl]
[6197068.767811] ? wake_up_q+0x70/0x70
[6197068.768344] ? zio_taskq_member.isra.8.constprop.13+0x70/0x70 [zfs]
[6197068.768833] ? taskq_thread_spawn+0x50/0x50 [spl]
[6197068.769342] kthread+0x112/0x130
[6197068.769849] ? kthread_flush_work_fn+0x10/0x10
[6197068.770361] ret_from_fork+0x35/0x40
I have implemented and deployed the small hack in my comment above and tested it for a few days. Note this is not a fix; it does not address the cause of this issue, which should still be considered a bug.
After this hack, I now have WARNINGs in the logs instead of a halt of all ZFS operations.
Here is a log of the WARNINGs I got in recent days (with the small patch above). It shows that the error comes in pairs, which seems to be consistent with my default setting of copies=1.
# dmesg -e | grep zfs
[Apr21 15:51] WARNING: zfs: freeing 0-sized segment (offset=618475339776 size=0)
[ +0.000012] WARNING: zfs: freeing 0-sized segment (offset=1013612331008 size=0)
[Apr21 19:41] WARNING: zfs: freeing 0-sized segment (offset=1219770712064 size=0)
[ +0.000014] WARNING: zfs: freeing 0-sized segment (offset=1340029796352 size=0)
[Apr23 04:46] WARNING: zfs: freeing 0-sized segment (offset=2199023505408 size=0)
[ +0.000013] WARNING: zfs: freeing 0-sized segment (offset=618475290624 size=0)
[Apr23 10:11] WARNING: zfs: freeing 0-sized segment (offset=2130303778816 size=0)
[ +0.000012] WARNING: zfs: freeing 0-sized segment (offset=2164663517184 size=0)
[Apr23 15:51] WARNING: zfs: freeing 0-sized segment (offset=2130303778816 size=0)
[ +0.000014] WARNING: zfs: freeing 0-sized segment (offset=2164663517184 size=0)
[Apr23 16:46] WARNING: zfs: freeing 0-sized segment (offset=618475290624 size=0)
[ +0.000015] WARNING: zfs: freeing 0-sized segment (offset=1889785806848 size=0)
[Apr24 18:16] WARNING: zfs: freeing 0-sized segment (offset=2336462209024 size=0)
[ +0.000013] WARNING: zfs: freeing 0-sized segment (offset=2302102470656 size=0)
[Apr24 21:26] WARNING: zfs: freeing 0-sized segment (offset=2405181685760 size=0)
[ +0.000023] WARNING: zfs: freeing 0-sized segment (offset=2164663517184 size=0)
[Apr25 08:16] WARNING: zfs: freeing 0-sized segment (offset=2903397892096 size=0)
[ +0.000018] WARNING: zfs: freeing 0-sized segment (offset=2370821947392 size=0)
[Apr25 09:56] WARNING: zfs: freeing 0-sized segment (offset=2800318676992 size=0)
[ +0.000074] WARNING: zfs: freeing 0-sized segment (offset=2834678841344 size=0)
[Apr25 14:15] WARNING: zfs: freeing 0-sized segment (offset=2920577925120 size=0)
[ +0.000014] WARNING: zfs: freeing 0-sized segment (offset=2989297238016 size=0)
[Apr25 21:31] WARNING: zfs: freeing 0-sized segment (offset=2869038153728 size=0)
[ +0.000013] WARNING: zfs: freeing 0-sized segment (offset=2765958938624 size=0)
[Apr25 22:26] WARNING: zfs: freeing 0-sized segment (offset=2869038153728 size=0)
[ +0.000020] WARNING: zfs: freeing 0-sized segment (offset=2765958938624 size=0)
[Apr25 23:25] WARNING: zfs: freeing 0-sized segment (offset=2869038153728 size=0)
[ +0.000015] WARNING: zfs: freeing 0-sized segment (offset=2765958938624 size=0)
[Apr26 08:10] WARNING: zfs: freeing 0-sized segment (offset=3075196583936 size=0)
[ +0.000021] WARNING: zfs: freeing 0-sized segment (offset=3401614327808 size=0)
[Apr26 12:31] WARNING: zfs: freeing 0-sized segment (offset=3539053117440 size=0)
[ +0.000014] WARNING: zfs: freeing 0-sized segment (offset=3332895277056 size=0)
[Apr27 01:36] WARNING: zfs: freeing 0-sized segment (offset=3075196829696 size=0)
[ +0.000015] WARNING: zfs: freeing 0-sized segment (offset=3401614327808 size=0)
[Apr27 09:11] WARNING: zfs: freeing 0-sized segment (offset=188978839552 size=0)
[ +0.000015] WARNING: zfs: freeing 0-sized segment (offset=3624952397824 size=0)
Worth mentioning: my ZFS runs in a VM and uses only a single vdev, so the 2 copies will necessarily end up on the same (virtual) device. (I know this is not the best use of ZFS, but the virtualized environment seems to be the only way to receive snapshots from multiple sources in properly jailed environments.)
Hello, hitting this issue too, twice a month or more. CentOS Linux release 7.9.2009 (Core), kernel 3.10.0, ZFS 2.0.5. Running zfs send -w -R -I piped into zfs receive -Fu to send+receive raw encrypted incremental datasets on the same machine triggers this sometimes:
[2070639.496582] VERIFY3(size != 0) failed (0 != 0)
[2070639.496591] PANIC at range_tree.c:437:range_tree_remove_impl()
[2070639.496596] Showing stack for process 1196
[2070639.496602] CPU: 10 PID: 1196 Comm: z_wr_iss Kdump: loaded Tainted: P OE ------------ 3.10.0-1160.36.2.el7.x86_64 #1
[2070639.496605] Hardware name: System manufacturer System Product Name/ROG STRIX X370-F GAMING, BIOS 4012 04/20/2018
[2070639.496607] Call Trace:
[2070639.496618] [<ffffffff9a383559>] dump_stack+0x19/0x1b
[2070639.496651] [<ffffffffc0818c5b>] spl_dumpstack+0x2b/0x30 [spl]
[2070639.496665] [<ffffffffc0818d29>] spl_panic+0xc9/0x110 [spl]
[2070639.496721] [<ffffffffc104d65d>] ? zfs_btree_insert_leaf_impl.isra.10+0x4d/0x60 [zfs]
[2070639.496773] [<ffffffffc104dd73>] ? zfs_btree_insert_into_leaf+0x193/0x220 [zfs]
[2070639.496779] [<ffffffff99e27de9>] ? ___slab_alloc+0x229/0x520
[2070639.496831] [<ffffffffc104cef8>] ? zfs_btree_find_parent_idx+0x88/0x100 [zfs]
[2070639.496883] [<ffffffffc104e5e4>] ? zfs_btree_find+0x1d4/0x370 [zfs]
[2070639.496963] [<ffffffffc10c2e04>] range_tree_remove_impl+0xe24/0xf30 [zfs]
[2070639.497038] [<ffffffffc10b75c9>] ? metaslab_df_alloc+0x399/0x5c0 [zfs]
[2070639.497112] [<ffffffffc10bfbcb>] ? multilist_sublist_unlock+0x2b/0x40 [zfs]
[2070639.497185] [<ffffffffc10c2f40>] range_tree_remove+0x10/0x20 [zfs]
[2070639.497260] [<ffffffffc10b97a5>] metaslab_group_alloc_normal.isra.23+0x775/0x980 [zfs]
[2070639.497343] [<ffffffffc10bafe1>] metaslab_alloc_dva+0x1b1/0x6c0 [zfs]
[2070639.497415] [<ffffffffc10bc097>] metaslab_alloc+0xc7/0x250 [zfs]
[2070639.497506] [<ffffffffc114a036>] zio_dva_allocate+0xf6/0x8f0 [zfs]
[2070639.497583] [<ffffffffc117538f>] ? zio_crypt_encode_params_bp+0x6f/0xa0 [zfs]
[2070639.497665] [<ffffffffc114593a>] ? zio_encrypt+0x19a/0x680 [zfs]
[2070639.497670] [<ffffffff9a3875d2>] ? mutex_lock+0x12/0x2f
[2070639.497685] [<ffffffffc081fbe2>] ? tsd_hash_search.isra.0+0x72/0xd0 [spl]
[2070639.497764] [<ffffffffc114520f>] zio_execute+0x9f/0x100 [zfs]
[2070639.497779] [<ffffffffc081e4f6>] taskq_thread+0x2c6/0x520 [spl]
[2070639.497785] [<ffffffff99cdadd0>] ? wake_up_state+0x20/0x20
[2070639.497862] [<ffffffffc1145170>] ? zio_taskq_member.isra.10.constprop.13+0x70/0x70 [zfs]
[2070639.497878] [<ffffffffc081e230>] ? taskq_thread_spawn+0x60/0x60 [spl]
[2070639.497883] [<ffffffff99cc5e41>] kthread+0xd1/0xe0
[2070639.497887] [<ffffffff99cc5d70>] ? insert_kthread_work+0x40/0x40
[2070639.497893] [<ffffffff9a395de4>] ret_from_fork_nospec_begin+0xe/0x21
[2070639.497897] [<ffffffff99cc5d70>] ? insert_kthread_work+0x40/0x40
Scrubs are clean, and the entire ZFS system doesn't appear to be locked up, but certain operations involving the pool do not work; for example, a rollback on the receiving pool is hung. However, a send job from the original sending pool is working just fine to another off-site receiving pool.
Edit: After a bit more testing, it looks like ZFS operations such as create, destroy, rollback, etc. do not work on the local receiving pool. The local sending pool is acting normally. So I assume whatever is stuck here, it's stuck on the receive side.
Edit2: Looks like the receive process is still running, 100% CPU. Strace shows nothing at all on the process.
Edit3: Last night I was able to reproduce this three times in a row; the same send/receive job caused a panic each time. This morning I manually sent the remaining datasets to the receiving pool one at a time (instead of the recursive send) and they all went okay. I then created a new recursive snapshot to try the entire job again, and this time it sent normally. I'm not sure what that means: if there was an issue with the data, you'd think I would have hit it running the datasets one at a time, but that didn't happen. Whatever this condition was, it persisted across reboots, and it would panic at roughly the same point after I started the recursive send/receive job.
I just hit this same problem with a zfs recv. Running proxmox 7.0, zfs 2.0.5.
I am having this issue on ZFS 2.0.6 on Ubuntu 21.10, while receiving encrypted raw datasets.
I don't know enough to conclude whether it is this exact one, but today while using syncoid I experienced a similar issue. Debian 11, zfs-dkms version 2.0.3-9. Hopefully this helps.
Looks like the receive process is still running, 100% CPU. Strace shows nothing at all on the process.
I was running two sync processes at once and indeed 2 cores are at 100% iowait
I woke up to the same/similar issue this morning on Debian Bullseye running Linux 5.15.0-0.bpo.3-amd64 and ZFS 2.1.4-1~bpo11+1.
kernel:[377052.452950] VERIFY3(size != 0) failed (0 != 0)
kernel:[377052.457965] PANIC at range_tree.c:438:range_tree_remove_impl()
Unfortunately, I didn't have a serial console hooked up, so no call trace :(. I do believe replication was in progress when it happened. Anywhere between 1-4 datasets were probably being recv'd concurrently on this machine via zrepl when it happened.
I had a lot of those at one point when I was using ZFS in proxmox LXC containers and did not pay attention to the fact that the zfsutils-linux package in the CTs did not exactly match the ZFS kernel module in the HV. I was convinced that the ioctl protocol between the ZFS user-land command and the kernel module would be able to detect and deal with such a mismatch, but on the contrary it seems very sensitive.
I first fixed it the hard way, by selecting and freezing a ZFS version, recompiling the whole ZFS project by hand on the proxmox HV, and deploying the libs and user-land commands manually in the CTs using bind mounts. It worked well and was very stable, but I was not happy with the idea of not following ZFS updates. I have now reverted to using the mainstream version and proxmox updates, which I manage to keep always identical in CTs and HV (using unattended-upgrades), and have not experienced the issue since then.
Although I now have another issue which I believe is also related to a version mismatch: I have a RAIDZ1 pool suspended without any reported I/O fault occurring on vdevs, apparently due to a txg timeout. I think it may happen following automatic updates with a delayed reboot (i.e. the zfs commands and libs get updated, but the kernel is still using the old module version until the next reboot, so we end up with a version mismatch between user-land and kernel). A reboot fixes it, but I still have to find a way to trigger an automatic reboot when the zfs package is updated.
I just ran into what I believe to be the same issue while sending an encrypted dataset to a different ZFS pool on the same machine:
[Aug 6 12:04] VERIFY3(size != 0) failed (0 != 0)
[ +0.000028] PANIC at range_tree.c:438:range_tree_remove_impl()
[ +0.000016] Showing stack for process 168596
[ +0.000002] CPU: 2 PID: 168596 Comm: z_wr_iss Tainted: P IO 5.18.16 #1-NixOS
[ +0.000004] Hardware name: Dell Inc. PowerEdge R610/0F0XJ6, BIOS 6.6.0 05/22/2018
[ +0.000002] Call Trace:
[ +0.000004] <TASK>
[ +0.000004] dump_stack_lvl+0x45/0x5e
[ +0.000011] spl_panic+0xd1/0xe9 [spl]
[ +0.000016] ? zfs_btree_insert_into_leaf+0x232/0x2a0 [zfs]
[ +0.000124] ? pn_free+0x30/0x30 [zfs]
[ +0.000124] ? zfs_btree_find_parent_idx+0x72/0xd0 [zfs]
[ +0.000092] ? metaslab_rangesize64_compare+0x40/0x40 [zfs]
[ +0.000122] ? zfs_btree_find+0x175/0x300 [zfs]
[ +0.000094] range_tree_remove_impl+0xa97/0xef0 [zfs]
[ +0.000125] ? metaslab_df_alloc+0xc4/0x5c0 [zfs]
[ +0.000123] ? recalibrate_cpu_khz+0x10/0x10
[ +0.000006] ? multilist_sublist_insert_tail+0x2b/0x50 [zfs]
[ +0.000126] metaslab_alloc_dva+0xbd2/0x1490 [zfs]
[ +0.000126] ? preempt_count_add+0x70/0xa0
[ +0.000006] metaslab_alloc+0xd3/0x280 [zfs]
[ +0.000125] zio_dva_allocate+0xd3/0x900 [zfs]
[ +0.000141] ? zio_compress_data+0x39/0x100 [zfs]
[ +0.000139] ? zio_encrypt+0x4eb/0x710 [zfs]
[ +0.000140] ? preempt_count_add+0x70/0xa0
[ +0.000004] ? _raw_spin_lock+0x13/0x40
[ +0.000003] zio_execute+0x83/0x120 [zfs]
[ +0.000141] taskq_thread+0x2cf/0x500 [spl]
[ +0.000016] ? wake_up_q+0x90/0x90
[ +0.000005] ? zio_gang_tree_free+0x70/0x70 [zfs]
[ +0.000141] ? taskq_thread_spawn+0x60/0x60 [spl]
[ +0.000011] kthread+0xe8/0x110
[ +0.000005] ? kthread_complete_and_exit+0x20/0x20
[ +0.000005] ret_from_fork+0x22/0x30
[ +0.000006] </TASK>
This has happened multiple times before. The OS runs on bare metal, not in a VM, and I haven't updated my ZFS userspace tools since the last reboot yesterday:
$ zfs --version
zfs-2.1.5-1
zfs-kmod-2.1.5-1
I think I started seeing this after I configured automatic sends of an encrypted dataset to another pool on the same machine. I tried enabling debugging via --enable-debug, and the failure changed from
VERIFY3(size != 0) failed (0 != 0)
PANIC at range_tree.c:438:range_tree_remove_impl()
to
VERIFY3(zio->io_size == BP_GET_PSIZE(bp)) failed (0 == 16384)
PANIC at zio.c:3505:zio_dva_allocate()
Unfortunately I'm quite unfamiliar with the internals of zfs, but I'm happy to run with patches that add more logging or debugging and report back.
❯ zfs --version
zfs-2.1.5-1
zfs-kmod-2.1.5-1
I patched a WARN_ON_ONCE(psize == 0); into zio_write, and it triggered shortly before the panic.
Checking back through the earlier messages, I didn't see anyone share the stuck-task report (excluding the zio_execute thread), which might indicate what ZFS is doing at the time of the failure and help work backwards to the cause. Here's an example, just in case it's helpful.
[138854.096402] INFO: task zfs:2790157 blocked for more than 122 seconds.
[138854.096419] Tainted: P W O 5.15.74 #1-NixOS
[138854.096434] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[138854.096453] task:zfs state:D stack: 0 pid:2790157 ppid:2790154 flags:0x00004000
[138854.096455] Call Trace:
[138854.096456] <TASK>
[138854.096456] __schedule+0x2e1/0x1350
[138854.096459] schedule+0x5b/0xd0
[138854.096461] io_schedule+0x42/0x70
[138854.096463] cv_wait_common+0xef/0x2a0 [spl]
[138854.096469] ? finish_wait+0x90/0x90
[138854.096470] txg_wait_synced_impl+0xf2/0x270 [zfs]
[138854.096580] txg_wait_synced+0xc/0x40 [zfs]
[138854.096668] dsl_sync_task_common+0x1b0/0x290 [zfs]
[138854.096738] ? dmu_recv_cleanup_ds+0x1e0/0x1e0 [zfs]
[138854.096799] ? dmu_recv_end_sync+0x5d0/0x5d0 [zfs]
[138854.096860] ? dmu_recv_end_sync+0x5d0/0x5d0 [zfs]
[138854.096920] ? dmu_recv_cleanup_ds+0x1e0/0x1e0 [zfs]
[138854.096981] dsl_sync_task+0x16/0x20 [zfs]
[138854.097051] dmu_recv_existing_end+0x5b/0x90 [zfs]
[138854.097112] ? tsd_hash_dtor+0x73/0x90 [spl]
[138854.097119] ? rrw_exit+0x143/0x2b0 [zfs]
[138854.097191] ? kfree+0xc2/0x250
[138854.097194] ? __cond_resched+0x16/0x50
[138854.097196] ? down_read+0xe/0x80
[138854.097198] ? zvol_find_by_name_hash+0x14b/0x390 [zfs]
[138854.097279] dmu_recv_end+0x8c/0xc0 [zfs]
[138854.097340] zfs_ioc_recv_impl.constprop.0+0xdd3/0x1200 [zfs]
[138854.097422] ? cpumask_next+0x1e/0x30
[138854.097425] ? select_task_rq_fair+0x131/0x1080
[138854.097429] zfs_ioc_recv_new+0x2d9/0x370 [zfs]
[138854.097521] ? nvt_remove_nvpair+0x124/0x270 [znvpair]
[138854.097538] ? __cond_resched+0x16/0x50
[138854.097540] ? __kmalloc_node+0x68/0x470
[138854.097543] ? __kmalloc_node+0x68/0x470
[138854.097545] ? spl_kmem_alloc_impl+0xed/0x100 [spl]
[138854.097550] ? nv_mem_zalloc.isra.0+0x2b/0x40 [znvpair]
[138854.097556] ? nvlist_xalloc.part.0+0x68/0xd0 [znvpair]
[138854.097561] zfsdev_ioctl_common+0x440/0xab0 [zfs]
[138854.097642] zfsdev_ioctl+0x53/0xe0 [zfs]
[138854.097721] __x64_sys_ioctl+0x8a/0xc0
[138854.097723] do_syscall_64+0x3b/0x90
[138854.097726] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[138854.097729] RIP: 0033:0x7f2d526dce37
[138854.097731] RSP: 002b:00007fff9fd54b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[138854.097732] RAX: ffffffffffffffda RBX: 00007fff9fd58168 RCX: 00007f2d526dce37
[138854.097733] RDX: 00007fff9fd54b40 RSI: 0000000000005a46 RDI: 0000000000000005
[138854.097734] RBP: 00007fff9fd58120 R08: 0000000000000003 R09: 0000000000000000
[138854.097735] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000005a46
[138854.097736] R13: 00007fff9fd54b40 R14: 0000000000005a46 R15: 0000000001eb4510
[138854.097739] </TASK>
I am hitting this bug as well on a backup server that receives an encrypted raw snapshot.
uname -a: Linux irulan 5.15.83 #1-NixOS SMP Wed Dec 14 10:37:31 UTC 2022 x86_64 GNU/Linux
ZFS version: ZFS: Loaded module v2.1.7-1, ZFS pool version 5000, ZFS filesystem version 5
Dec 22 00:16:28 irulan kernel: VERIFY3(size != 0) failed (0 != 0)
Dec 22 00:16:28 irulan kernel: PANIC at range_tree.c:438:range_tree_remove_impl()
Dec 22 00:16:28 irulan kernel: Showing stack for process 1179
Dec 22 00:16:29 irulan kernel: CPU: 1 PID: 1179 Comm: z_wr_iss Tainted: P O 5.15.62 #1-NixOS
Dec 22 00:16:29 irulan kernel: Hardware name: /DQ87PG, BIOS PGQ8710H.86A.0155.2018.1031.1654 10/31/2018
Dec 22 00:16:29 irulan kernel: Call Trace:
Dec 22 00:16:29 irulan kernel: <TASK>
Dec 22 00:16:29 irulan kernel: dump_stack_lvl+0x46/0x5e
Dec 22 00:16:29 irulan kernel: spl_panic+0xd1/0xe9 [spl]
Dec 22 00:16:29 irulan kernel: ? fletcher_4_avx2_native+0x18/0x80 [zcommon]
Dec 22 00:16:29 irulan kernel: ? abd_fletcher_4_iter+0x67/0xc0 [zcommon]
Dec 22 00:16:29 irulan kernel: ? abd_iterate_func.part.0+0xdf/0x1c0 [zfs]
Dec 22 00:16:29 irulan kernel: ? fletcher_4_incremental_native+0x160/0x160 [zcommon]
Dec 22 00:16:29 irulan kernel: ? pn_free+0x30/0x30 [zfs]
Dec 22 00:16:29 irulan kernel: ? zfs_btree_find_parent_idx+0x72/0xd0 [zfs]
Dec 22 00:16:29 irulan kernel: ? metaslab_rangesize64_compare+0x40/0x40 [zfs]
Dec 22 00:16:29 irulan kernel: ? zfs_btree_find+0x175/0x300 [zfs]
Dec 22 00:16:29 irulan kernel: range_tree_remove_impl+0xa97/0xef0 [zfs]
Dec 22 00:16:29 irulan kernel: ? metaslab_df_alloc+0xc4/0x5c0 [zfs]
Dec 22 00:16:29 irulan kernel: ? __cond_resched+0x16/0x50
Dec 22 00:16:29 irulan kernel: ? recalibrate_cpu_khz+0x10/0x10
Dec 22 00:16:29 irulan kernel: ? multilist_sublist_insert_tail+0x2b/0x50 [zfs]
Dec 22 00:16:29 irulan kernel: metaslab_alloc_dva+0xb6e/0x13f0 [zfs]
Dec 22 00:16:29 irulan kernel: ? __cond_resched+0x16/0x50
Dec 22 00:16:29 irulan kernel: metaslab_alloc+0xd3/0x280 [zfs]
Dec 22 00:16:29 irulan kernel: zio_dva_allocate+0xd3/0x900 [zfs]
Dec 22 00:16:29 irulan kernel: ? finish_task_switch.isra.0+0x7f/0x290
Dec 22 00:16:29 irulan kernel: ? __schedule+0x2e9/0x1350
Dec 22 00:16:29 irulan kernel: ? zio_encrypt+0x4eb/0x710 [zfs]
Dec 22 00:16:29 irulan kernel: zio_execute+0x83/0x120 [zfs]
Dec 22 00:16:29 irulan kernel: taskq_thread+0x2cf/0x500 [spl]
Dec 22 00:16:29 irulan kernel: ? wake_up_q+0x90/0x90
Dec 22 00:16:29 irulan kernel: ? zio_gang_tree_free+0x70/0x70 [zfs]
Dec 22 00:16:29 irulan kernel: ? taskq_thread_spawn+0x60/0x60 [spl]
Dec 22 00:16:29 irulan kernel: kthread+0x127/0x150
Dec 22 00:16:29 irulan kernel: ? set_kthread_struct+0x50/0x50
Dec 22 00:16:29 irulan kernel: ret_from_fork+0x22/0x30
Dec 22 00:16:29 irulan kernel: </TASK>
I hit this issue on an ARM NAS while it was receiving an incremental snapshot (sent with -I) of an encrypted dataset.
$ uname -srvmo
Linux 6.0.15 #1-NixOS SMP Wed Dec 21 16:41:16 UTC 2022 armv7l GNU/Linux
$ zfs --version
zfs-2.1.7-1
zfs-kmod-2.1.7-1
Jan 18 19:28:43 kernel: VERIFY3(size != 0) failed (0 != 0)
Jan 18 19:28:43 kernel: PANIC at range_tree.c:435:range_tree_remove_impl()
Jan 18 19:28:43 kernel: Showing stack for process 398
Jan 18 19:28:43 kernel: CPU: 0 PID: 398 Comm: z_wr_iss Tainted: P O 6.0.15 #1-NixOS
Jan 18 19:28:43 kernel: Hardware name: Marvell Armada 370/XP (Device Tree)
Jan 18 19:28:43 kernel: unwind_backtrace from show_stack+0x18/0x1c
Jan 18 19:28:43 kernel: show_stack from dump_stack_lvl+0x40/0x4c
Jan 18 19:28:43 kernel: dump_stack_lvl from spl_panic+0xa8/0xbc [spl]
Jan 18 19:28:43 kernel: spl_panic [spl] from range_tree_remove_impl+0xe1c/0x187c [zfs]
Jan 18 19:28:43 kernel: range_tree_remove_impl [zfs] from range_tree_remove+0x24/0x2c [zfs]
Jan 18 19:28:43 kernel: range_tree_remove [zfs] from metaslab_alloc_dva+0x6b4/0x1640 [zfs]
Jan 18 19:28:43 kernel: metaslab_alloc_dva [zfs] from metaslab_alloc+0xd0/0x2ac [zfs]
Jan 18 19:28:43 kernel: metaslab_alloc [zfs] from zio_dva_allocate+0xc4/0x8ec [zfs]
Jan 18 19:28:43 kernel: zio_dva_allocate [zfs] from zio_execute+0x98/0x12c [zfs]
Jan 18 19:28:43 kernel: zio_execute [zfs] from taskq_thread+0x2b0/0x4b8 [spl]
Jan 18 19:28:43 kernel: taskq_thread [spl] from kthread+0xd8/0xf4
Jan 18 19:28:43 kernel: kthread from ret_from_fork+0x14/0x2c
Is the fix for this issue shipped in any released zfs 2.1.* version? I am unable to find any reference to this issue or the fixing PR in the changelogs :confused:
System information
Describe the problem you're observing
After printing a panic message, all zfs operations are stuck, the machine needs to be rebooted. See below for panic message.
Note: This is NOT the ZFS version that comes with the distribution. It is the locally recompiled zfs-0.8-release tagged version, which recompiled with NO errors using the default compilation settings.
Describe how to reproduce the problem
Hard to tell exactly when and why, but it seems to happen under significant load, and most probably when trying to destroy snapshots. After a reboot it happens again, not immediately, but unpredictably. The stack trace when it reproduces is identical.
I haven't yet tried locally compiled earlier versions, but it seems that the earlier version that came with the proxmox 5.4.78 kernel (ZFS 0.8.5) didn't produce this error (proxmox 6.1.1 or 6.2.1 fresh install without apt update, I guess).
Note that it MAY have started after I once had to destroy 13k snapshots that had accumulated in a dataset. But I don't see why it would, and I am not totally sure about the timing. Maybe it is completely unrelated.
Scrub after occurrence completed without error, and it happened again after scrubbing.
Note that most datasets are encrypted. This is a backup server, mostly receiving encrypted snapshots sent as raw replication differential streams. The source streams are also produced using a locally recompiled 0.8 version (I had to downgrade to this version because of another issue in ZFS 2.0.X, which I reported separately, that totally breaks our backup scripts).
Include any warning/errors/backtraces from the system logs