Open nim-odoo opened 3 years ago
cc @odony @julienlegros
I'm guessing wildly, but what does zpool get all [pool]
return for the pool(s) on one of those systems?
I'm guessing wildly, but what does
zpool get all [pool]
return for the pool(s) on one of those systems?
One of the systems has this:
# zpool get all dpool
NAME PROPERTY VALUE SOURCE
dpool size 1.69T -
dpool capacity 5% -
dpool altroot - default
dpool health ONLINE -
dpool guid 15648689621491837198 -
dpool version - default
dpool bootfs - default
dpool delegation on default
dpool autoreplace off default
dpool cachefile - default
dpool failmode wait default
dpool listsnapshots off default
dpool autoexpand on local
dpool dedupditto 0 default
dpool dedupratio 1.00x -
dpool free 1.59T -
dpool allocated 101G -
dpool readonly off -
dpool ashift 12 local
dpool comment - default
dpool expandsize - -
dpool freeing 0 -
dpool fragmentation 14% -
dpool leaked 0 -
dpool multihost off default
dpool checkpoint - -
dpool load_guid 7007957781465625946 -
dpool autotrim on local
dpool feature@async_destroy enabled local
dpool feature@empty_bpobj active local
dpool feature@lz4_compress active local
dpool feature@multi_vdev_crash_dump enabled local
dpool feature@spacemap_histogram active local
dpool feature@enabled_txg active local
dpool feature@hole_birth active local
dpool feature@extensible_dataset active local
dpool feature@embedded_data active local
dpool feature@bookmarks enabled local
dpool feature@filesystem_limits enabled local
dpool feature@large_blocks enabled local
dpool feature@large_dnode enabled local
dpool feature@sha512 enabled local
dpool feature@skein enabled local
dpool feature@edonr enabled local
dpool feature@userobj_accounting active local
dpool feature@encryption active local
dpool feature@project_quota active local
dpool feature@device_removal enabled local
dpool feature@obsolete_counts enabled local
dpool feature@zpool_checkpoint enabled local
dpool feature@spacemap_v2 active local
dpool feature@allocation_classes enabled local
dpool feature@resilver_defer enabled local
dpool feature@bookmark_v2 enabled local
dpool feature@encryption active local
I hate being right (or at least, not proven wrong).
The only bug I know of in recent memory that can cause what looks for all the world like "hey, this buffer suddenly turned into a NULL pointer when I was midway through operating on it, after it passed various points where it should have crashed first if it were NULL" is #11679, which requires encryption. (The bug, as I currently understand it, basically involves the reference count on a buffer being screwed up, so two places think they have the only pointer to the buffer, and then (among other possible fun outcomes) if one of them frees or destroys it, Bad Times Ensue.)
I do not have any other proof of this being the case, so please don't take this as definitive, but that would be my "thing to investigate next" if I were picking bets at the moment, since I haven't seen any other reports of this turn up.
Thanks a lot for your insight, we'll follow #11679. To be complete, we noticed a slightly different trace in the 3 incidents.
Incident 1:
invalid opcode: 0000 [#1] SMP NOPTI
CPU: 11 PID: 3831033 Comm: postgres Tainted: P O 5.4.0-84-generic #94-Ubuntu
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
...
Call Trace:
lz4_decompress_zfs+0x2b/0x30 [zfs]
zio_decompress_data_buf+0x3c/0x70 [zfs]
zio_decompress_data+0x49/0xb0 [zfs]
arc_buf_fill+0x259/0x4a0 [zfs]
arc_buf_alloc_impl.isra.0+0x161/0x250 [zfs]
arc_read+0xea8/0x1100 [zfs]
? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs]
? atomic_dec_return+0x9/0x10 [zfs]
dbuf_read_impl+0x1d2/0x610 [zfs]
? atomic_inc+0xd/0x20 [zfs]
dbuf_read+0xcb/0x5f0 [zfs]
dmu_buf_hold_array_by_dnode+0x113/0x490 [zfs]
dmu_read_uio_dnode+0x49/0xf0 [zfs]
? zrl_add_impl+0x31/0xb0 [zfs]
dmu_read_uio_dbuf+0x47/0x60 [zfs]
zfs_read+0x117/0x300 [zfs]
zpl_read_common_iovec+0x99/0xe0 [zfs]
zpl_iter_read_common+0xa8/0x100 [zfs]
zpl_iter_read+0x58/0x80 [zfs]
new_sync_read+0x122/0x1b0
__vfs_read+0x29/0x40
vfs_read+0xab/0x160
ksys_pread64+0x6d/0xa0
__x64_sys_pread64+0x1e/0x20
do_syscall_64+0x57/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Incident 2:
BUG: unable to handle page fault for address: 0000000008d03690
CPU: 8 PID: 2824511 Comm: postgres Tainted: P O 5.4.0-84-generic #94-Ubuntu
RIP: 0010:0x8d03690
Code: Bad RIP value.
Call Trace:
? LZ4_uncompress_unknownOutputSize+0x102/0x270 [zfs]
lz4_decompress_zfs+0x2b/0x30 [zfs]
zio_decompress_data_buf+0x3c/0x70 [zfs]
zio_decompress_data+0x49/0xb0 [zfs]
arc_buf_fill+0x259/0x4a0 [zfs]
arc_buf_alloc_impl.isra.0+0x161/0x250 [zfs]
arc_read+0xea8/0x1100 [zfs]
? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs]
? atomic_dec_return+0x9/0x10 [zfs]
dbuf_read_impl+0x1d2/0x610 [zfs]
? atomic_inc+0xd/0x20 [zfs]
dbuf_read+0xcb/0x5f0 [zfs]
dmu_buf_hold_array_by_dnode+0x113/0x490 [zfs]
dmu_read_uio_dnode+0x49/0xf0 [zfs]
? zrl_add_impl+0x31/0xb0 [zfs]
dmu_read_uio_dbuf+0x47/0x60 [zfs]
zfs_read+0x117/0x300 [zfs]
zpl_read_common_iovec+0x99/0xe0 [zfs]
zpl_iter_read_common+0xa8/0x100 [zfs]
zpl_iter_read+0x58/0x80 [zfs]
new_sync_read+0x122/0x1b0
__vfs_read+0x29/0x40
vfs_read+0xab/0x160
ksys_pread64+0x6d/0xa0
__x64_sys_pread64+0x1e/0x20
do_syscall_64+0x57/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Incident 3:
BUG: kernel NULL pointer dereference, address: 000000000000000f
CPU: 9 PID: 1258910 Comm: postgres Tainted: P O 5.4.0-84-generic #94-Ubuntu
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x158/0x270 [zfs]
Call Trace:
lz4_decompress_zfs+0x2b/0x30 [zfs]
zio_decompress_data_buf+0x3c/0x70 [zfs]
...
(stack trace in first message)
Ah, 4th incident on the same machine than incident 2. The BUG match (unable to handle page fault for address)
kernel tried to execute NX-protected page - exploit attempt? (uid: 115)
BUG: unable to handle page fault for address: ffffbdc80096b888
Oops: 0011 [#1] SMP NOPTI
CPU: 8 PID: 4103458 Comm: postgres Tainted: P O 5.4.0-90-generic #101-Ubuntu
Call Trace:
? lz4_decompress_zfs+0x2b/0x30 [zfs]
? zio_decompress_data_buf+0x3c/0x70 [zfs]
? zio_decompress_data+0x49/0xb0 [zfs]
? arc_buf_fill+0x259/0x4a0 [zfs]
? arc_buf_alloc_impl.isra.0+0x161/0x250 [zfs]
? arc_read+0xea8/0x1100 [zfs]
? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs]
? atomic_dec_return+0x9/0x10 [zfs]
? dbuf_read_impl+0x1d2/0x610 [zfs]
? dbuf_read+0xcb/0x5f0 [zfs]
? dmu_buf_hold_array_by_dnode+0x113/0x490 [zfs]
? dmu_read_uio_dnode+0x49/0xf0 [zfs]
? zrl_add_impl+0x31/0xb0 [zfs]
? dmu_read_uio_dbuf+0x47/0x60 [zfs]
? zfs_read+0x117/0x300 [zfs]
? zpl_read_common_iovec+0x99/0xe0 [zfs]
? zpl_iter_read_common+0xa8/0x100 [zfs]
? zpl_iter_read+0x58/0x80 [zfs]
? new_sync_read+0x122/0x1b0
? __vfs_read+0x29/0x40
? vfs_read+0xab/0x160
? ksys_pread64+0x6d/0xa0
? __x64_sys_pread64+0x1e/0x20
? do_syscall_64+0x57/0x190
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
We have hit the same issue today.
Type | Version/Name |
---|---|
Distribution Name | Ubuntu |
Distribution Version | 20.04.3 LTS |
Kernel Version | 5.4.0-91-generic #102-Ubuntu |
Architecture | x86_64 |
OpenZFS Version | zfs-0.8.3-1ubuntu12.13 |
One noticeable operation that was running at that moment (and froze) was a recv of an unencrypted dataset.
general protection fault: 0000 [#1] SMP NOPTI
CPU: 13 PID: 1129912 Comm: zfs Tainted: P O 5.4.0-91-generic #102-Ubuntu
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./E3C246D4U2-2T, BIOS L2.02K 12/18/2019
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f 84 00 01
RSP: 0018:ffffa8bb345676a8 EFLAGS: 00010297
RAX: ffff953bed6cae09 RBX: ffff953bed6ca20a RCX: ffff953bed6ca202
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff953bed6ca207
RBP: ffffa8bb345676d0 R08: ffff953bed6cae05 R09: 0000000000000001
R10: ffff953bed6cae1c R11: ffff953bed6ca3f8 R12: ffff953bed6ca200
R13: ffff953bed6ca400 R14: ffff953bed6cae14 R15: ffff953bed6ca3f7
FS: 00007f450f2577c0(0000) GS:ffff9547ff540000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f450ed80000 CR3: 0000000abcce2002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lz4_decompress_zfs+0x2b/0x30 [zfs]
zio_decompress_data_buf+0x3c/0x70 [zfs]
zio_decompress_data+0x49/0xb0 [zfs]
zio_decompress+0x5a/0xd0 [zfs]
zio_pop_transforms+0x74/0x90 [zfs]
zio_done+0x23e/0xe20 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? zio_ready+0x25e/0x3f0 [zfs]
? abd_return_buf_copy+0x31/0x50 [zfs]
zio_nowait+0x9e/0x120 [zfs]
arc_read+0x8cb/0x1100 [zfs]
? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
dbuf_read_impl+0x1d2/0x610 [zfs]
? zio_null+0x2f/0x40 [zfs]
dbuf_read+0xcb/0x5f0 [zfs]
? dmu_buf_hold_noread+0xc5/0x120 [zfs]
dmu_buf_hold+0x62/0x90 [zfs]
zap_lockdir+0x4f/0xb0 [zfs]
zap_lookup_norm+0x5a/0xc0 [zfs]
zap_lookup+0x16/0x20 [zfs]
dsl_prop_get_dd+0x192/0x230 [zfs]
dsl_prop_get_ds+0x172/0x210 [zfs]
? __list_add+0x17/0x40 [zfs]
? redundant_metadata_changed_cb+0x20/0x20 [zfs]
dsl_prop_get_int_ds+0x1e/0x20 [zfs]
dsl_prop_register+0x41/0x160 [zfs]
dmu_objset_open_impl+0x966/0x9f0 [zfs]
dmu_objset_from_ds+0xec/0x180 [zfs]
dmu_objset_hold_flags+0x84/0xd0 [zfs]
dmu_objset_hold+0x16/0x20 [zfs]
zfs_ioc_objset_stats+0x31/0x80 [zfs]
zfsdev_ioctl+0x5c8/0x690 [zfs]
do_vfs_ioctl+0x407/0x670
? do_user_addr_fault+0x216/0x450
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x57/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f450f84b50b
Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe94486ee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffe94486f10 RCX: 00007f450f84b50b
RDX: 00007ffe94486f10 RSI: 0000000000005a12 RDI: 0000000000000003
RBP: 00007ffe94486f00 R08: 00007f450ed80010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000561440267150
R13: 00007ffe94486f10 R14: 0000561440267150 R15: 00007ffe9448aa80
Modules linked in: xt_multiport binfmt_misc macvlan nf_tables nfnetlink vhost_vsock vmw_vsock_virtio_transport_common vsock vhost uas usb_storage nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) ipmi_ssif zcommon(PO) znvpair(PO) spl(O) kvm_intel kvm input_leds joydev ipmi_si ipmi_devintf ipmi_msghandler video acpi_pad acpi_tad 8021q garp mrp stp llc nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp sch_fq_codel xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter msr ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid raid1 i2c_algo_bit crct10dif_pclmul drm_vram_helper crc32_pclmul ttm ghash_clmulni_intel drm_kms_helper aesni_intel syscopyarea sysfillrect
sysimgblt crypto_simd ixgbe fb_sys_fops cryptd glue_helper nvme xfrm_algo drm dca ahci nvme_core mdio libahci
---[ end trace ec4d2057154f3d6d ]---
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f 84 00 01
RSP: 0018:ffffa8bb345676a8 EFLAGS: 00010297
RAX: ffff953bed6cae09 RBX: ffff953bed6ca20a RCX: ffff953bed6ca202
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff953bed6ca207
RBP: ffffa8bb345676d0 R08: ffff953bed6cae05 R09: 0000000000000001
R10: ffff953bed6cae1c R11: ffff953bed6ca3f8 R12: ffff953bed6ca200
R13: ffff953bed6ca400 R14: ffff953bed6cae14 R15: ffff953bed6ca3f7
FS: 00007f450f2577c0(0000) GS:ffff9547ff540000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f450ed80000 CR3: 0000000abcce2002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
We have hit the same issue today.
Type Version/Name Distribution Name Ubuntu Distribution Version 20.04.3 LTS Kernel Version 5.4.0-91-generic #102-Ubuntu Architecture x86_64 OpenZFS Version zfs-0.8.3-1ubuntu12.13 One noticeable operation that was running at that moment (and froze) was a recv of an unencrypted dataset.
general protection fault: 0000 [#1] SMP NOPTI CPU: 13 PID: 1129912 Comm: zfs Tainted: P O 5.4.0-91-generic #102-Ubuntu Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./E3C246D4U2-2T, BIOS L2.02K 12/18/2019 RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs] Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f 84 00 01 RSP: 0018:ffffa8bb345676a8 EFLAGS: 00010297 RAX: ffff953bed6cae09 RBX: ffff953bed6ca20a RCX: ffff953bed6ca202 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff953bed6ca207 RBP: ffffa8bb345676d0 R08: ffff953bed6cae05 R09: 0000000000000001 R10: ffff953bed6cae1c R11: ffff953bed6ca3f8 R12: ffff953bed6ca200 R13: ffff953bed6ca400 R14: ffff953bed6cae14 R15: ffff953bed6ca3f7 FS: 00007f450f2577c0(0000) GS:ffff9547ff540000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f450ed80000 CR3: 0000000abcce2002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: lz4_decompress_zfs+0x2b/0x30 [zfs] zio_decompress_data_buf+0x3c/0x70 [zfs] zio_decompress_data+0x49/0xb0 [zfs] zio_decompress+0x5a/0xd0 [zfs] zio_pop_transforms+0x74/0x90 [zfs] zio_done+0x23e/0xe20 [zfs] ? __raw_spin_unlock+0x9/0x10 [zfs] ? zio_ready+0x25e/0x3f0 [zfs] ? abd_return_buf_copy+0x31/0x50 [zfs] zio_nowait+0x9e/0x120 [zfs] arc_read+0x8cb/0x1100 [zfs] ? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs] ? do_raw_spin_unlock+0x9/0x10 [zfs] dbuf_read_impl+0x1d2/0x610 [zfs] ? zio_null+0x2f/0x40 [zfs] dbuf_read+0xcb/0x5f0 [zfs] ? dmu_buf_hold_noread+0xc5/0x120 [zfs] dmu_buf_hold+0x62/0x90 [zfs] zap_lockdir+0x4f/0xb0 [zfs] zap_lookup_norm+0x5a/0xc0 [zfs] zap_lookup+0x16/0x20 [zfs] dsl_prop_get_dd+0x192/0x230 [zfs] dsl_prop_get_ds+0x172/0x210 [zfs] ? __list_add+0x17/0x40 [zfs] ? redundant_metadata_changed_cb+0x20/0x20 [zfs] dsl_prop_get_int_ds+0x1e/0x20 [zfs] dsl_prop_register+0x41/0x160 [zfs] dmu_objset_open_impl+0x966/0x9f0 [zfs] dmu_objset_from_ds+0xec/0x180 [zfs] dmu_objset_hold_flags+0x84/0xd0 [zfs] dmu_objset_hold+0x16/0x20 [zfs] zfs_ioc_objset_stats+0x31/0x80 [zfs] zfsdev_ioctl+0x5c8/0x690 [zfs] do_vfs_ioctl+0x407/0x670 ? do_user_addr_fault+0x216/0x450 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x57/0x190 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f450f84b50b Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe94486ee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffe94486f10 RCX: 00007f450f84b50b RDX: 00007ffe94486f10 RSI: 0000000000005a12 RDI: 0000000000000003 RBP: 00007ffe94486f00 R08: 00007f450ed80010 R09: 0000000000000000 R10: 0000000000000022 R11: 0000000000000246 R12: 0000561440267150 R13: 00007ffe94486f10 R14: 0000561440267150 R15: 00007ffe9448aa80 Modules linked in: xt_multiport binfmt_misc macvlan nf_tables nfnetlink vhost_vsock vmw_vsock_virtio_transport_common vsock vhost uas usb_storage nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) ipmi_ssif zcommon(PO) znvpair(PO) spl(O) kvm_intel kvm input_leds joydev ipmi_si ipmi_devintf ipmi_msghandler video acpi_pad acpi_tad 8021q garp mrp stp llc nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp sch_fq_codel xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter msr ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid raid1 i2c_algo_bit crct10dif_pclmul drm_vram_helper crc32_pclmul ttm ghash_clmulni_intel drm_kms_helper aesni_intel syscopyarea sysfillrect sysimgblt crypto_simd ixgbe fb_sys_fops cryptd glue_helper nvme xfrm_algo drm dca ahci nvme_core mdio libahci ---[ end trace ec4d2057154f3d6d ]--- RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs] Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f 84 00 01 RSP: 0018:ffffa8bb345676a8 EFLAGS: 00010297 RAX: ffff953bed6cae09 RBX: ffff953bed6ca20a RCX: ffff953bed6ca202 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff953bed6ca207 RBP: ffffa8bb345676d0 R08: ffff953bed6cae05 R09: 0000000000000001 R10: ffff953bed6cae1c R11: ffff953bed6ca3f8 R12: ffff953bed6ca200 R13: ffff953bed6ca400 R14: ffff953bed6cae14 R15: ffff953bed6ca3f7 FS: 00007f450f2577c0(0000) GS:ffff9547ff540000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f450ed80000 CR3: 0000000abcce2002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Curious. Does it repeat every time if you receive that same dataset?
Curious. Does it repeat every time if you receive that same dataset?
No. It has been running every 15 minutes for months, and it happened only once so far.
Hello,
I would like to re-launch the discussion here. We have moved forward with our ZFS deployment on Ubuntu 20.04 / 22.04, and we met several tracebacks leading to a freeze of the zpool.
zfs-0.8.3-1ubuntu12.14
zfs-kmod-0.8.3-1ubuntu12.14
general protection fault: 0000 [#1] SMP NOPTI
CPU: 2 PID: 35075 Comm: z_rd_int Tainted: P O 5.4.0-110-generic #124-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
RIP: 0010:fletcher_4_avx512f_native+0x3e/0x90 [zcommon]
Code: 53 48 89 f3 e8 a3 ff ff ff 62 d1 fe 48 6f 45 00 62 d1 fe 48 6f 4d 01 62 d1 fe 48 6f 55 02 62 d1 fe 48 6f 5d 03 4c 39 e3 73 27 <62> f2 7d 48 35 23 62 f1 fd 48 d4 c4 62 f1 f5 48 d4 c8
62 f1 ed 48
RSP: 0018:ffffc12d825f38c8 EFLAGS: 00010087
RAX: 0000000000000000 RBX: 6000e22604d0f000 RCX: ffff9d6bb5bf8c00
RDX: 00000000ffffffff RSI: 6000e22604d0f000 RDI: ffff9d791e294000
RBP: ffffc12d825f38e0 R08: ffffc12d825f39e0 R09: 000001a5940ae000
R10: ffff9d79567a3d18 R11: 0000000000000000 R12: 6000e22604d13000
R13: ffffc12d825f3a00 R14: 6000e22604d0f000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9d795f880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f04badc8024 CR3: 000000029fe68001 CR4: 0000000000760ee0
PKRU: 55555554
Call Trace:
abd_fletcher_4_iter+0x6a/0x90 [zcommon]
? abd_fletcher_4_simd2scalar+0x60/0x60 [zcommon]
abd_iterate_func+0x6b/0xc0 [zfs]
abd_fletcher_4_impl+0x39/0x50 [zfs]
abd_fletcher_4_native+0x52/0x70 [zfs]
? alloc_pages_current+0x87/0xe0
? check_preempt_curr+0x7a/0x90
? ttwu_do_wakeup+0x1e/0x150
zio_checksum_error_impl+0x120/0x6a0 [zfs]
? abd_fletcher_4_impl+0x50/0x50 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? ktime_get_raw_ts64+0x3e/0xd0
? vdev_queue_class_to_issue+0xc0/0xf0 [zfs]
zio_checksum_error+0x6c/0xd0 [zfs]
zio_checksum_verify+0x3b/0x150 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? zio_wait_for_children+0x94/0xc0 [zfs]
? zio_vdev_io_assess+0x26/0x280 [zfs]
zio_execute+0x93/0xe0 [zfs]
taskq_thread+0x245/0x430 [spl]
? wake_up_q+0x70/0x70
? zio_execute_stack_check.constprop.0+0x10/0x10 [zfs]
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x1f/0x40
Modules linked in: xt_recent ufs msdos xfs xt_nat veth xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat br_netfilter bridge stp llc aufs overlay zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif kvm_intel input_leds joydev kvm ipmi_si acpi_tad video sch_fq_codel nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea ghash_clmulni_intel sysfillrect sysimgblt
hid_generic raid1 aesni_intel fb_sys_fops crypto_simd usbhid cryptd drm i40e glue_helper ahci hid nvme libahci nvme_core
---[ end trace 32b9c8af5a059aff ]---
RIP: 0010:fletcher_4_avx512f_native+0x3e/0x90 [zcommon]
zfs-2.1.4-0ubuntu0.1
zfs-kmod-2.1.4-0ubuntu0.1
general protection fault, probably for non-canonical address 0xdead0000000000fc: 0000 [#2] SMP NOPTI
CPU: 4 PID: 3578865 Comm: z_wr_int_0 Tainted: P D O 5.15.0-46-generic #49-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
RIP: 0010:abd_gang_get_offset+0x42/0x90 [zfs]
Code: e8 23 f6 ff ff 48 85 c0 74 36 4d 8b 2c 24 48 89 c6 eb 17 49 29 c5 48 89 df 4d 89 2c 24 e8 36 f6 ff ff 48 89 c6 48 85 c0 74 16 <8b> 46 04 4c 39 e8 76 e1 5b 48 89 f0 41 5c 41 5d 5d c3
cc cc cc cc
RSP: 0018:ffffa82d89e67ba8 EFLAGS: 00010282
RAX: dead0000000000f8 RBX: ffff8b69b2fc0868 RCX: 000000189888d000
RDX: 0000000000000000 RSI: dead0000000000f8 RDI: ffff8b69b2fc0878
RBP: ffffa82d89e67bc0 R08: 00000000000000c0 R09: ffff8b72b73d66d8
R10: 0000000000000000 R11: 0000000000000000 R12: ffffa82d89e67bd0
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8b69f5fcf020
FS: 0000000000000000(0000) GS:ffff8b765fb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055c5e17c7c28 CR3: 0000000106882003 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
abd_nr_pages_off+0x80/0xf0 [zfs]
__vdev_disk_physio.constprop.0+0x1a1/0x300 [zfs]
? __cond_resched+0x1a/0x50
? down_read+0x13/0xa0
? RW_WRITE_HELD.constprop.0+0x9/0x30 [zfs]
vdev_disk_io_start+0x95/0x360 [zfs]
zio_vdev_io_start+0xf2/0x2d0 [zfs]
? vdev_queue_pending_add+0x63/0xa0 [zfs]
zio_execute+0x8f/0xe0 [zfs]
vdev_queue_io_done+0x9c/0x160 [zfs]
zio_vdev_io_done+0x8e/0x1e0 [zfs]
zio_execute+0x8f/0xe0 [zfs]
taskq_thread+0x21e/0x400 [spl]
? wake_up_q+0x90/0x90
? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
? param_set_taskq_kick+0xf0/0xf0 [spl]
kthread+0x127/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: veth tls xt_nat nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay isofs ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter nf_tables nfnetlink nls_iso8859_1 ipmi_ssif zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) kvm_intel znvpair(PO) spl(O) kvm joydev input_leds efi_pstore acpi_ipmi ipmi_si video acpi_tad sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper hid_generic syscopyarea sysfillrect sysimgblt crct10dif_pclmul
fb_sys_fops crc32_pclmul usbhid ghash_clmulni_intel cec raid1 hid aesni_intel rc_core nvme crypto_simd ahci xhci_pci drm i40e cryptd libahci nvme_core xhci_pci_renesas
---[ end trace a93c18a4ea079033 ]---
RIP: 0010:fletcher_4_avx512f_native+0x3e/0xa0 [zcommon]
zfs-0.8.3-1ubuntu12.14
zfs-kmod-0.8.3-1ubuntu12.14
BUG: unable to handle page fault for address: ffffffff40b8d4ae
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 44140e067 P4D 44140e067 PUD 0
Oops: 0010 [#1] SMP NOPTI
CPU: 9 PID: 3102353 Comm: postgres Tainted: P O 5.4.0-120-generic #136-Ubuntu
Hardware name: ASUSTeK COMPUTER INC. 3105-S21031001/P11C-M-10G-2T Series, BIOS 3105-S21031001 03/10/2021
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f
84 00 01
RSP: 0018:ffffa2c30c043850 EFLAGS: 00010297
RAX: ffffa2c32d9fb5f5 RBX: ffffa2c37ddd9bab RCX: ffffa2c37ddd9ba3
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa2c37ddd9ba8
RBP: ffffa2c30c043878 R08: ffffa2c32d9fb5d0 R09: 0000000000000001
R10: ffffa2c32da03abc R11: ffffa2c37ddf1ff8 R12: ffffa2c37ddd2000
R13: ffffa2c37ddf2000 R14: ffffa2c32da03ab4 R15: ffffa2c37ddf1ff7
FS: 00007f78618eaec0(0000) GS:ffff8b2e6ea40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff40b8d4ae CR3: 0000000b1efc4006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lz4_decompress_zfs+0x2b/0x30 [zfs]
zio_decompress_data_buf+0x3c/0x70 [zfs]
zio_decompress_data+0x49/0xb0 [zfs]
arc_buf_fill+0x259/0x4a0 [zfs]
arc_buf_alloc_impl.isra.0+0x161/0x250 [zfs]
arc_read+0xea8/0x1100 [zfs]
? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs]
? atomic_dec_return+0x9/0x10 [zfs]
dbuf_read_impl+0x1d2/0x610 [zfs]
dbuf_read+0xcb/0x5f0 [zfs]
dmu_buf_hold_array_by_dnode+0x113/0x490 [zfs]
dmu_read_uio_dnode+0x49/0xf0 [zfs]
? zrl_add_impl+0x31/0xb0 [zfs]
dmu_read_uio_dbuf+0x47/0x60 [zfs]
zfs_read+0x117/0x300 [zfs]
zpl_read_common_iovec+0x99/0xe0 [zfs]
zpl_iter_read_common+0xa8/0x100 [zfs]
zpl_iter_read+0x58/0x80 [zfs]
new_sync_read+0x122/0x1b0
__vfs_read+0x29/0x40
vfs_read+0xab/0x160
ksys_pread64+0x6d/0xa0
__x64_sys_pread64+0x1e/0x20
do_syscall_64+0x57/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f7865170c2a
Code: ff ff ff eb c5 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007fffdc6b0088 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
RAX: ffffffffffffffda RBX: 00007f7827242080 RCX: 00007f7865170c2a
RDX: 0000000000002000 RSI: 00007f7827242080 RDI: 0000000000000012
RBP: 00007fffdc6b00d0 R08: 000000000a00000d R09: 0000000000000000
R10: 000000000002c000 R11: 0000000000000246 R12: 00007f785c641fb0
R13: 000000000002c000 R14: 0000000000002000 R15: 0000000000000001
Modules linked in: ufs msdos xfs xt_recent zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif kvm_intel kvm joydev input_leds ipmi_si video acpi_tad sch_fq_codel nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear ast i2c_algo_bit drm_vram_helper ttm hid_generic drm_kms_helper raid1 ixgbe syscopyarea crct10dif_pclmul sysfillrect crc32_pclmul sysimgblt usbhid ghash_clmulni_intel fb_sys_fops aesni_intel uas crypto_simd xfrm_algo cryptd drm hid nvme glue_helper ahci dca usb_storage
mdio libahci nvme_core
CR2: ffffffff40b8d4ae
---[ end trace aa592810eb96689b ]---
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f 84 00 01
RSP: 0018:ffffa2c30c043850 EFLAGS: 00010297
RAX: ffffa2c32d9fb5f5 RBX: ffffa2c37ddd9bab RCX: ffffa2c37ddd9ba3
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa2c37ddd9ba8
RBP: ffffa2c30c043878 R08: ffffa2c32d9fb5d0 R09: 0000000000000001
R10: ffffa2c32da03abc R11: ffffa2c37ddf1ff8 R12: ffffa2c37ddd2000
R13: ffffa2c37ddf2000 R14: ffffa2c32da03ab4 R15: ffffa2c37ddf1ff7
FS: 00007f78618eaec0(0000) GS:ffff8b2e6ea40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff40b8d4ae CR3: 0000000b1efc4006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
In the end, I don't know how / if these are related to #11679. We use an encrypted dataset, but we do not use the zfs send / receive feature.
Although these crashes are uncommon, the increased number of machines involved make it look appear more and more frequently.
zfs-0.8.3-1ubuntu12.14
zfs-kmod-0.8.3-1ubuntu12.14
BUG: unable to handle page fault for address: ffffffff001e0426
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 53900e067 P4D 53900e067 PUD 0
Oops: 0010 [#1] SMP NOPTI
CPU: 13 PID: 301321 Comm: postgres Tainted: P O 5.4.0-120-generic #136-Ubuntu
Hardware name: ASUSTeK COMPUTER INC. 3105-S21031001/P11C-M-10G-2T Series, BIOS 3105-S21031001 03/10/2021
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
Code: 57 41 56 41 55 4c 8d 2c 0e 41 54 53 4c 39 d7 0f 83 32 02 00 00 48 89 f8 49 89 f4 4d 8d 5d f8 48 89 f7 4d 8d 72 f8 4d 8d 7d f7 <4c> 8d 40 01 0f b6 00 41 89 c1 c1 e8 04 89 c2 83 f8 0f 0f
84 00 01
RSP: 0018:ffffb68e8c153850 EFLAGS: 00010297
RAX: ffffb68ef47dd697 RBX: ffffb68f0051df6c RCX: ffffb68f0051df64
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffb68f0051df68
RBP: ffffb68e8c153878 R08: ffffb68ef47dd692 R09: 0000000000000000
R10: ffffb68ef47e67eb R11: ffffb68f00535ff8 R12: ffffb68f00516000
R13: ffffb68f00536000 R14: ffffb68ef47e67e3 R15: ffffb68f00535ff7
FS: 00007fadf68caec0(0000) GS:ffff933aeeb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff001e0426 CR3: 0000000465702001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lz4_decompress_zfs+0x2b/0x30 [zfs]
zio_decompress_data_buf+0x3c/0x70 [zfs]
zio_decompress_data+0x49/0xb0 [zfs]
arc_buf_fill+0x259/0x4a0 [zfs]
arc_buf_alloc_impl.isra.0+0x161/0x250 [zfs]
arc_read+0xea8/0x1100 [zfs]
? dbuf_rele_and_unlock+0x4e0/0x4e0 [zfs]
? atomic_dec_return+0x9/0x10 [zfs]
dbuf_read_impl+0x1d2/0x610 [zfs]
dbuf_read+0xcb/0x5f0 [zfs]
dmu_buf_hold_array_by_dnode+0x113/0x490 [zfs]
dmu_read_uio_dnode+0x49/0xf0 [zfs]
? zrl_add_impl+0x31/0xb0 [zfs]
dmu_read_uio_dbuf+0x47/0x60 [zfs]
zfs_read+0x117/0x300 [zfs]
zpl_read_common_iovec+0x99/0xe0 [zfs]
zpl_iter_read_common+0xa8/0x100 [zfs]
zpl_iter_read+0x58/0x80 [zfs]
new_sync_read+0x122/0x1b0
__vfs_read+0x29/0x40
vfs_read+0xab/0x160
ksys_pread64+0x6d/0xa0
__x64_sys_pread64+0x1e/0x20
do_syscall_64+0x57/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fadfa150bca
Code: ff ff ff eb c5 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffe4bdc6868 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
RAX: ffffffffffffffda RBX: 00007fadeaf22080 RCX: 00007fadfa150bca
RDX: 0000000000002000 RSI: 00007fadeaf22080 RDI: 0000000000000011
RBP: 00007ffe4bdc68b0 R08: 000000000a00000d R09: 0000000000000000
R10: 0000000000030000 R11: 0000000000000246 R12: 00007fadf164e7e0
R13: 0000000000030000 R14: 0000000000002000 R15: 0000000000000001
Modules linked in: ufs msdos xfs xt_multiport isofs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) ipmi_ssif zcommon(PO) znvpair(PO) spl(O) kvm_intel kvm joydev input_leds ipmi_si video acpi_tad sch_fq_codel nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear ast i2c_algo_bit drm_vram_helper crct10dif_pclmul ttm crc32_pclmul ghash_clmulni_intel hid_generic raid1 ixgbe drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt usbhid uas xfrm_algo fb_sys_fops crypto_simd cryptd usb_storage hid ahci drm glue_helper dca nvme
mdio libahci nvme_core
CR2: ffffffff001e0426
---[ end trace 0be7c89bce6193b6 ]---
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x3e/0x270 [zfs]
This one occurs the most frequently. Only reference I can find is https://github.com/openzfs/zfs/issues/10401
zfs-2.1.4-0ubuntu0.1
zfs-kmod-2.1.4-0ubuntu0.1
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 15 PID: 15851 Comm: z_wr_int_1 Tainted: P O 5.15.0-48-generic #54-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
RIP: 0010:abd_is_gang+0x0/0x10 [zfs]
Code: 00 00 c3 cc cc cc cc 0f 1f 00 8b 07 83 e0 01 c3 cc cc cc cc 66 0f 1f 44 00 00 8b 07 c1 e8 05 83 e0 01 c3 cc cc cc cc 0f 1f 00 <8b> 07 c1 e8 06 83 e0 01 c3 cc cc cc cc 0f 1f 00 0f 1f
44 00 00 55
RSP: 0018:ffffa17b821cfb80 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 0000000000004000 RCX: ffffffffc0bb0470
RDX: 0000000000004000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffa17b821cfc00 R08: ffffa17b821cfc10 R09: 0000000000000002
R10: 0000000000000001 R11: ffff8bec5d21cda8 R12: 0000000000000000
R13: ffff8be3c5834000 R14: ffffa17b821cfc10 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8bf25fdc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000006f0810005 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
? abd_iterate_func+0x6b/0x100 [zfs]
? abd_advance_abd_iter+0x90/0x90 [zfs]
abd_copy_to_buf_off+0x3c/0x60 [zfs]
abd_copy_to_buf+0xe/0x20 [zfs]
arc_buf_fill+0x1a6/0x4d0 [zfs]
arc_buf_alloc_impl+0x1a2/0x230 [zfs]
arc_read_done+0x160/0x540 [zfs]
zio_done+0x3aa/0xe60 [zfs]
zio_execute+0x8f/0xe0 [zfs]
taskq_thread+0x21e/0x400 [spl]
? wake_up_q+0x90/0x90
? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
? param_set_taskq_kick+0xf0/0xf0 [spl]
kthread+0x127/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: xt_nat veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay tls zfs(PO) zunicode(PO) zzstd(O) zlua(O) zav
l(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) isofs ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_def
rag_ipv4 nft_compat nft_counter nf_tables nfnetlink nls_iso8859_1 ipmi_ssif kvm_intel kvm joydev input_leds acpi_ipmi ipmi_si video acpi_tad dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel ipmi_devintf ipmi_msghandler msr
efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear ast i2c_algo_bit drm_vram_helper hid_generic
drm_ttm_helper ttm drm_kms_helper usbhid crct10dif_pclmul syscopyarea crc32_pclmul
raid1 sysfillrect ghash_clmulni_intel hid sysimgblt fb_sys_fops aesni_intel cec crypto_simd nvme rc_core ahci xhci_pci drm cryptd i40e nvme_core libahci xhci_pci_renesas
CR2: 0000000000000000
---[ end trace 3b74dfdcc49b02ac ]---
RIP: 0010:abd_is_gang+0x0/0x10 [zfs]
This one didn't cause the system to hang, apparently led to a corrupted file. Occurrences: 3
zfs-2.1.4-0ubuntu0.1
zfs-kmod-2.1.4-0ubuntu0.1
VERIFY3(0 == remove_reference(hdr, NULL, tag)) failed (0 == 2)
VERIFY3(0 == remove_reference(hdr, NULL, tag)) failed (0 == 1)
PANIC at arc.c:3844:arc_buf_destroy()
PANIC at arc.c:3844:arc_buf_destroy()
Showing stack for process 1162989
CPU: 3 PID: 1162989 Comm: python3 Tainted: P O 5.15.0-48-generic #54-Ubuntu
Showing stack for process 11026
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
Call Trace:
<TASK>
show_stack+0x52/0x5c
dump_stack_lvl+0x4a/0x63
dump_stack+0x10/0x16
spl_dumpstack+0x29/0x2f [spl]
spl_panic+0xd1/0xe9 [spl]
? zfs_ereport_post+0xd7/0x1a0 [zfs]
? arc_untransform+0x74/0x90 [zfs]
? spl_kmem_cache_free+0xcd/0x140 [spl]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? aggsum_add+0x1b8/0x1d0 [zfs]
arc_buf_destroy+0xed/0xf0 [zfs]
dbuf_destroy+0x31/0x3d0 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
dbuf_rele_and_unlock+0x98/0x540 [zfs]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? __cond_resched+0x1a/0x50
? down_read+0x13/0xa0
? wmsum_add+0xe/0x20 [zfs]
? dbuf_read+0x1e6/0x5b0 [zfs]
dnode_rele_and_unlock+0xd2/0xf0 [zfs]
dnode_rele+0x39/0x50 [zfs]
dmu_bonus_hold+0x66/0x90 [zfs]
sa_buf_hold+0xe/0x20 [zfs]
zfs_zget+0x5b/0x280 [zfs]
? zfs_match_find.constprop.0+0x76/0x100 [zfs]
zfs_dirent_lock+0x424/0x570 [zfs]
zfs_dirlook+0x91/0x2d0 [zfs]
zfs_lookup+0x25e/0x410 [zfs]
zpl_lookup+0xcf/0x230 [zfs]
? get_acl+0x18/0x50
__lookup_slow+0x85/0x150
walk_component+0x145/0x1c0
? path_init+0x2c0/0x3f0
path_lookupat+0x6e/0x1c0
? avl_find+0x68/0xd0 [zavl]
filename_lookup+0xcf/0x1d0
? __check_object_size+0x1d/0x30
? strncpy_from_user+0x44/0x150
? getname_flags.part.0+0x4c/0x1b0
user_path_at_empty+0x3f/0x60
vfs_statx+0x7a/0x130
__do_sys_newfstatat+0x36/0x70
? __secure_computing+0xa9/0x120
? syscall_trace_enter.constprop.0+0xa7/0x1c0
__x64_sys_newfstatat+0x1c/0x30
do_syscall_64+0x59/0xc0
? exit_to_user_mode_loop+0x10d/0x160
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? __x64_sys_close+0x11/0x50
? do_syscall_64+0x69/0xc0
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? do_syscall_64+0x69/0xc0
? exc_page_fault+0x89/0x170
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f1937dfeeee
Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 09
RSP: 002b:00007fff14cf79a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
RAX: ffffffffffffffda RBX: 000055e201b98a40 RCX: 00007f1937dfeeee
RDX: 00007fff14cf7a40 RSI: 00007f1937015860 RDI: 00000000ffffff9c
RBP: 000055e201e2c340 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff14cf7a40
R13: 00007f1937c87f10 R14: 00000000ffffff9c R15: 000055e201b98bf0
</TASK>
CPU: 2 PID: 11026 Comm: z_wr_int_2 Tainted: P O 5.15.0-48-generic #54-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
Call Trace:
<TASK>
show_stack+0x52/0x5c
dump_stack_lvl+0x4a/0x63
dump_stack+0x10/0x16
spl_dumpstack+0x29/0x2f [spl]
spl_panic+0xd1/0xe9 [spl]
? spl_kmem_cache_free+0xcd/0x140 [spl]
? kmem_cache_free+0x26c/0x290
? percpu_counter_add+0xf/0x20 [spl]
? percpu_counter_dec+0x10/0x20 [spl]
? spl_kmem_free_impl+0x29/0x40 [spl]
arc_buf_destroy+0xed/0xf0 [zfs]
dbuf_issue_final_prefetch_done+0x30/0x40 [zfs]
arc_read_done+0x247/0x540 [zfs]
zio_done+0x3aa/0xe60 [zfs]
zio_execute+0x8f/0xe0 [zfs]
taskq_thread+0x21e/0x400 [spl]
? wake_up_q+0x90/0x90
? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
? param_set_taskq_kick+0xf0/0xf0 [spl]
kthread+0x127/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
Few seconds later:
INFO: task z_wr_int_2:11026 blocked for more than 241 seconds.
Tainted: P O 5.15.0-48-generic #54-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:z_wr_int_2 state:D stack: 0 pid:11026 ppid: 2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x23d/0x5a0
schedule+0x4e/0xc0
spl_panic+0xe7/0xe9 [spl]
? spl_kmem_cache_free+0xcd/0x140 [spl]
? kmem_cache_free+0x26c/0x290
? percpu_counter_add+0xf/0x20 [spl]
? percpu_counter_dec+0x10/0x20 [spl]
? spl_kmem_free_impl+0x29/0x40 [spl]
arc_buf_destroy+0xed/0xf0 [zfs]
dbuf_issue_final_prefetch_done+0x30/0x40 [zfs]
arc_read_done+0x247/0x540 [zfs]
zio_done+0x3aa/0xe60 [zfs]
zio_execute+0x8f/0xe0 [zfs]
taskq_thread+0x21e/0x400 [spl]
? wake_up_q+0x90/0x90
? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
? param_set_taskq_kick+0xf0/0xf0 [spl]
kthread+0x127/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
INFO: task python3:1162989 blocked for more than 241 seconds.
Tainted: P O 5.15.0-48-generic #54-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:python3 state:D stack: 0 pid:1162989 ppid:1162987 flags:0x00004222
Call Trace:
<TASK>
__schedule+0x23d/0x5a0
schedule+0x4e/0xc0
spl_panic+0xe7/0xe9 [spl]
? zfs_ereport_post+0xd7/0x1a0 [zfs]
? arc_untransform+0x74/0x90 [zfs]
? spl_kmem_cache_free+0xcd/0x140 [spl]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? aggsum_add+0x1b8/0x1d0 [zfs]
arc_buf_destroy+0xed/0xf0 [zfs]
dbuf_destroy+0x31/0x3d0 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
dbuf_rele_and_unlock+0x98/0x540 [zfs]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? __cond_resched+0x1a/0x50
? down_read+0x13/0xa0
? wmsum_add+0xe/0x20 [zfs]
? dbuf_read+0x1e6/0x5b0 [zfs]
dnode_rele_and_unlock+0xd2/0xf0 [zfs]
dnode_rele+0x39/0x50 [zfs]
dmu_bonus_hold+0x66/0x90 [zfs]
sa_buf_hold+0xe/0x20 [zfs]
zfs_zget+0x5b/0x280 [zfs]
? zfs_match_find.constprop.0+0x76/0x100 [zfs]
zfs_dirent_lock+0x424/0x570 [zfs]
zfs_dirlook+0x91/0x2d0 [zfs]
zfs_lookup+0x25e/0x410 [zfs]
zpl_lookup+0xcf/0x230 [zfs]
? get_acl+0x18/0x50
__lookup_slow+0x85/0x150
walk_component+0x145/0x1c0
? path_init+0x2c0/0x3f0
path_lookupat+0x6e/0x1c0
? avl_find+0x68/0xd0 [zavl]
filename_lookup+0xcf/0x1d0
? __check_object_size+0x1d/0x30
? strncpy_from_user+0x44/0x150
? getname_flags.part.0+0x4c/0x1b0
user_path_at_empty+0x3f/0x60
vfs_statx+0x7a/0x130
__do_sys_newfstatat+0x36/0x70
? __secure_computing+0xa9/0x120
? syscall_trace_enter.constprop.0+0xa7/0x1c0
__x64_sys_newfstatat+0x1c/0x30
do_syscall_64+0x59/0xc0
? exit_to_user_mode_loop+0x10d/0x160
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? __x64_sys_close+0x11/0x50
? do_syscall_64+0x69/0xc0
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? do_syscall_64+0x69/0xc0
? exc_page_fault+0x89/0x170
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f1937dfeeee
RSP: 002b:00007fff14cf79a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
RAX: ffffffffffffffda RBX: 000055e201b98a40 RCX: 00007f1937dfeeee
RDX: 00007fff14cf7a40 RSI: 00007f1937015860 RDI: 00000000ffffff9c
RBP: 000055e201e2c340 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff14cf7a40
R13: 00007f1937c87f10 R14: 00000000ffffff9c R15: 000055e201b98bf0
</TASK>
Occurrences: 1
zfs-2.1.4-0ubuntu0.1
zfs-kmod-2.1.4-0ubuntu0.1
RSP: 0018:ffffb551cf997528 EFLAGS: 00010282
RAX: e224efda58e958d5 RBX: 0000000000004000 RCX: e224efda58e978d5
RDX: 0000000000363148 RSI: 0000000000042c20 RDI: 000039e21fc12140
RBP: ffffb551cf997568 R08: ffffd551bf6d2140 R09: 0000000000000002
R10: ffffb551cf997738 R11: 0000000000000000 R12: ffff9b608493e000
R13: 0000000000000000 R14: 0000000000042c20 R15: 0000000000042c20
FS: 00007f6314e8b740(0000) GS:ffff9b6f9fac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f630f8a5024 CR3: 000000041646a004 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
? spl_kmem_cache_alloc+0x3b/0x100 [spl]
spl_kmem_cache_alloc+0x3b/0x100 [spl]
zio_buf_alloc+0x30/0x80 [zfs]
arc_get_data_buf+0x48/0x60 [zfs]
arc_buf_alloc_impl+0xc9/0x230 [zfs]
arc_read+0x106b/0x15c0 [zfs]
? dbuf_rele_and_unlock+0x540/0x540 [zfs]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
dbuf_read_impl.constprop.0+0x2f2/0x490 [zfs]
dbuf_read+0x1ba/0x5b0 [zfs]
dnode_hold_impl+0x113/0xb00 [zfs]
? __cond_resched+0x1a/0x50
? mutex_lock+0x13/0x50
? zfs_znode_hold_enter+0x123/0x170 [zfs]
dnode_hold+0x1b/0x30 [zfs]
dmu_bonus_hold+0x38/0x90 [zfs]
sa_buf_hold+0xe/0x20 [zfs]
zfs_zget+0x5b/0x280 [zfs]
? zfs_match_find.constprop.0+0x76/0x100 [zfs]
zfs_dirent_lock+0x424/0x570 [zfs]
zfs_dirlook+0x91/0x2d0 [zfs]
zfs_lookup+0x25e/0x410 [zfs]
zpl_lookup+0xcf/0x230 [zfs]
lookup_open.isra.0+0x2c1/0x6f0
open_last_lookups+0x198/0x3f0
? path_init+0x2c0/0x3f0
path_openat+0x8d/0x2b0
? lru_cache_add_inactive_or_unevictable+0x2e/0xe0
? page_add_new_anon_rmap+0x69/0x100
do_filp_open+0xb2/0x160
? __check_object_size+0x1d/0x30
? alloc_fd+0x53/0x180
do_sys_openat2+0x9f/0x160
__x64_sys_openat+0x55/0x90
do_syscall_64+0x59/0xc0
? irqentry_exit+0x1d/0x30
? exc_page_fault+0x89/0x170
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f631516a4fc
Code: 24 18 31 c0 41 83 e2 40 75 44 89 f0 25 00 00 41 00 3d 00 00 41 00 74 36 44 89 c2 4c 89 ce bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 44 48 8b 4c 24 18 64 48 33 0c 25 28 00 00 00
RSP: 002b:00007ffcc5ba4050 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 00007f6314db6770 RCX: 00007f631516a4fc
RDX: 0000000000090800 RSI: 00007f630f8a5ad0 RDI: 00000000ffffff9c
RBP: 00000000018092f0 R08: 0000000000090800 R09: 00007f630f8a5ad0
R10: 0000000000000000 R11: 0000000000000287 R12: 00007f630f8a3500
R13: 00007f630f8a5ad0 R14: 0000000002304db0 R15: 00007f6314db6770
</TASK>
Modules linked in: xt_nat veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay tls zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) isofs ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter nf_tables nfnetlink nls_iso8859_1 ipmi_ssif kvm_intel kvm joydev input_leds acpi_ipmi ipmi_si video acpi_tad dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel ipmi_devintf ipmi_msghandler msr efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops crct10dif_pclmul crc32_pclmul cec ghash_clmulni_intel raid1 rc_core aesni_intel nvme ahci crypto_simd xhci_pci i40e cryptd drm libahci nvme_core xhci_pci_renesas
---[ end trace 8be0eb0694452b25 ]---
RIP: 0010:kmem_cache_alloc+0xfd/0x2f0
Code: 8b 50 08 49 8b 00 49 83 78 10 00 48 89 45 c8 0f 84 96 01 00 00 48 85 c0 0f 84 8d 01 00 00 41 8b 4c 24 28 49 8b 3c 24 48 01 c1 <48> 8b 19 48 89 ce 49 33 9c 24 b8 00 00 00 48 8d 4a 01 48 0f ce 48
RSP: 0018:ffffb551e7303770 EFLAGS: 00010282
RAX: e224efda58e958d5 RBX: 0000000000004000 RCX: e224efda58e978d5
RDX: 0000000000363148 RSI: 0000000000042c20 RDI: 000039e21fc12140
RBP: ffffb551e73037b0 R08: ffffd551bf6d2140 R09: 0000000000000001
R10: ffff9b67b825d420 R11: 0000000000000000 R12: ffff9b608493e000
R13: 0000000000000000 R14: 0000000000042c20 R15: 0000000000042c20
FS: 00007fab7b881a40(0000) GS:ffff9b6f9fac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fab39f27000 CR3: 0000000a1bc12003 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
? spl_kmem_cache_alloc+0x3b/0x100 [spl]
spl_kmem_cache_alloc+0x3b/0x100 [spl]
zio_data_buf_alloc+0x30/0x80 [zfs]
arc_get_data_buf+0x38/0x60 [zfs]
arc_buf_alloc_impl+0xc9/0x230 [zfs]
arc_alloc_buf+0x74/0xe0 [zfs]
dbuf_fix_old_data.constprop.0+0x289/0x2a0 [zfs]
? spl_kmem_alloc_impl+0x80/0xd0 [spl]
dbuf_dirty+0x5fe/0x6e0 [zfs]
? dbuf_read+0x366/0x5b0 [zfs]
dmu_buf_will_dirty_impl+0xbb/0x120 [zfs]
dmu_buf_will_dirty+0x16/0x20 [zfs]
dbuf_new_size+0x64/0x1a0 [zfs]
dnode_set_blksz+0x2a5/0x2d0 [zfs]
dmu_object_set_blocksize+0x56/0xa0 [zfs]
zfs_grow_blocksize+0x5a/0xa0 [zfs]
zfs_write+0x8f7/0xdb0 [zfs]
? shmem_getpage_gfp+0xcf/0x860
? generic_write_checks+0x65/0xc0
zpl_iter_write+0xe7/0x130 [zfs]
new_sync_write+0x114/0x1b0
vfs_write+0x1d5/0x270
__x64_sys_pwrite64+0x96/0xc0
do_syscall_64+0x59/0xc0
? handle_mm_fault+0xd8/0x2c0
? do_user_addr_fault+0x1e7/0x670
? exit_to_user_mode_prepare+0x37/0xb0
? irqentry_exit_to_user_mode+0x9/0x20
? irqentry_exit+0x1d/0x30
? exc_page_fault+0x89/0x170
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7fab7e2429aa
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb ba 0f 1f 00 f3 0f 1e fa 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 12 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffdee58bc48 EFLAGS: 00000246 ORIG_RAX: 0000000000000012
RAX: ffffffffffffffda RBX: 0000558b75fcfb68 RCX: 00007fab7e2429aa
RDX: 0000000000002000 RSI: 00007fab39f26200 RDI: 00000000000000fa
RBP: 00007ffdee58bca0 R08: 000000000a00000b R09: 0000558b75fccde0
R10: 0000000000004000 R11: 0000000000000246 R12: 0000000000004000
R13: 00007fab39f26200 R14: 00007fab7b881940 R15: 0000000000002000
</TASK>
Modules linked in: xt_nat veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay tls zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) isofs ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter nf_tables nfnetlink nls_iso8859_1 ipmi_ssif kvm_intel kvm joydev input_leds acpi_ipmi ipmi_si video acpi_tad dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel ipmi_devintf ipmi_msghandler msr efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops
crct10dif_pclmul crc32_pclmul cec ghash_clmulni_intel raid1 rc_core aesni_intel nvme ahci crypto_simd xhci_pci i40e cryptd drm libahci nvme_core xhci_pci_renesas
---[ end trace 8be0eb0694452b26 ]---
RIP: 0010:kmem_cache_alloc+0xfd/0x2f0
Code: 8b 50 08 49 8b 00 49 83 78 10 00 48 89 45 c8 0f 84 96 01 00 00 48 85 c0 0f 84 8d 01 00 00 41 8b 4c 24 28 49 8b 3c 24 48 01 c1 <48> 8b 19 48 89 ce 49 33 9c 24 b8 00 00 00 48 8d 4a 01 48 0f ce 48
RSP: 0018:ffffb551cf997528 EFLAGS: 00010282
RAX: e224efda58e958d5 RBX: 0000000000004000 RCX: e224efda58e978d5
RDX: 0000000000363148 RSI: 0000000000042c20 RDI: 000039e21fc12140
RBP: ffffb551cf997568 R08: ffffd551bf6d2140 R09: 0000000000000002
R10: ffffb551cf997738 R11: 0000000000000000 R12: ffff9b608493e000
R13: 0000000000000000 R14: 0000000000042c20 R15: 0000000000042c20
FS: 00007fab7b881a40(0000) GS:ffff9b6f9fac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fab39f27000 CR3: 0000000a1bc12003 CR4: 0000000000770ee0
PKRU: 55555554
general protection fault, probably for non-canonical address 0xe224efda58e978d5: 0000 [#3] SMP NOPTI
CPU: 3 PID: 55587 Comm: python3 Tainted: P D O 5.15.0-48-generic #54-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
RIP: 0010:kmem_cache_alloc+0xfd/0x2f0
Code: 8b 50 08 49 8b 00 49 83 78 10 00 48 89 45 c8 0f 84 96 01 00 00 48 85 c0 0f 84 8d 01 00 00 41 8b 4c 24 28 49 8b 3c 24 48 01 c1 <48> 8b 19 48 89 ce 49 33 9c 24 b8 00 00 00 48 8d 4a 01 48 0f ce 48
RSP: 0018:ffffb551c2993420 EFLAGS: 00010282
RAX: e224efda58e958d5 RBX: 0000000000004000 RCX: e224efda58e978d5
RDX: 0000000000363148 RSI: 0000000000042c20 RDI: 000039e21fc12140
RBP: ffffb551c2993460 R08: ffffd551bf6d2140 R09: 0000000000000002
R10: ffffb551c2993630 R11: 0000000000000000 R12: ffff9b608493e000
R13: 0000000000000000 R14: 0000000000042c20 R15: 0000000000042c20
FS: 00007fc8c09ff640(0000) GS:ffff9b6f9fac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fab7e1b75a0 CR3: 00000003670d6001 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
? spl_kmem_cache_alloc+0x3b/0x100 [spl]
spl_kmem_cache_alloc+0x3b/0x100 [spl]
zio_buf_alloc+0x30/0x80 [zfs]
arc_get_data_buf+0x48/0x60 [zfs]
arc_buf_alloc_impl+0xc9/0x230 [zfs]
arc_read+0x106b/0x15c0 [zfs]
? dbuf_rele_and_unlock+0x540/0x540 [zfs]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
dbuf_read_impl.constprop.0+0x2f2/0x490 [zfs]
dbuf_read+0x1ba/0x5b0 [zfs]
dnode_hold_impl+0x113/0xb00 [zfs]
? __cond_resched+0x1a/0x50
? mutex_lock+0x13/0x50
? zfs_znode_hold_enter+0x123/0x170 [zfs]
dnode_hold+0x1b/0x30 [zfs]
dmu_bonus_hold+0x38/0x90 [zfs]
sa_buf_hold+0xe/0x20 [zfs]
zfs_zget+0x5b/0x280 [zfs]
? zfs_match_find.constprop.0+0x76/0x100 [zfs]
zfs_dirent_lock+0x424/0x570 [zfs]
zfs_dirlook+0x91/0x2d0 [zfs]
zfs_lookup+0x25e/0x410 [zfs]
zpl_lookup+0xcf/0x230 [zfs]
__lookup_slow+0x85/0x150
walk_component+0x145/0x1c0
link_path_walk.part.0.constprop.0+0x23f/0x3a0
? path_init+0x2c0/0x3f0
path_lookupat+0x3e/0x1c0
filename_lookup+0xcf/0x1d0
? __check_object_size+0x1d/0x30
? strncpy_from_user+0x44/0x150
? getname_flags.part.0+0x4c/0x1b0
user_path_at_empty+0x3f/0x60
vfs_statx+0x7a/0x130
__do_sys_newfstatat+0x36/0x70
? __secure_computing+0xa9/0x120
? syscall_trace_enter.constprop.0+0xa7/0x1c0
__x64_sys_newfstatat+0x1c/0x30
do_syscall_64+0x59/0xc0
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? __x64_sys_write+0x19/0x20
? do_syscall_64+0x69/0xc0
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7fc8cbf13eee
Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00
48 8b 15 09
RSP: 002b:00007fc8c09fbdb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
RAX: ffffffffffffffda RBX: 00005588a8277a40 RCX: 00007fc8cbf13eee
RDX: 00007fc8c09fbe50 RSI: 00007fc8baf3dc70 RDI: 00000000ffffff9c
RBP: 00005588a9e1c430 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc8c09fbe50
R13: 00007fc8cbd9bfb0 R14: 00000000ffffff9c R15: 00005588a8277bf0
</TASK>
Modules linked in: xt_nat veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay tls zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) isofs ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter nf_tables nfnetlink nls_iso8859_1 ipmi_ssif kvm_intel kvm joydev input_leds acpi_ipmi ipmi_si video acpi_tad dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel ipmi_devintf ipmi_msghandler msr efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear hid_generic usbhid hid ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops crct10dif_pclmul crc32_pclmul cec ghash_clmulni_intel raid1 rc_core aesni_intel nvme ahci crypto_simd xhci_pci i40e cryptd drm libahci nvme_core xhci_pci_renesas
---[ end trace 8be0eb0694452b27 ]---
RIP: 0010:kmem_cache_alloc+0xfd/0x2f0
Code: 8b 50 08 49 8b 00 49 83 78 10 00 48 89 45 c8 0f 84 96 01 00 00 48 85 c0 0f 84 8d 01 00 00 41 8b 4c 24 28 49 8b 3c 24 48 01 c1 <48> 8b 19 48 89 ce 49 33 9c 24 b8 00 00 00 48 8d 4a 01 48 0f ce 48
RSP: 0018:ffffb551cf997528 EFLAGS: 00010282
RAX: e224efda58e958d5 RBX: 0000000000004000 RCX: e224efda58e978d5
RDX: 0000000000363148 RSI: 0000000000042c20 RDI: 000039e21fc12140
RBP: ffffb551cf997568 R08: ffffd551bf6d2140 R09: 0000000000000002
R10: ffffb551cf997738 R11: 0000000000000000 R12: ffff9b608493e000
R13: 0000000000000000 R14: 0000000000042c20 R15: 0000000000042c20
FS: 00007fc8c09ff640(0000) GS:ffff9b6f9fac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fab7e1b75a0 CR3: 00000003670d6001 CR4: 0000000000770ee0
PKRU: 55555554
Moreover:
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
Happens more and more frequently since layout change.
zfs-0.8.3-1ubuntu12.12
zfs-kmod-0.8.3-1ubuntu12.14
INFO: task spl_delay_taskq:936 blocked for more than 120 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 936 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
? enqueue_entity+0x128/0x660
schedule+0x42/0xb0
schedule_timeout+0x10e/0x160
wait_for_completion+0xb1/0x120
? wake_up_q+0x70/0x70
call_usermodehelper_exec+0x148/0x180
call_usermodehelper+0x93/0xb0
zfsctl_snapshot_unmount+0xc8/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? __switch_to_asm+0x40/0x70
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task saas_exporter:381942 blocked for more than 121 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
saas_exporter D 0 381942 381939 0x00004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
taskq_wait_id+0x6e/0xb0 [spl]
? __wake_up_pollfree+0x40/0x40
taskq_cancel_id+0xd8/0x100 [spl]
zfsctl_snapshot_unmount_cancel+0x1e/0x40 [zfs]
zfsctl_snapshot_unmount_delay+0x48/0xa0 [zfs]
zpl_revalidate+0xbd/0x120 [zfs]
lookup_fast+0x281/0x300
walk_component+0x48/0x360
? link_path_walk.part.0+0x2a2/0x550
path_lookupat.isra.0+0x80/0x230
? atomic_dec+0xd/0x20 [spl]
filename_lookup+0xae/0x170
? __check_object_size+0x13f/0x150
? strncpy_from_user+0x4c/0x150
user_path_at_empty+0x3a/0x50
vfs_statx+0x7d/0xe0
__do_sys_newlstat+0x3e/0x80
__x64_sys_newlstat+0x16/0x20
do_syscall_64+0x57/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f0801bb45aa
Code: Bad RIP value.
RSP: 002b:00007ffd1e07bf78 EFLAGS: 00000246 ORIG_RAX: 0000000000000006
RAX: ffffffffffffffda RBX: 00007f07fed6b7c0 RCX: 00007f0801bb45aa
RDX: 00007ffd1e07bff0 RSI: 00007ffd1e07bff0 RDI: 00007f07fd54b110
RBP: 00000000ffffff9c R08: 0000000000000001 R09: 0000000000000038
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000001a71600
R13: 00007ffd1e07bff0 R14: 00007f07fed6b7c0 R15: 0000000001d421f8
INFO: task umount:381977 blocked for more than 121 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount D 0 381977 381866 0x00004000
Call Trace:
__schedule+0x2e3/0x740
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
schedule+0x42/0xb0
rwsem_down_write_slowpath+0x244/0x4d0
? __radix_tree_delete+0x8f/0xf0
down_write+0x41/0x50
zfsctl_destroy+0x5a/0xc0 [zfs]
zfs_preumount+0x2f/0x70 [zfs]
zpl_kill_sb+0x13/0x20 [zfs]
deactivate_locked_super+0x3b/0x80
deactivate_super+0x3e/0x50
cleanup_mnt+0x109/0x160
__cleanup_mnt+0x12/0x20
task_work_run+0x8f/0xb0
exit_to_usermode_loop+0x131/0x160
do_syscall_64+0x163/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff72980d19b
Code: Bad RIP value.
RSP: 002b:00007ffdc55795b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007ff72993f204 RCX: 00007ff72980d19b
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055b3f873ae30
RBP: 000055b3f8720960 R08: 0000000000000000 R09: 00007ffdc5578360
R10: 00007ff72992b379 R11: 0000000000000246 R12: 000055b3f873ae30
R13: 0000000000000000 R14: 000055b3f8720a58 R15: 000055b3f8720b70
INFO: task spl_delay_taskq:381978 blocked for more than 121 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381978 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
schedule_timeout+0x10e/0x160
wait_for_completion+0xb1/0x120
? wake_up_q+0x70/0x70
call_usermodehelper_exec+0x148/0x180
call_usermodehelper+0x93/0xb0
zfsctl_snapshot_unmount+0xc8/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task spl_delay_taskq:381979 blocked for more than 121 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381979 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
rwsem_down_read_slowpath+0x16c/0x4a0
? autoremove_wake_function+0x12/0x40
down_read+0x85/0xa0
zfsctl_snapshot_unmount+0x72/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task spl_delay_taskq:381981 blocked for more than 122 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381981 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
schedule_timeout+0x10e/0x160
wait_for_completion+0xb1/0x120
? wake_up_q+0x70/0x70
call_usermodehelper_exec+0x148/0x180
call_usermodehelper+0x93/0xb0
zfsctl_snapshot_unmount+0xc8/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task spl_delay_taskq:381984 blocked for more than 122 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381984 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
? wait_for_completion+0xbc/0x120
? wake_up_q+0x70/0x70
schedule+0x42/0xb0
rwsem_down_read_slowpath+0x16c/0x4a0
down_read+0x85/0xa0
snapentry_expire+0x51/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task spl_delay_taskq:381986 blocked for more than 122 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381986 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
schedule_timeout+0x10e/0x160
wait_for_completion+0xb1/0x120
? wake_up_q+0x70/0x70
call_usermodehelper_exec+0x148/0x180
call_usermodehelper+0x93/0xb0
zfsctl_snapshot_unmount+0xc8/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? __switch_to_asm+0x40/0x70
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task spl_delay_taskq:381989 blocked for more than 122 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381989 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
schedule_timeout+0x10e/0x160
wait_for_completion+0xb1/0x120
? wake_up_q+0x70/0x70
call_usermodehelper_exec+0x148/0x180
call_usermodehelper+0x93/0xb0
zfsctl_snapshot_unmount+0xc8/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? __switch_to_asm+0x40/0x70
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
INFO: task spl_delay_taskq:381992 blocked for more than 123 seconds.
Tainted: P O 5.4.0-126-generic #142-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
spl_delay_taskq D 0 381992 2 0x80004000
Call Trace:
__schedule+0x2e3/0x740
schedule+0x42/0xb0
schedule_timeout+0x10e/0x160
wait_for_completion+0xb1/0x120
? wake_up_q+0x70/0x70
call_usermodehelper_exec+0x148/0x180
call_usermodehelper+0x93/0xb0
zfsctl_snapshot_unmount+0xc8/0x190 [zfs]
snapentry_expire+0x3d/0xc0 [zfs]
taskq_thread+0x245/0x430 [spl]
? __switch_to_asm+0x40/0x70
? wake_up_q+0x70/0x70
kthread+0x104/0x140
? task_done+0x90/0x90 [spl]
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40
Incident 10
This one didn't cause the system to hang, apparently led to a corrupted file.
zfs-2.1.4-0ubuntu0.1 zfs-kmod-2.1.4-0ubuntu0.1
VERIFY3(0 == remove_reference(hdr, NULL, tag)) failed (0 == 2) VERIFY3(0 == remove_reference(hdr, NULL, tag)) failed (0 == 1) PANIC at arc.c:3844:arc_buf_destroy() PANIC at arc.c:3844:arc_buf_destroy() Showing stack for process 1162989 CPU: 3 PID: 1162989 Comm: python3 Tainted: P O 5.15.0-48-generic #54-Ubuntu Showing stack for process 11026 Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022 Call Trace: <TASK> show_stack+0x52/0x5c dump_stack_lvl+0x4a/0x63 dump_stack+0x10/0x16 spl_dumpstack+0x29/0x2f [spl] spl_panic+0xd1/0xe9 [spl] ? zfs_ereport_post+0xd7/0x1a0 [zfs] ? arc_untransform+0x74/0x90 [zfs] ? spl_kmem_cache_free+0xcd/0x140 [spl] ? __cond_resched+0x1a/0x50 ? do_raw_spin_unlock+0x9/0x10 [zfs] ? __raw_spin_unlock+0x9/0x10 [zfs] ? aggsum_add+0x1b8/0x1d0 [zfs] arc_buf_destroy+0xed/0xf0 [zfs] dbuf_destroy+0x31/0x3d0 [zfs] ? __raw_spin_unlock+0x9/0x10 [zfs] dbuf_rele_and_unlock+0x98/0x540 [zfs] ? __cond_resched+0x1a/0x50 ? do_raw_spin_unlock+0x9/0x10 [zfs] ? __raw_spin_unlock+0x9/0x10 [zfs] ? __cond_resched+0x1a/0x50 ? down_read+0x13/0xa0 ? wmsum_add+0xe/0x20 [zfs] ? dbuf_read+0x1e6/0x5b0 [zfs] dnode_rele_and_unlock+0xd2/0xf0 [zfs] dnode_rele+0x39/0x50 [zfs] dmu_bonus_hold+0x66/0x90 [zfs] sa_buf_hold+0xe/0x20 [zfs] zfs_zget+0x5b/0x280 [zfs] ? zfs_match_find.constprop.0+0x76/0x100 [zfs] zfs_dirent_lock+0x424/0x570 [zfs] zfs_dirlook+0x91/0x2d0 [zfs] zfs_lookup+0x25e/0x410 [zfs] zpl_lookup+0xcf/0x230 [zfs] ? get_acl+0x18/0x50 __lookup_slow+0x85/0x150 walk_component+0x145/0x1c0 ? path_init+0x2c0/0x3f0 path_lookupat+0x6e/0x1c0 ? avl_find+0x68/0xd0 [zavl] filename_lookup+0xcf/0x1d0 ? __check_object_size+0x1d/0x30 ? strncpy_from_user+0x44/0x150 ? getname_flags.part.0+0x4c/0x1b0 user_path_at_empty+0x3f/0x60 vfs_statx+0x7a/0x130 __do_sys_newfstatat+0x36/0x70 ? __secure_computing+0xa9/0x120 ? syscall_trace_enter.constprop.0+0xa7/0x1c0 __x64_sys_newfstatat+0x1c/0x30 do_syscall_64+0x59/0xc0 ? exit_to_user_mode_loop+0x10d/0x160 ? exit_to_user_mode_prepare+0x37/0xb0 ? syscall_exit_to_user_mode+0x27/0x50 ? __x64_sys_close+0x11/0x50 ? do_syscall_64+0x69/0xc0 ? exit_to_user_mode_prepare+0x37/0xb0 ? syscall_exit_to_user_mode+0x27/0x50 ? do_syscall_64+0x69/0xc0 ? exc_page_fault+0x89/0x170 entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x7f1937dfeeee Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 09 RSP: 002b:00007fff14cf79a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000106 RAX: ffffffffffffffda RBX: 000055e201b98a40 RCX: 00007f1937dfeeee RDX: 00007fff14cf7a40 RSI: 00007f1937015860 RDI: 00000000ffffff9c RBP: 000055e201e2c340 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff14cf7a40 R13: 00007f1937c87f10 R14: 00000000ffffff9c R15: 000055e201b98bf0 </TASK> CPU: 2 PID: 11026 Comm: z_wr_int_2 Tainted: P O 5.15.0-48-generic #54-Ubuntu Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022 Call Trace: <TASK> show_stack+0x52/0x5c dump_stack_lvl+0x4a/0x63 dump_stack+0x10/0x16 spl_dumpstack+0x29/0x2f [spl] spl_panic+0xd1/0xe9 [spl] ? spl_kmem_cache_free+0xcd/0x140 [spl] ? kmem_cache_free+0x26c/0x290 ? percpu_counter_add+0xf/0x20 [spl] ? percpu_counter_dec+0x10/0x20 [spl] ? spl_kmem_free_impl+0x29/0x40 [spl] arc_buf_destroy+0xed/0xf0 [zfs] dbuf_issue_final_prefetch_done+0x30/0x40 [zfs] arc_read_done+0x247/0x540 [zfs] zio_done+0x3aa/0xe60 [zfs] zio_execute+0x8f/0xe0 [zfs] taskq_thread+0x21e/0x400 [spl] ? wake_up_q+0x90/0x90 ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs] ? param_set_taskq_kick+0xf0/0xf0 [spl] kthread+0x127/0x150 ? set_kthread_struct+0x50/0x50 ret_from_fork+0x1f/0x30 </TASK>
This should be fixed by 13f2b8fb92c23090b9f6e701c8471aef6b8e917b in 2.1.6.
That said, many (all?) of these backtraces represent different issues. They probably should be filed as such.
This should be fixed by https://github.com/openzfs/zfs/commit/13f2b8fb92c23090b9f6e701c8471aef6b8e917b in 2.1.6.
Good to know :+1:
That said, many (all?) of these backtraces represent different issues. They probably should be filed as such.
I'm aware of this, at this point I'm just using this issue to collect all the call traces we have with ZFS for further analysis.
Occurrences: 1
zfs-2.1.4-0ubuntu0.1
zfs-kmod-2.1.4-0ubuntu0.1
PANIC at zfs_znode.c:339:zfs_znode_sa_init()
Showing stack for process 2833053
CPU: 7 PID: 2833053 Comm: python3 Tainted: P O 5.15.0-48-generic #54-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
Call Trace:
<TASK>
show_stack+0x52/0x5c
dump_stack_lvl+0x4a/0x63
dump_stack+0x10/0x16
spl_dumpstack+0x29/0x2f [spl]
spl_panic+0xd1/0xe9 [spl]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? dmu_buf_replace_user+0x65/0x80 [zfs]
? dmu_buf_set_user+0x13/0x20 [zfs]
? dmu_buf_set_user_ie+0x15/0x20 [zfs]
zfs_znode_sa_init+0xdd/0xf0 [zfs]
zfs_znode_alloc+0x101/0x5d0 [zfs]
? arc_space_consume+0x54/0x100 [zfs]
? dbuf_read_bonus+0x102/0x110 [zfs]
? RW_WRITE_HELD+0x9/0x30 [zfs]
? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
? dbuf_rele_and_unlock+0x134/0x540 [zfs]
? __cond_resched+0x1a/0x50
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? __cond_resched+0x1a/0x50
? down_read+0x13/0xa0
? wmsum_add+0xe/0x20 [zfs]
? dbuf_read+0x1e6/0x5b0 [zfs]
? __cond_resched+0x1a/0x50
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? dnode_rele_and_unlock+0x69/0xf0 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
? RW_WRITE_HELD+0x9/0x30 [zfs]
? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
zfs_zget+0x239/0x280 [zfs]
zfs_dirent_lock+0x424/0x570 [zfs]
zfs_dirlook+0x91/0x2d0 [zfs]
zfs_lookup+0x25e/0x410 [zfs]
zpl_lookup+0xcf/0x230 [zfs]
? get_acl+0x18/0x50
__lookup_slow+0x85/0x150
walk_component+0x145/0x1c0
? path_init+0x2c0/0x3f0
path_lookupat+0x6e/0x1c0
? rrm_exit+0x4c/0x80 [zfs]
filename_lookup+0xcf/0x1d0
? __check_object_size+0x1d/0x30
? strncpy_from_user+0x44/0x150
? getname_flags.part.0+0x4c/0x1b0
user_path_at_empty+0x3f/0x60
vfs_statx+0x7a/0x130
__do_sys_newfstatat+0x36/0x70
? __secure_computing+0xa9/0x120
? syscall_trace_enter.constprop.0+0xa7/0x1c0
__x64_sys_newfstatat+0x1c/0x30
do_syscall_64+0x59/0xc0
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f250e35beee
Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 09
RSP: 002b:00007ffdb8772198 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
RAX: ffffffffffffffda RBX: 0000560e8093da40 RCX: 00007f250e35beee
RDX: 00007ffdb8772230 RSI: 00007f2502d7e690 RDI: 00000000ffffff9c
RBP: 0000560e821453c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdb8772230
R13: 00007f250e1e8130 R14: 00000000ffffff9c R15: 0000560e8093dbf0
</TASK>
Occurrence: 3 Did not cause hanging.
zfs-2.1.6-2.1
zfs-kmod-2.1.6-2.1
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 5 PID: 2908962 Comm: z_wr_int_2 Tainted: P OE 5.15.0-52-generic #58-Ubuntu
Hardware name: To Be Filled By O.E.M. E3C252D4U-2T/E3C252D4U-2T/OVH, BIOS 1.31.OV02 01/28/2022
RIP: 0010:abd_iterate_func.part.0+0x31/0x1c0 [zfs]
Code: 89 e5 41 57 4d 89 c7 41 56 41 55 41 54 53 48 89 d3 48 83 ec 68 48 89 7d 88 48 89 4d 80 65 48 8b 04 25 28 00 00 00 48 89 45 d0 <8b> 07 48 89 75 98 41 89 c6 41 c1 ee 06 44 89 f0 83 e0 01 89 45 90
RSP: 0018:ffffbd7f0d3f3b18 EFLAGS: 00010296
RAX: a5e9f645e72d7d00 RBX: 0000000000004000 RCX: ffffffffc0dc0000
RDX: 0000000000004000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffbd7f0d3f3ba8 R08: ffffbd7f0d3f3bb8 R09: ffff976e587cc000
R10: ffff976caf9d7388 R11: ffffffffc101dd00 R12: ffff97682d193d70
R13: 0000000000000007 R14: 0000000000000002 R15: ffffbd7f0d3f3bb8
FS: 0000000000000000(0000) GS:ffff97741fb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001c9dfa004 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
? 0xffffffffc0dc0000
? kmem_cache_alloc+0x1ab/0x2f0
abd_copy_to_buf_off+0x3e/0x60 [zfs]
arc_buf_fill+0x214/0xd00 [zfs]
? aggsum_add+0x1a3/0x1c0 [zfs]
arc_buf_alloc_impl.isra.0+0x2f5/0x4b0 [zfs]
? spl_kmem_cache_free+0x145/0x200 [spl]
arc_read_done+0x106/0x530 [zfs]
zio_done+0x3fe/0x1260 [zfs]
zio_execute+0x94/0x160 [zfs]
taskq_thread+0x29c/0x4c0 [spl]
? wake_up_q+0x90/0x90
? zio_gang_tree_free+0x70/0x70 [zfs]
? taskq_thread_spawn+0x60/0x60 [spl]
kthread+0x127/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: xt_nat veth tls nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay isofs ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter nf_tables nfnetlink nls_iso8859_1 ipmi_ssif zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) kvm_intel zcommon(POE) znvpair(POE) spl(OE) kvm joydev input_leds acpi_ipmi ipmi_si video acpi_tad sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler msr efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq libcrc32c raid0 multipath linear ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect
crct10dif_pclmul crc32_pclmul sysimgblt fb_sys_fops ghash_clmulni_intel raid1 cec aesni_intel rc_core nvme crypto_simd ahci xhci_pci cryptd i40e drm libahci nvme_core xhci_pci_renesas
CR2: 0000000000000000
---[ end trace 65809e962f72b871 ]---
RIP: 0010:abd_iterate_func.part.0+0x31/0x1c0 [zfs]
@nim-odoo have you considered trying a FreeBSD box as a test? I'd be interested to know if you get similar problems.
This issue is pretty horrifying.
I got a panic with a similar looking stack trace to the first one here, but the address in my case was 0000000000000000
rather than 000000000000000f
.
In my case I am also not using send/receive, but unlike you I am not using encryption. (<- @rincebrain)
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 45c289067 P4D 45c289067 PUD 1d78af067 PMD 0
Oops: 0002 [#1] SMP NOPTI
CPU: 9 PID: 3483572 Comm: zsysd Tainted: P C OE 5.15.0-52-generic #58-Ubuntu
Hardware name: Micro-Star International Co., Ltd. MS-7C83/MAG B460M BAZOOKA (MS-7C83), BIOS A.00 05/13/2020
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x36/0x310 [zfs]
Code: 48 89 e5 41 57 41 56 4c 8d 34 0e 41 55 4c 8d 2c 17 41 54 53 48 83 ec 28 4c 39 ef 0f 83 65 02 00 00 49 89 ff 49 89 f4 48 89 f7 <41> 0f b6 17 49 8d 47 01 89 d6 c1 ea 04 41 89 d0 83 fa 0f 0f 84 0d
RSP: 0018:ffffbfcc113ff728 EFLAGS: 00010287
RAX: 0000000000000000 RBX: ffff9a67aee1ea0a RCX: ffff9a67aee1ebf8
RDX: ffff9a67aee1ea02 RSI: 0000000000000000 RDI: ffff9a67aee1ea07
RBP: ffffbfcc113ff778 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffffc29ce1b0 R11: ffff9a680a0afad8 R12: ffff9a67aee1ea00
R13: ffff9a67aee1e635 R14: ffff9a67aee1ec00 R15: ffff9a67aee1e609
FS: 00007f9d4c4dc800(0000) GS:ffff9a739ec40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000482e76006 CR4: 00000000007706e0
PKRU: 55555554
Call Trace:
<TASK>
lz4_decompress_zfs+0x2f/0x40 [zfs]
zio_decompress_data_buf+0x8c/0x100 [zfs]
zio_decompress_data+0x51/0xc0 [zfs]
zio_decompress+0x62/0xe0 [zfs]
zio_pop_transforms+0x71/0xa0 [zfs]
zio_done+0x24c/0xe60 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? zio_ready+0x1bf/0x3f0 [zfs]
zio_nowait+0xa6/0x130 [zfs]
arc_read+0xb73/0x15c0 [zfs]
? dbuf_rele_and_unlock+0x540/0x540 [zfs]
? dnode_block_freed+0xdd/0x150 [zfs]
dbuf_read_impl.constprop.0+0x2f2/0x490 [zfs]
dbuf_read+0x1ba/0x5b0 [zfs]
? dmu_buf_hold_noread+0xc3/0x110 [zfs]
dmu_buf_hold+0x66/0xa0 [zfs]
zap_lockdir+0x51/0xb0 [zfs]
zap_lookup_norm+0x5c/0xd0 [zfs]
zap_lookup+0x16/0x20 [zfs]
dsl_dir_hold+0x11b/0x280 [zfs]
dsl_dataset_hold_flags+0x45/0x230 [zfs]
? rrw_enter_read+0x13/0x20 [zfs]
? rrw_enter+0x21/0x30 [zfs]
? dsl_pool_config_enter+0x1d/0x30 [zfs]
dmu_objset_hold_flags+0x78/0xd0 [zfs]
dmu_objset_hold+0x16/0x20 [zfs]
zfs_ioc_objset_stats+0x33/0x80 [zfs]
zfsdev_ioctl_common+0x683/0x740 [zfs]
? __check_object_size.part.0+0x4a/0x150
? _copy_from_user+0x2e/0x70
zfsdev_ioctl+0x57/0xf0 [zfs]
__x64_sys_ioctl+0x92/0xd0
do_syscall_64+0x59/0xc0
? exc_page_fault+0x89/0x170
entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f9d4cd6daff
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
RSP: 002b:00007ffe9ee38080 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffe9ee38100 RCX: 00007f9d4cd6daff
RDX: 00007ffe9ee38100 RSI: 0000000000005a12 RDI: 0000000000000004
RBP: 00007ffe9ee380f0 R08: 0000000000000000 R09: 0000000002af7a50
R10: 00007f9d4ce6d1b0 R11: 0000000000000246 R12: 0000000001eaf9c0
R13: 00007ffe9ee38100 R14: 000000c0000021a0 R15: 00000000000001da
</TASK>
Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache netfs xt_nat veth nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_masq nft_chain_nat nf_nat nf_conntrack ashmem_linux(C) binder_linux nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nfnetlink bridge stp llc wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel overlay sunrpc binfmt_misc nls_iso8859_1 joydev input_leds snd_sof_pci_intel_cnl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine intel_rapl_msr intel_rapl_common intel_tcc_cooling
x86_pkg_temp_thermal intel_powerclamp coretemp nvidia_uvm(POE) kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore mxm_wmi wmi_bmof mei_hdcp mei_me mei ee1004 rapl intel_cstate mac_hid acpi_pad acpi_tad sch_fq_codel ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp ramoops parport reed_solomon efi_pstore pstore_blk pstore_zone ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sysimgblt fb_sys_fops cec
aesni_intel rc_core crypto_simd cryptd drm r8169 ahci i2c_i801 xhci_pci i2c_smbus realtek libahci xhci_pci_renesas wmi video pinctrl_sunrisepoint
CR2: 0000000000000000
---[ end trace fd6f805a9fb1c2cb ]---
RIP: 0010:LZ4_uncompress_unknownOutputSize+0x36/0x310 [zfs]
Code: 48 89 e5 41 57 41 56 4c 8d 34 0e 41 55 4c 8d 2c 17 41 54 53 48 83 ec 28 4c 39 ef 0f 83 65 02 00 00 49 89 ff 49 89 f4 48 89 f7 <41> 0f b6 17 49 8d 47 01 89 d6 c1 ea 04 41 89 d0 83 fa 0f 0f 84 0d
RSP: 0018:ffffbfcc113ff728 EFLAGS: 00010287
RAX: 0000000000000000 RBX: ffff9a67aee1ea0a RCX: ffff9a67aee1ebf8
RDX: ffff9a67aee1ea02 RSI: 0000000000000000 RDI: ffff9a67aee1ea07
RBP: ffffbfcc113ff778 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffffc29ce1b0 R11: ffff9a680a0afad8 R12: ffff9a67aee1ea00
R13: ffff9a67aee1e635 R14: ffff9a67aee1ec00 R15: ffff9a67aee1e609
FS: 00007f9d4c4dc800(0000) GS:ffff9a739ec40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000482e76006 CR4: 00000000007706e0
PKRU: 55555554
Fortunately this has only happened to me once so far, but it's put a damper on my enthusiasm towards ZFS.
Thanks for the reply.
Yes, mine happened on Ubuntu 22.04 (filesystems created on 20.04 and later upgraded).
System information
Describe the problem you're observing
On our system, we observe kernel errors due to
LZ4_uncompress_unknownOutputSize
. It happens on very rare occasions: it happened 3 times within a 6 weeks period on 3 different servers (we have a total of 20 bare metal servers, provider A). It hasn't happened (yet) on the virtualized servers (24 servers, provider B).The main issue is that the complete zpool becomes unavailable: the services using it cannot be restarted, and there is no way to unmount/remount the datasets. Even a soft reboot doesn't succeed: the only solution is the hard reboot.
Therefore, the question is: are we observing the cause or the consequence of a problem? How can this situation of
LZ4_uncompress_unknownOutputSize
be triggered? Moreover, is there a way to recover the zpool without the hard reboot? It seems that when kernel traceback occurs, our only solution is the hard reboot (we already faced it with zfs send / receive).Describe how to reproduce the problem
It happens randomly on very rare occasions. We don't have a way to reproduce it.
Include any warning/errors/backtraces from the system logs