openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Kernel NULL pointer dereference #13430

Open jimmyw opened 2 years ago

jimmyw commented 2 years ago

System information

Type                  Arch Linux x86_64 stable
Distribution Name     Arch
Distribution Version  stable
Kernel Version        5.17.5-arch1-1
Architecture          x86_64
OpenZFS Version       2.1.4-1

Describe the problem you're observing

The server crashes while almost idle.

Describe how to reproduce the problem

Wait a few days.

May 05 21:32:07 terra kernel: BUG: kernel NULL pointer dereference, address: 0000000000000f30
May 05 21:32:07 terra kernel: #PF: supervisor read access in kernel mode
May 05 21:32:07 terra kernel: #PF: error_code(0x0000) - not-present page
May 05 21:32:07 terra kernel: PGD 0 P4D 0
May 05 21:32:07 terra kernel: Oops: 0000 [#1] PREEMPT SMP PTI
May 05 21:32:07 terra kernel: CPU: 2 PID: 42945 Comm: postgres Tainted: P        W  OE     5.17.5-arch1-1 #1 bff91b48f6c3cb8d3bfd68f772f9c0a96e684769
May 05 21:32:07 terra kernel: Hardware name: HPE ProLiant MicroServer Gen10 Plus/ProLiant MicroServer Gen10 Plus, BIOS U48 10/21/2021
May 05 21:32:07 terra kernel: RIP: 0010:dbuf_read_impl.constprop.0+0xa3/0x6f0 [zfs]
May 05 21:32:07 terra kernel: Code: 8b 7d 28 e8 7f 34 10 00 48 8b 45 28 48 8b 75 58 4c 8b 80 88 00 00 00 48 83 fe ff 0f 84 d0 04 00 00 48 8b 45 60 48 85 c0 74 24 <48> 8b 50 30 48 0f ba e2 27 0f 82 cf 01 00 00 48 83 38 00 0f 85 cf
May 05 21:32:07 terra kernel: RSP: 0018:ffffb90a8bb33a90 EFLAGS: 00010206
May 05 21:32:07 terra kernel: RAX: 0000000000000f00 RBX: 000000000000001e RCX: 0000000000000003
May 05 21:32:07 terra kernel: RDX: 0000000000000002 RSI: 000000000000001e RDI: ffff92603afa6fd0
May 05 21:32:07 terra kernel: RBP: ffff92647ee40000 R08: ffff92603aef0ba0 R09: ffff9260ba3c8000
May 05 21:32:07 terra kernel: R10: 0000000000000000 R11: 000000000000001e R12: 000000000000001e
May 05 21:32:07 terra kernel: R13: 0000000000000001 R14: ffff9260b5778ea0 R15: ffffb90a8bb33ac8
May 05 21:32:07 terra kernel: FS:  00007f9bd6aa8a40(0000) GS:ffff92671ed00000(0000) knlGS:0000000000000000
May 05 21:32:07 terra kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 05 21:32:07 terra kernel: CR2: 0000000000000f30 CR3: 0000000272a7e005 CR4: 00000000003706e0
May 05 21:32:07 terra kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 05 21:32:07 terra kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 05 21:32:07 terra kernel: Call Trace:
May 05 21:32:07 terra kernel:  <TASK>
May 05 21:32:07 terra kernel:  ? dbuf_rele_and_unlock+0x151/0x670 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  dbuf_read+0x109/0x600 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  dmu_buf_hold_array_by_dnode+0x12e/0x5f0 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  dmu_read_uio_dnode+0x5c/0x140 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  dmu_read_uio_dbuf+0x42/0x60 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  zfs_read+0x130/0x3a0 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  zpl_iter_read+0xe2/0x190 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:07 terra kernel:  new_sync_read+0x155/0x1e0
May 05 21:32:07 terra kernel:  vfs_read+0xf1/0x190
May 05 21:32:07 terra kernel:  __x64_sys_pread64+0x8c/0xc0
May 05 21:32:07 terra kernel:  do_syscall_64+0x5c/0x80
May 05 21:32:07 terra kernel:  ? syscall_exit_to_user_mode+0x23/0x40
May 05 21:32:07 terra kernel:  ? do_syscall_64+0x69/0x80
May 05 21:32:07 terra kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
May 05 21:32:07 terra kernel: RIP: 0033:0x7f9bda0b98d6
May 05 21:32:07 terra kernel: Code: 96 00 00 f7 d8 64 89 02 b8 ff ff ff ff eb bf 0f 1f 44 00 00 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 90 48 83 ec 28 48 89 54 24 10 48 89 74
May 05 21:32:07 terra kernel: RSP: 002b:00007fff67c775a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
May 05 21:32:07 terra kernel: RAX: ffffffffffffffda RBX: 00007f9bbd7c0700 RCX: 00007f9bda0b98d6
May 05 21:32:07 terra kernel: RDX: 0000000000002000 RSI: 00007f9bbd7c0700 RDI: 0000000000000026
May 05 21:32:07 terra kernel: RBP: 00007fff67c775f0 R08: 000000000a00000d R09: 0000000000000000
May 05 21:32:07 terra kernel: R10: 00000000003d6000 R11: 0000000000000246 R12: 00007f9bd620fab0
May 05 21:32:07 terra kernel: R13: 00000000003d6000 R14: 0000000000002000 R15: 0000000000000001
May 05 21:32:07 terra kernel:  </TASK>
May 05 21:32:07 terra kernel: Modules linked in: tun macvlan ip6table_nat ip6table_filter nf_tables xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter rpcrdma rdma_cm iw_cm ib_cm ib_core ipmi_ssif vfat fat ext4 crc32c_generic crc16 mbcache jbd2 intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate cdc_eem intel_uncore bridge usbnet joydev stp pcspkr mousedev mii ixgbe llc mdio_devres intel_spi_pci igb hpwdt intel_spi hpilo cdc_acm cfg80211 cp210x spi_nor mei_me libphy mdio intel_pch_thermal wmi dca mei mtd acpi_ipmi ipmi_si ipmi_devintf acpi_power_meter ipmi_msghandler rfkill acpi_tad mac_hid nfsd auth_rpcgss nfs_acl lockd grace ip6_tables sunrpc fuse bpf_preload ip_tables x_tables usbhid
May 05 21:32:07 terra kernel:  zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) xhci_pci xhci_pci_renesas
May 05 21:32:07 terra kernel: CR2: 0000000000000f30
May 05 21:32:07 terra kernel: ---[ end trace 0000000000000000 ]---
May 05 21:32:07 terra kernel: RIP: 0010:dbuf_read_impl.constprop.0+0xa3/0x6f0 [zfs]
May 05 21:32:07 terra kernel: Code: 8b 7d 28 e8 7f 34 10 00 48 8b 45 28 48 8b 75 58 4c 8b 80 88 00 00 00 48 83 fe ff 0f 84 d0 04 00 00 48 8b 45 60 48 85 c0 74 24 <48> 8b 50 30 48 0f ba e2 27 0f 82 cf 01 00 00 48 83 38 00 0f 85 cf
May 05 21:32:07 terra kernel: RSP: 0018:ffffb90a8bb33a90 EFLAGS: 00010206
May 05 21:32:07 terra kernel: RAX: 0000000000000f00 RBX: 000000000000001e RCX: 0000000000000003
May 05 21:32:07 terra kernel: RDX: 0000000000000002 RSI: 000000000000001e RDI: ffff92603afa6fd0
May 05 21:32:07 terra kernel: RBP: ffff92647ee40000 R08: ffff92603aef0ba0 R09: ffff9260ba3c8000
May 05 21:32:07 terra kernel: R10: 0000000000000000 R11: 000000000000001e R12: 000000000000001e
May 05 21:32:07 terra kernel: R13: 0000000000000001 R14: ffff9260b5778ea0 R15: ffffb90a8bb33ac8
May 05 21:32:07 terra kernel: FS:  00007f9bd6aa8a40(0000) GS:ffff92671ed00000(0000) knlGS:0000000000000000
May 05 21:32:07 terra kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 05 21:32:07 terra kernel: CR2: 0000000000000f30 CR3: 0000000272a7e005 CR4: 00000000003706e0
May 05 21:32:07 terra kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 05 21:32:07 terra kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 05 21:32:08 terra kernel: BUG: unable to handle page fault for address: 0000000000002cb0
May 05 21:32:08 terra kernel: #PF: supervisor read access in kernel mode
May 05 21:32:08 terra kernel: #PF: error_code(0x0000) - not-present page
May 05 21:32:08 terra kernel: PGD 0 P4D 0
May 05 21:32:08 terra kernel: Oops: 0000 [#2] PREEMPT SMP PTI
May 05 21:32:08 terra kernel: CPU: 3 PID: 511333 Comm: postgres Tainted: P      D W  OE     5.17.5-arch1-1 #1 bff91b48f6c3cb8d3bfd68f772f9c0a96e684769
May 05 21:32:08 terra kernel: Hardware name: HPE ProLiant MicroServer Gen10 Plus/ProLiant MicroServer Gen10 Plus, BIOS U48 10/21/2021
May 05 21:32:08 terra kernel: RIP: 0010:dbuf_read_impl.constprop.0+0xa3/0x6f0 [zfs]
May 05 21:32:08 terra kernel: Code: 8b 7d 28 e8 7f 34 10 00 48 8b 45 28 48 8b 75 58 4c 8b 80 88 00 00 00 48 83 fe ff 0f 84 d0 04 00 00 48 8b 45 60 48 85 c0 74 24 <48> 8b 50 30 48 0f ba e2 27 0f 82 cf 01 00 00 48 83 38 00 0f 85 cf
May 05 21:32:08 terra kernel: RSP: 0018:ffffb90a813efaa8 EFLAGS: 00010202
May 05 21:32:08 terra kernel: RAX: 0000000000002c80 RBX: 000000000000001e RCX: 0000000000000006
May 05 21:32:08 terra kernel: RDX: 0000000000000005 RSI: 0000000000000059 RDI: ffff92603afa6fd0
May 05 21:32:08 terra kernel: RBP: ffff9265d8b73800 R08: ffff92603aef0ba0 R09: ffff92647ee40040
May 05 21:32:08 terra kernel: R10: ffff9265d8b73840 R11: ffff92603aef0e98 R12: 000000000000001e
May 05 21:32:08 terra kernel: R13: 0000000000000001 R14: ffff9263d5f54440 R15: ffffb90a813efae0
May 05 21:32:08 terra kernel: FS:  00007f9bd6aa8a40(0000) GS:ffff92671ed80000(0000) knlGS:0000000000000000
May 05 21:32:08 terra kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 05 21:32:08 terra kernel: CR2: 0000000000002cb0 CR3: 00000001b8ad8006 CR4: 00000000003706e0
May 05 21:32:08 terra kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 05 21:32:08 terra kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 05 21:32:08 terra kernel: Call Trace:
May 05 21:32:08 terra kernel:  <TASK>
May 05 21:32:08 terra kernel:  ? dbuf_rele_and_unlock+0x151/0x670 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  dbuf_read+0x109/0x600 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  dmu_buf_hold_array_by_dnode+0x12e/0x5f0 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  dmu_read_uio_dnode+0x5c/0x140 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  dmu_read_uio_dbuf+0x42/0x60 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  zfs_read+0x130/0x3a0 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  zpl_iter_read+0xe2/0x190 [zfs 28318e023a1a5aa0c40c2fbeb5adb4ae1372f122]
May 05 21:32:08 terra kernel:  new_sync_read+0x155/0x1e0
May 05 21:32:08 terra kernel:  vfs_read+0xf1/0x190
May 05 21:32:08 terra kernel:  __x64_sys_pread64+0x8c/0xc0
May 05 21:32:08 terra kernel:  do_syscall_64+0x5c/0x80
May 05 21:32:08 terra kernel:  ? exc_page_fault+0x72/0x170
May 05 21:32:08 terra kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
May 05 21:32:08 terra kernel: RIP: 0033:0x7f9bda0b98d6
May 05 21:32:08 terra kernel: Code: 96 00 00 f7 d8 64 89 02 b8 ff ff ff ff eb bf 0f 1f 44 00 00 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 90 48 83 ec 28 48 89 54 24 10 48 89 74
May 05 21:32:08 terra kernel: RSP: 002b:00007fff67c74e68 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
May 05 21:32:08 terra kernel: RAX: ffffffffffffffda RBX: 00007f9bb57ae700 RCX: 00007f9bda0b98d6
May 05 21:32:08 terra kernel: RDX: 0000000000002000 RSI: 00007f9bb57ae700 RDI: 0000000000000017
May 05 21:32:08 terra kernel: RBP: 00007fff67c74eb0 R08: 000000000a00000d R09: 0000000000000000
May 05 21:32:08 terra kernel: R10: 0000000000b34000 R11: 0000000000000246 R12: 00007f9bd6214af0
May 05 21:32:08 terra kernel: R13: 0000000000b34000 R14: 0000000000002000 R15: 0000000000000001
May 05 21:32:08 terra kernel:  </TASK>
May 05 21:32:08 terra kernel: Modules linked in: tun macvlan ip6table_nat ip6table_filter nf_tables xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter rpcrdma rdma_cm iw_cm ib_cm ib_core ipmi_ssif vfat fat ext4 crc32c_generic crc16 mbcache jbd2 intel_rapl_msr intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate cdc_eem intel_uncore bridge usbnet joydev stp pcspkr mousedev mii ixgbe llc mdio_devres intel_spi_pci igb hpwdt intel_spi hpilo cdc_acm cfg80211 cp210x spi_nor mei_me libphy mdio intel_pch_thermal wmi dca mei mtd acpi_ipmi ipmi_si ipmi_devintf acpi_power_meter ipmi_msghandler rfkill acpi_tad mac_hid nfsd auth_rpcgss nfs_acl lockd grace ip6_tables sunrpc fuse bpf_preload ip_tables x_tables usbhid
May 05 21:32:08 terra kernel:  zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) xhci_pci xhci_pci_renesas
May 05 21:32:08 terra kernel: CR2: 0000000000002cb0
May 05 21:32:08 terra kernel: ---[ end trace 0000000000000000 ]---
May 05 21:32:08 terra kernel: RIP: 0010:dbuf_read_impl.constprop.0+0xa3/0x6f0 [zfs]
May 05 21:32:08 terra kernel: Code: 8b 7d 28 e8 7f 34 10 00 48 8b 45 28 48 8b 75 58 4c 8b 80 88 00 00 00 48 83 fe ff 0f 84 d0 04 00 00 48 8b 45 60 48 85 c0 74 24 <48> 8b 50 30 48 0f ba e2 27 0f 82 cf 01 00 00 48 83 38 00 0f 85 cf
May 05 21:32:08 terra kernel: RSP: 0018:ffffb90a8bb33a90 EFLAGS: 00010206
May 05 21:32:08 terra kernel: RAX: 0000000000000f00 RBX: 000000000000001e RCX: 0000000000000003
May 05 21:32:08 terra kernel: RDX: 0000000000000002 RSI: 000000000000001e RDI: ffff92603afa6fd0
May 05 21:32:08 terra kernel: RBP: ffff92647ee40000 R08: ffff92603aef0ba0 R09: ffff9260ba3c8000
May 05 21:32:08 terra kernel: R10: 0000000000000000 R11: 000000000000001e R12: 000000000000001e
May 05 21:32:08 terra kernel: R13: 0000000000000001 R14: ffff9260b5778ea0 R15: ffffb90a8bb33ac8
May 05 21:32:08 terra kernel: FS:  00007f9bd6aa8a40(0000) GS:ffff92671ed80000(0000) knlGS:0000000000000000
May 05 21:32:08 terra kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 05 21:32:08 terra kernel: CR2: 0000000000002cb0 CR3: 00000001b8ad8006 CR4: 00000000003706e0
May 05 21:32:08 terra kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 05 21:32:08 terra kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
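
A note on reading the two oopses above: the bytes marked at the faulting RIP, <48> 8b 50 30, decode as mov rdx, [rax+0x30], so the address in CR2 should equal RAX plus 0x30. The bytes just before it (48 8b 45 60 48 85 c0 74 24) load RAX from [rbp+0x60] and test it for NULL, which suggests the pointer passed a NULL check while holding a small garbage value. The following Python sketch checks that arithmetic against the register dumps. The helper name fault_address and its deliberately restricted decoder are illustrative assumptions, not part of any ZFS or kernel tooling.

```python
# Register numbering used by x86-64 ModRM encoding (REX.B selects R08-R15).
REG_NAMES = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI",
             "R08", "R09", "R10", "R11", "R12", "R13", "R14", "R15"]


def fault_address(code_hex, regs):
    """Return (base register, displacement, effective address).

    Hypothetical helper: it only handles the simple encoding seen in these
    oopses, i.e. a REX prefix, opcode 0x8B or 0x89 (64-bit MOV), and a ModRM
    byte with mod=01 ([base + disp8]) and no SIB byte.
    """
    code = bytes.fromhex(code_hex)
    rex, opcode, modrm, disp8 = code[0], code[1], code[2], code[3]
    if rex & 0xF0 != 0x40:
        raise ValueError("expected a REX prefix")
    if opcode not in (0x8B, 0x89):
        raise ValueError("expected a 64-bit MOV (0x8B or 0x89)")
    mod, rm = modrm >> 6, modrm & 0x07
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [base + disp8] without SIB is supported")
    base = REG_NAMES[rm + (8 if rex & 0x01 else 0)]
    disp = disp8 - 256 if disp8 >= 0x80 else disp8  # sign-extend disp8
    address = (regs[base] + disp) & 0xFFFFFFFFFFFFFFFF
    return base, disp, address


# RAX values copied from the two register dumps above.
for rax in (0x0F00, 0x2C80):
    base, disp, addr = fault_address("488b5030", {"RAX": rax})
    print(f"{base}={rax:#x} + {disp:#x} -> fault at {addr:#018x}")
```

Both results can be compared with the CR2 lines in the corresponding oops (0000000000000f30 and 0000000000002cb0).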


rincebrain commented 2 years ago

What are the non-default settings on the datasets on the pool?

Have you been using send/recv at all?

Did it work previously and only break like this after a recent upgrade, or is this a recent setup with no prior data one way or the other?

jimmyw commented 2 years ago
[jimmy@terra ~]$ zpool get all | grep local
nas   autoexpand                     on                             local
nas   ashift                         12                             local
nas   autotrim                       on                             local
nas   feature@async_destroy          enabled                        local
nas   feature@empty_bpobj            active                         local
nas   feature@lz4_compress           active                         local
nas   feature@multi_vdev_crash_dump  enabled                        local
nas   feature@spacemap_histogram     active                         local
nas   feature@enabled_txg            active                         local
nas   feature@hole_birth             active                         local
nas   feature@extensible_dataset     active                         local
nas   feature@embedded_data          active                         local
nas   feature@bookmarks              enabled                        local
nas   feature@filesystem_limits      enabled                        local
nas   feature@large_blocks           enabled                        local
nas   feature@large_dnode            active                         local
nas   feature@sha512                 enabled                        local
nas   feature@skein                  enabled                        local
nas   feature@edonr                  enabled                        local
nas   feature@userobj_accounting     active                         local
nas   feature@encryption             enabled                        local
nas   feature@project_quota          active                         local
nas   feature@device_removal         enabled                        local
nas   feature@obsolete_counts        enabled                        local
nas   feature@zpool_checkpoint       enabled                        local
nas   feature@spacemap_v2            active                         local
nas   feature@allocation_classes     enabled                        local
nas   feature@resilver_defer         enabled                        local
nas   feature@bookmark_v2            enabled                        local
nas   feature@redaction_bookmarks    enabled                        local
nas   feature@redacted_datasets      enabled                        local
nas   feature@bookmark_written       enabled                        local
nas   feature@log_spacemap           active                         local
nas   feature@livelist               active                         local
nas   feature@device_rebuild         enabled                        local
nas   feature@zstd_compress          active                         local
nas   feature@draid                  enabled                        local

No send/recv at all.

The system has been unstable for a while; I have been trying different kernels and versions without success. The system hangs and complains about a core that was locked up. It worked without any issues a month ago, and I am not sure what changed.

I hope this stack trace helps in some way.

jimmyw commented 2 years ago

I see now that the pool is degraded; I had not noticed any issues until now. An SSD holding the cache and log devices has failed. I am fairly sure this happened after the crashes, but it is probably related. I will try to replace the drive and see if anything changes.

rincebrain commented 2 years ago

L2ARC and SLOG (cache and log) devices being marked failed shouldn't be actively harmful, though if they were misbehaving before being marked failed, who knows.

Specifically, zfs get all | grep -v default was more what I was curious about, though zpool get all is also useful information.

Wondering if 5.17.5 or something shipped something exciting. May go try a build of it on a testbed.
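
A side note on the zfs get all | grep -v default filter: it only drops lines containing the word default, so rows whose SOURCE is "-" (read-only properties such as used or guid) still come through. A minimal Python sketch that keeps only rows with a non-default source, assuming tab-separated input of the kind zfs get -H -o name,property,value,source all produces. The function name is hypothetical, and the sample rows are illustrative (the checksum row in particular is made up to show a default source).

```python
def non_default_properties(zfs_get_output):
    """Yield (name, property, value, source) rows with a non-default source.

    Expects tab-separated rows with exactly four fields. Rows whose source
    is "default" or "-" are skipped; local, received and inherited rows
    are kept.
    """
    for line in zfs_get_output.splitlines():
        if not line.strip():
            continue
        name, prop, value, source = line.split("\t")
        if source in ("default", "-"):
            continue
        yield name, prop, value, source


# Illustrative input only, not captured from a real system.
sample = "\n".join([
    "r610-sas\ttype\tfilesystem\t-",
    "r610-sas\trecordsize\t1M\treceived",
    "r610-sas\tcompression\tzstd\treceived",
    "r610-sas/system\trecordsize\t1M\tinherited from r610-sas",
    "r610-sas\tchecksum\ton\tdefault",
])

for row in non_default_properties(sample):
    print(*row, sep="  ")
```

Filtering on the source column rather than on a substring avoids false matches when a property value or dataset name happens to contain the word default.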

rincebrain commented 2 years ago

Huh, 5.17.5 is running fine with my single NVMe vdev under a LUKS device using the default Arch 5.17.5 kernel .config. We'll see in a few days if it lets the magic smoke out.

Did you use any special settings for LUKS?

mapmot commented 2 years ago

I have been experiencing the same issue on a Dell PowerEdge R610 server / Gentoo / Linux 5.17.5 / ZFS 2.1.4. I don't use LUKS. It started happening after I upgraded the kernel to 5.17.5 and glibc to glibc-2.35-r4.

r610 /home/dell # zfs get all | grep -v default
NAME  PROPERTY  VALUE  SOURCE
r610-sas  type  filesystem  -
r610-sas  creation  Sun Feb 21 19:05 2021  -
r610-sas  used  2.16T  -
r610-sas  available  1.79T  -
r610-sas  referenced  36.5K  -
r610-sas  compressratio  1.23x  -
r610-sas  mounted  no  -
r610-sas  recordsize  1M  received
r610-sas  mountpoint  none  received
r610-sas  compression  zstd  received
r610-sas  atime  off  received
r610-sas  aclinherit  passthrough-x  received
r610-sas  createtxg  1  -
r610-sas  xattr  sa  received
r610-sas  version  5  -
r610-sas  utf8only  off  -
r610-sas  normalization  none  -
r610-sas  casesensitivity  sensitive  -
r610-sas  guid  11187035221725804445  -
r610-sas  usedbysnapshots  0B  -
r610-sas  usedbydataset  36.5K  -
r610-sas  usedbychildren  2.16T  -
r610-sas  usedbyrefreservation  0B  -
r610-sas  objsetid  54  -
r610-sas  dnodesize  auto  received
r610-sas  refcompressratio  1.00x  -
r610-sas  written  36.5K  -
r610-sas  logicalused  2.68T  -
r610-sas  logicalreferenced  12K  -
r610-sas  acltype  posix  received
r610-sas/container/windows  type  filesystem  -
r610-sas/container/windows  creation  Sat Oct 23 8:56 2021  -
r610-sas/container/windows  used  28.0G  -
r610-sas/container/windows  available  1.79T  -
r610-sas/container/windows  referenced  28.0G  -
r610-sas/container/windows  compressratio  1.46x  -
r610-sas/container/windows  mounted  yes  -
r610-sas/container/windows  recordsize  128K  local
r610-sas/container/windows  mountpoint  /srv/container/windows  inherited from r610-sas/container
r610-sas/container/windows  compression  zstd  inherited from r610-sas
r610-sas/container/windows  atime  off  inherited from r610-sas
r610-sas/container/windows  aclinherit  passthrough-x  inherited from r610-sas
r610-sas/container/windows  createtxg  3597135  -
r610-sas/container/windows  xattr  sa  inherited from r610-sas
r610-sas/container/windows  version  5  -
r610-sas/container/windows  utf8only  off  -
r610-sas/container/windows  normalization  none  -
r610-sas/container/windows  casesensitivity  sensitive  -
r610-sas/container/windows  guid  6305482984620260152  -
r610-sas/container/windows  usedbysnapshots  0B  -
r610-sas/container/windows  usedbydataset  28.0G  -
r610-sas/container/windows  usedbychildren  0B  -
r610-sas/container/windows  usedbyrefreservation  0B  -
r610-sas/container/windows  objsetid  14492  -
r610-sas/container/windows  dnodesize  auto  inherited from r610-sas
r610-sas/container/windows  refcompressratio  1.46x  -
r610-sas/container/windows  written  28.0G  -
r610-sas/container/windows  logicalused  40.9G  -
r610-sas/container/windows  logicalreferenced  40.9G  -
r610-sas/container/windows  acltype  posix  inherited from r610-sas
r610-sas/system  type  filesystem  -
r610-sas/system  creation  Sun Feb 21 19:40 2021  -
r610-sas/system  used  1.48G  -
r610-sas/system  available  1.79T  -
r610-sas/system  referenced  1.48G  -
r610-sas/system  compressratio  4.05x  -
r610-sas/system  mounted  yes  -
r610-sas/system  recordsize  1M  inherited from r610-sas
r610-sas/system  mountpoint  none  inherited from r610-sas
r610-sas/system  compression  zstd  inherited from r610-sas
r610-sas/system  atime  off  inherited from r610-sas
r610-sas/system  aclinherit  passthrough-x  inherited from r610-sas
r610-sas/system  createtxg  222  -
r610-sas/system  xattr  sa  inherited from r610-sas
r610-sas/system  version  5  -
r610-sas/system  utf8only  off  -
r610-sas/system  normalization  none  -
r610-sas/system  casesensitivity  sensitive  -
r610-sas/system  guid  13395046143574701671  -
r610-sas/system  usedbysnapshots  0B  -
r610-sas/system  usedbydataset  1.48G  -
r610-sas/system  usedbychildren  0B  -
r610-sas/system  usedbyrefreservation  0B  -
r610-sas/system  objsetid  3348  -
r610-sas/system  dnodesize  auto  inherited from r610-sas
r610-sas/system  refcompressratio  4.05x  -
r610-sas/system  written  1.48G  -
r610-sas/system  logicalused  5.57G  -
r610-sas/system  logicalreferenced  5.57G  -
r610-sas/system  acltype  posix  inherited from r610-sas

mapmot commented 2 years ago

...and this is the change between -r3 and -r4 of glibc in Gentoo:

diff -Naur glibc-2.35-r3.ebuild glibc-2.35-r4.ebuild
- # We take care of patching our binutils to use both hash styles,
- # and many people like to force gnu hash style only, so disable
- # this overriding check. #347761
- export libc_cv_hashstyle=no

rincebrain commented 2 years ago

@mapmot, can you please share one or more of the BUG: messages and stacktraces from your logs when this happens?

mapmot commented 2 years ago

@rincebrain, here is the log. What triggered it was a git status command, after the server had been mostly idle. The last time it happened was after an emerge command.

[735471.464489] BUG: kernel NULL pointer dereference, address: 000000000000000b
[735471.464581] #PF: supervisor write access in kernel mode
[735471.464637] #PF: error_code(0x0002) - not-present page
[735471.464693] PGD 0 P4D 0
[735471.464726] Oops: 0002 [#1] SMP NOPTI
[735471.464771] CPU: 23 PID: 108277 Comm: dp_sync_taskq Tainted: P IO 5.17.5-gentoo #2
[735471.464866] Hardware name: Dell Inc. PowerEdge R610/0P8FRD, BIOS 6.6.0 05/22/2018
[735471.464946] RIP: 0010:dbuf_sync_list+0x67/0x250 [zfs]
[735471.465076] Code: 0e e8 5d fe ff ff 49 8b 47 10 48 39 c5 74 5e 49 8b 47 10 49 89 c2 4d 2b 57 08 74 51 49 83 7a 18 00 75 4a 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 4c 89 30 4c 89 68 08 49 8b 42 20 48 85 c0 74
[735471.465238] RSP: 0018:ffff8c04a86d3bd0 EFLAGS: 00010246
[735471.465291] RAX: ffff8c04bf173200 RBX: ffff8c039c054000 RCX: 0000000000000003
[735471.465355] RDX: ffff8c04bf171f10 RSI: 0000000000000286 RDI: ffff8c034b7f9800
[735471.465422] RBP: ffff8c04bf171f10 R08: 0000000000000000 R09: ffff8c04a86d3ae0
[735471.465490] R10: ffff8c04bf173200 R11: dead000000000100 R12: 0000000000000000
[735471.465555] R13: dead000000000122 R14: dead000000000100 R15: ffff8c04bf171f00
[735471.465622] FS: 0000000000000000(0000) GS:ffff8c162fdc0000(0000) knlGS:0000000000000000
[735471.465697] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[735471.465754] CR2: 000000000000000b CR3: 000000005380a006 CR4: 00000000000226e0
[735471.465823] Call Trace:
[735471.465857]
[735471.465881] dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
[735471.465984] dbuf_sync_list+0x43/0x250 [zfs]
[735471.466081] dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
[735471.466180] dbuf_sync_list+0x43/0x250 [zfs]
[735471.466278] dnode_sync+0x3c9/0x13f0 [zfs]
[735471.466390] ? __schedule+0x2bd/0xfb0
[735471.466435] dmu_objset_clone+0x7ac/0x8b0 [zfs]
[735471.466542] taskq_dispatch+0x4b6/0x6a0 [spl]
[735471.466596] ? wake_up_q+0x80/0x80
[735471.466640] ? taskq_dispatch+0x250/0x6a0 [spl]
[735471.466692] kthread+0xb4/0xe0
[735471.466729] ? kthread_complete_and_exit+0x20/0x20
[735471.466781] ret_from_fork+0x1f/0x30
[735471.466819]
[735471.466844] Modules linked in: vhost_net vhost vhost_iotlb tap veth binfmt_misc intel_powerclamp ipmi_ssif snd_hda_intel snd_intel_dspcfg coretemp snd_hda_codec kvm_intel snd_hda_core snd_pcm snd_timer kvm i7core_edac input_leds snd tpm_tis bridge dcdbas led_class edac_core tpm_tis_core gpio_ich soundcore acpi_power_meter stp wmi ioatdma tpm bfq ipmi_si ipmi_devintf ipmi_msghandler dca llc cfg80211 vfio_pci vfio_pci_core nfsd vfio_virqfd vfio_iommu_type1 auth_rpcgss nfs_acl vfio tun irqbypass fuse lockd grace configfs zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel mpt3sas aesni_intel crypto_simd raid_class cryptd uhci_hcd scsi_transport_sas 8250 bnx2 8250_base serial_mctrl_gpio serial_core sunrpc efivarfs
[735471.479131] CR2: 000000000000000b
[735471.481696] ---[ end trace 0000000000000000 ]---
[735471.484181] RIP: 0010:dbuf_sync_list+0x67/0x250 [zfs]
[735471.486816] Code: 0e e8 5d fe ff ff 49 8b 47 10 48 39 c5 74 5e 49 8b 47 10 49 89 c2 4d 2b 57 08 74 51 49 83 7a 18 00 75 4a 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 4c 89 30 4c 89 68 08 49 8b 42 20 48 85 c0 74
[735471.491902] RSP: 0018:ffff8c04a86d3bd0 EFLAGS: 00010246
[735471.494444] RAX: ffff8c04bf173200 RBX: ffff8c039c054000 RCX: 0000000000000003
[735471.496965] RDX: ffff8c04bf171f10 RSI: 0000000000000286 RDI: ffff8c034b7f9800
[735471.499516] RBP: ffff8c04bf171f10 R08: 0000000000000000 R09: ffff8c04a86d3ae0
[735471.502019] R10: ffff8c04bf173200 R11: dead000000000100 R12: 0000000000000000
[735471.504583] R13: dead000000000122 R14: dead000000000100 R15: ffff8c04bf171f00
[735471.507105] FS: 0000000000000000(0000) GS:ffff8c162fdc0000(0000) knlGS:0000000000000000
[735471.509600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[735471.512076] CR2: 000000000000000b CR3: 000000005380a006 CR4: 00000000000226e0
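
A note on reading this trace: the code bytes around the faulting RIP decode to what looks like a doubly linked list unlink. They load next and prev from the entry in RAX (48 8b 08, 48 8b 50 08), store prev into next->prev (<48> 89 51 08, the faulting write), store next into prev->next (48 89 0a), then overwrite the entry's two pointers from R14 and R13. R14 and R13 hold 0xdead000000000100 and 0xdead000000000122, which match the Linux kernel's LIST_POISON1 and LIST_POISON2 values on x86_64. A minimal Python sketch of that arithmetic, assuming the decoding above is right; the function name and the 8-byte prev offset are illustrative assumptions:

```python
# Values copied from the register dump in the trace above.
RCX = 0x0000000000000003   # entry->next as loaded by the unlink code
RDX = 0xFFFF8C04BF171F10   # entry->prev
R13 = 0xDEAD000000000122
R14 = 0xDEAD000000000100
CR2 = 0x000000000000000B   # faulting address reported by the kernel

# Linux list poison constants on x86_64 (include/linux/poison.h).
POISON_POINTER_DELTA = 0xDEAD000000000000
LIST_POISON1 = 0x100 + POISON_POINTER_DELTA
LIST_POISON2 = 0x122 + POISON_POINTER_DELTA

# Assumed layout of a two-pointer list node: next at offset 0, prev at 8.
PREV_OFFSET = 8


def first_unlink_store(next_ptr):
    """Address written by the first unlink step, next->prev = prev."""
    return next_ptr + PREV_OFFSET


print(hex(first_unlink_store(RCX)), hex(CR2))
print(R14 == LIST_POISON1, R13 == LIST_POISON2)
```

If this reading holds, the entry's next pointer contained the value 3 rather than a valid kernel address, which is why the write faulted at a near-NULL address instead of at NULL itself.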

mapmot commented 2 years ago

@rincebrain, another one:

May 02 10:57:39 r610.maze kernel: BUG: kernel NULL pointer dereference, address: 000000000000000b
May 02 10:57:39 r610.maze kernel: #PF: supervisor write access in kernel mode
May 02 10:57:39 r610.maze kernel: #PF: error_code(0x0002) - not-present page
May 02 10:57:39 r610.maze kernel: PGD 0 P4D 0
May 02 10:57:39 r610.maze kernel: Oops: 0002 [#1] SMP NOPTI
May 02 10:57:39 r610.maze kernel: CPU: 20 PID: 1214 Comm: dp_sync_taskq Tainted: P IO 5.17.4-gentoo #1
May 02 10:57:39 r610.maze kernel: Hardware name: Dell Inc. PowerEdge R610/0P8FRD, BIOS 6.6.0 05/22/2018
May 02 10:57:39 r610.maze kernel: RIP: 0010:dbuf_sync_list+0x67/0x250 [zfs]
May 02 10:57:39 r610.maze kernel: Code: 0e e8 5d fe ff ff 49 8b 47 10 48 39 c5 74 5e 49 8b 47 10 49 89 c2 4d 2b 57 08 74 51 49 83 7a 18 00 75 4a 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 4c 89 30 4c 89 68 08 >
May 02 10:57:39 r610.maze kernel: RSP: 0018:ffffa1508d62bbd0 EFLAGS: 00010246
May 02 10:57:39 r610.maze kernel: RAX: ffffa15a93f87800 RBX: ffffa15088cb4180 RCX: 0000000000000003
May 02 10:57:39 r610.maze kernel: RDX: ffffa15bae3a2b10 RSI: ffffffffffffffff RDI: ffffa15087423308
May 02 10:57:39 r610.maze kernel: RBP: ffffa15bae3a2b10 R08: 0000000000000000 R09: ffffffffc0f7e900
May 02 10:57:39 r610.maze kernel: R10: ffffa15a93f87800 R11: 0001514c0001511a R12: 0000000000000000
May 02 10:57:39 r610.maze kernel: R13: dead000000000122 R14: dead000000000100 R15: ffffa15bae3a2b00
May 02 10:57:39 r610.maze kernel: FS: 0000000000000000(0000) GS:ffffa1636fd00000(0000) knlGS:0000000000000000
May 02 10:57:39 r610.maze kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 02 10:57:39 r610.maze kernel: CR2: 000000000000000b CR3: 00000006eba0a003 CR4: 00000000000226e0
May 02 10:57:39 r610.maze kernel: Call Trace:
May 02 10:57:39 r610.maze kernel:
May 02 10:57:39 r610.maze kernel: dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
May 02 10:57:39 r610.maze kernel: dbuf_sync_list+0x43/0x250 [zfs]
May 02 10:57:39 r610.maze kernel: dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
May 02 10:57:39 r610.maze kernel: dbuf_sync_list+0x43/0x250 [zfs]
May 02 10:57:39 r610.maze kernel: dnode_sync+0x3c9/0x13f0 [zfs]
May 02 10:57:39 r610.maze kernel: ? __schedule+0x2bd/0xfb0
May 02 10:57:39 r610.maze kernel: dmu_objset_clone+0x7ac/0x8b0 [zfs]
May 02 10:57:39 r610.maze kernel: taskq_dispatch+0x4b6/0x6a0 [spl]
May 02 10:57:39 r610.maze kernel: ? wake_up_q+0x80/0x80
May 02 10:57:39 r610.maze kernel: ? taskq_dispatch+0x250/0x6a0 [spl]
May 02 10:57:39 r610.maze kernel: kthread+0xb4/0xe0
May 02 10:57:39 r610.maze kernel: ? kthread_complete_and_exit+0x20/0x20
May 02 10:57:39 r610.maze kernel: ret_from_fork+0x1f/0x30
May 02 10:57:39 r610.maze kernel:
May 02 10:57:39 r610.maze kernel: Modules linked in: nls_iso8859_1 nls_cp437 vfat fat vhost_net vhost vhost_iotlb tap veth binfmt_misc ipmi_ssif snd_hda_intel snd_intel_dspcfg intel_powerclamp snd_hda_codec snd>
May 02 10:57:39 r610.maze kernel: CR2: 000000000000000b
May 02 10:57:39 r610.maze kernel: ---[ end trace 0000000000000000 ]---
May 02 10:57:39 r610.maze kernel: RIP: 0010:dbuf_sync_list+0x67/0x250 [zfs]
May 02 10:57:39 r610.maze kernel: Code: 0e e8 5d fe ff ff 49 8b 47 10 48 39 c5 74 5e 49 8b 47 10 49 89 c2 4d 2b 57 08 74 51 49 83 7a 18 00 75 4a 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 4c 89 30 4c 89 68 08 >
May 02 10:57:39 r610.maze kernel: RSP: 0018:ffffa1508d62bbd0 EFLAGS: 00010246
May 02 10:57:39 r610.maze kernel: RAX: ffffa15a93f87800 RBX: ffffa15088cb4180 RCX: 0000000000000003
May 02 10:57:39 r610.maze kernel: RDX: ffffa15bae3a2b10 RSI: ffffffffffffffff RDI: ffffa15087423308
May 02 10:57:39 r610.maze kernel: RBP: ffffa15bae3a2b10 R08: 0000000000000000 R09: ffffffffc0f7e900
May 02 10:57:39 r610.maze kernel: R10: ffffa15a93f87800 R11: 0001514c0001511a R12: 0000000000000000
May 02 10:57:39 r610.maze kernel: R13: dead000000000122 R14: dead000000000100 R15: ffffa15bae3a2b00
May 02 10:57:39 r610.maze kernel: FS: 0000000000000000(0000) GS:ffffa1636fd00000(0000) knlGS:0000000000000000
May 02 10:57:39 r610.maze kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 02 10:57:39 r610.maze kernel: CR2: 000000000000000b CR3: 00000006eba0a003 CR4: 00000000000226e0

mapmot commented 2 years ago

@rincebrain, the last one. This happened while trying to gracefully reboot (systemctl reboot). Nothing graceful happened; the watchdog timer reset the system :(

May 02 12:02:42 r610.maze kernel: BUG: kernel NULL pointer dereference, address: 000000000000000b
May 02 12:02:42 r610.maze kernel: #PF: supervisor write access in kernel mode
May 02 12:02:42 r610.maze kernel: #PF: error_code(0x0002) - not-present page
May 02 12:02:42 r610.maze kernel: PGD 0 P4D 0
May 02 12:02:42 r610.maze kernel: Oops: 0002 [#1] SMP NOPTI
May 02 12:02:42 r610.maze kernel: CPU: 8 PID: 4352 Comm: txg_sync Tainted: P IO 5.17.5-gentoo #2
May 02 12:02:42 r610.maze kernel: Hardware name: Dell Inc. PowerEdge R610/0P8FRD, BIOS 6.6.0 05/22/2018
May 02 12:02:42 r610.maze kernel: RIP: 0010:dbuf_sync_list+0x67/0x250 [zfs]
May 02 12:02:42 r610.maze kernel: Code: 0e e8 5d fe ff ff 49 8b 47 10 48 39 c5 74 5e 49 8b 47 10 49 89 c2 4d 2b 57 08 74 51 49 83 7a 18 00 75 4a 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 4c 89 30 4c 89 68 08 >
May 02 12:02:42 r610.maze kernel: RSP: 0018:ffffa1daab2af938 EFLAGS: 00010246
May 02 12:02:42 r610.maze kernel: RAX: ffffa1da8346f400 RBX: ffffa1da0726f680 RCX: 0000000000000003
May 02 12:02:42 r610.maze kernel: RDX: ffffa1da8346c710 RSI: 0000000000000001 RDI: ffffa1da8346c700
May 02 12:02:42 r610.maze kernel: RBP: ffffa1da8346c710 R08: 0000000000000000 R09: ffffa1dab14ceb40
May 02 12:02:42 r610.maze kernel: R10: ffffa1da8346f400 R11: 0000000000000007 R12: 0000000000000001
May 02 12:02:42 r610.maze kernel: R13: dead000000000122 R14: dead000000000100 R15: ffffa1da8346c700
May 02 12:02:42 r610.maze kernel: FS: 0000000000000000(0000) GS:ffffa1ecefa00000(0000) knlGS:0000000000000000
May 02 12:02:42 r610.maze kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 02 12:02:42 r610.maze kernel: CR2: 000000000000000b CR3: 00000001fa20a003 CR4: 00000000000226e0
May 02 12:02:42 r610.maze kernel: Call Trace:
May 02 12:02:42 r610.maze kernel:
May 02 12:02:42 r610.maze kernel: dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_sync_list+0x43/0x250 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_sync_list+0x43/0x250 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_sync_list+0x43/0x250 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_assign_arcbuf+0x526/0x5b0 [zfs]
May 02 12:02:42 r610.maze kernel: dbuf_sync_list+0x43/0x250 [zfs]
May 02 12:02:42 r610.maze kernel: dnode_sync+0x3c9/0x13f0 [zfs]
May 02 12:02:42 r610.maze kernel: ? dmu_objset_sync+0x196/0x800 [zfs]
May 02 12:02:42 r610.maze kernel: dmu_objset_sync+0x1bf/0x800 [zfs]
May 02 12:02:42 r610.maze kernel: dsl_dataset_sync+0x68/0x270 [zfs]
May 02 12:02:42 r610.maze kernel: dsl_pool_sync+0xb2/0x4f0 [zfs]
May 02 12:02:42 r610.maze kernel: spa_sync+0x53a/0x1570 [zfs]
May 02 12:02:42 r610.maze kernel: ? spa_txg_history_init_io+0xfc/0x110 [zfs]
May 02 12:02:42 r610.maze kernel: txg_register_callbacks+0x33f/0x4f0 [zfs]
May 02 12:02:42 r610.maze kernel: ? txg_register_callbacks+0x90/0x4f0 [zfs]
May 02 12:02:42 r610.maze kernel: ? __thread_exit+0x10/0x70 [spl]
May 02 12:02:42 r610.maze kernel: __thread_exit+0x61/0x70 [spl]
May 02 12:02:42 r610.maze kernel: kthread+0xb4/0xe0
May 02 12:02:42 r610.maze kernel: ? kthread_complete_and_exit+0x20/0x20
May 02 12:02:42 r610.maze kernel: ret_from_fork+0x1f/0x30
May 02 12:02:42 r610.maze kernel:
May 02 12:02:42 r610.maze kernel: Modules linked in: vhost_net vhost vhost_iotlb tap veth binfmt_misc ipmi_ssif snd_hda_intel snd_intel_dspcfg intel_powerclamp snd_hda_codec snd_hda_core bridge snd_pcm stp core>
May 02 12:02:42 r610.maze kernel: CR2: 000000000000000b
May 02 12:02:42 r610.maze kernel: ---[ end trace 0000000000000000 ]---
May 02 12:02:42 r610.maze kernel: RIP: 0010:dbuf_sync_list+0x67/0x250 [zfs]
May 02 12:02:42 r610.maze kernel: Code: 0e e8 5d fe ff ff 49 8b 47 10 48 39 c5 74 5e 49 8b 47 10 49 89 c2 4d 2b 57 08 74 51 49 83 7a 18 00 75 4a 48 8b 08 48 8b 50 08 <48> 89 51 08 48 89 0a 4c 89 30 4c 89 68 08 >
May 02 12:02:42 r610.maze kernel: RSP: 0018:ffffa1daab2af938 EFLAGS: 00010246
May 02 12:02:42 r610.maze kernel: RAX: ffffa1da8346f400 RBX: ffffa1da0726f680 RCX: 0000000000000003
May 02 12:02:42 r610.maze kernel: RDX: ffffa1da8346c710 RSI: 0000000000000001 RDI: ffffa1da8346c700
May 02 12:02:42 r610.maze kernel: RBP: ffffa1da8346c710 R08: 0000000000000000 R09: ffffa1dab14ceb40
May 02 12:02:42 r610.maze kernel: R10: ffffa1da8346f400 R11: 0000000000000007 R12: 0000000000000001
May 02 12:02:42 r610.maze kernel: R13: dead000000000122 R14: dead000000000100 R15: ffffa1da8346c700
May 02 12:02:42 r610.maze kernel: FS: 0000000000000000(0000) GS:ffffa1ecefa00000(0000) knlGS:0000000000000000
May 02 12:02:42 r610.maze kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 02 12:02:42 r610.maze kernel: CR2: 000000000000000b CR3: 00000001fa20a003 CR4: 00000000000226e0
May 02 12:04:40 r610.maze kernel: hrtimer: interrupt took 9116 ns

rincebrain commented 2 years ago

What's the storage backing this pool in what configuration?

I have some recent suspicions about one or two places that might be doing things incorrectly that this might align with.

mapmot commented 2 years ago

> What's the storage backing this pool in what configuration?
>
> I have some recent suspicions about one or two places that might be doing things incorrectly that this might align with.

6 x 1 TB SAS drives in raidz1 on Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 in IT/HBA mode.

stale[bot] commented 1 year ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.