Open Avamander opened 2 years ago
I unplugged all ZFS drives; it's now crashing in a different location:
BUG: kernel NULL pointer dereference, address: 00000000000006c8
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 1 PID: 1105 Comm: agents Tainted: P O 5.17.0-1013-oem #14-Ubuntu
Hardware name: MSI MS-7850/Z87-G41 PC Mate(MS-7850), BIOS V1.8 07/21/2014
RIP: 0010:mutex_lock+0x1e/0x40
Code: c3 cc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc e8 0d e7 ff ff 31 c0 65 48 8b 14 25 c0 fb 01 00 <f0> 49 0f b1 14 24 75 07 4c 8b 65 f8 c9 c3 cc 4c 89 e7 e8 ab ff ff
RSP: 0018:ffffa647c0c17b98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8f20e1ddb300 RSI: 0000000000000000 RDI: 00000000000006c8
RBP: ffffa647c0c17ba0 R08: ffff8f20da80cea0 R09: ffff8f20da80cea0
R10: 0000000040000000 R11: 0000000000000000 R12: 00000000000006c8
R13: 00000000000006e8 R14: 00000000000006c8 R15: 0000000000000000
FS: 00007fc46af01640(0000) GS:ffff8f27bfa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000006c8 CR3: 000000012c0d0002 CR4: 00000000001706e0
Call Trace:
<TASK>
rrw_enter_read_impl+0x23/0x180 [zfs]
rrw_enter+0x1d/0x20 [zfs]
dsl_pool_config_enter+0x1d/0x20 [zfs]
spa_prop_get+0x92/0x860 [zfs]
? spl_kmem_free+0x2b/0x40 [spl]
? kfree+0x379/0x410
? mutex_lock+0x13/0x40
? spa_keystore_fini+0x69/0x90 [zfs]
? mutex_lock+0x13/0x40
? spa_deactivate+0x325/0x450 [zfs]
? spa_name_compare+0xe/0x30 [zfs]
? avl_find+0x6b/0xd0 [zavl]
zfs_ioc_pool_get_props+0x7d/0x140 [zfs]
zfsdev_ioctl_common+0x7bb/0x9e0 [zfs]
? _copy_from_user+0x2e/0x70
zfsdev_ioctl+0x57/0xe0 [zfs]
__x64_sys_ioctl+0x92/0xd0
do_syscall_64+0x5c/0xc0
? asm_exc_page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fc46bf8faff
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
RSP: 002b:00007fc46aefb440 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fc45c0025d0 RCX: 00007fc46bf8faff
RDX: 00007fc46aefb4a0 RSI: 0000000000005a27 RDI: 000000000000000b
RBP: 00007fc46aefea80 R08: 00007fc45c000000 R09: 00007fc45c02fca0
R10: 00007fc45c030000 R11: 0000000000000246 R12: 00007fc46aefb4a0
R13: 0000557848334340 R14: 0000000000000000 R15: 00007fc46aefeb20
</TASK>
Modules linked in: overlay lz4 ip6t_REJECT lz4_compress nf_reject_ipv6 zram nft_chain_nat xt_nat xt_MASQUERADE nf_nat xt_addrtype nft_limit xt_LOG nf_log_syslog xt_limit xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 nft_compat nf_tables nfnetlink zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec intel_rapl_msr snd_hda_core intel_rapl_common x86_pkg_temp_thermal snd_hwdep nls_iso8859_1 intel_powerclamp snd_pcm snd_seq_midi kvm_intel snd_seq_midi_event mei_hdcp mei_pxp kvm snd_rawmidi snd_seq rapl snd_seq_device ch341 intel_cstate snd_timer usbserial mei_me snd input_leds at24 mei soundcore tpm_infineon mac_hid tcp_bbr sch_cake coretemp tcp_lp ip6_tables ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp ramoops pstore_blk parport mtd reed_solomon
pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c dm_crypt dm_mirror dm_region_hash dm_log mlx4_ib ib_uverbs mlx4_en ib_core hid_generic usbhid hid uas usb_storage i915 i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cec aesni_intel rc_core ahci i2c_i801 crypto_simd mxm_wmi r8169 xhci_pci drm cryptd mlx4_core libahci i2c_smbus lpc_ich realtek xhci_pci_renesas wmi video
CR2: 00000000000006c8
---[ end trace 0000000000000000 ]---
RIP: 0010:mutex_lock+0x1e/0x40
Code: c3 cc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc e8 0d e7 ff ff 31 c0 65 48 8b 14 25 c0 fb 01 00 <f0> 49 0f b1 14 24 75 07 4c 8b 65 f8 c9 c3 cc 4c 89 e7 e8 ab ff ff
RSP: 0018:ffffa647c0c17b98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8f20e1ddb300 RSI: 0000000000000000 RDI: 00000000000006c8
RBP: ffffa647c0c17ba0 R08: ffff8f20da80cea0 R09: ffff8f20da80cea0
R10: 0000000040000000 R11: 0000000000000000 R12: 00000000000006c8
R13: 00000000000006e8 R14: 00000000000006c8 R15: 0000000000000000
FS: 00007fc46af01640(0000) GS:ffff8f27bfa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000006c8 CR3: 000000012c0d0002 CR4: 00000000001706e0
I removed zfs.cache and reimported the pool; it is no longer crashing. Yay.
Though this occurrence is a bit spooky in three ways. It is bad that the file can even get corrupted like that (is it not written safely?). It's very bad that there is no validation reading it back in (some sanity checks, please). Lastly, it's terrible that the entire module can crash in so many different ways because of that.
I hope the rest of the safety features compensated, but it certainly does not instil confidence.
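A quick way to read the register dump above: the faulting address (CR2) is a small value that also appears in several general-purpose registers, which is consistent with mutex_lock() being handed the address of a field inside a structure whose base pointer was NULL. The sketch below only cross-checks the values pasted in this thread. Which structure was NULL (possibly the pool pointer passed to dsl_pool_config_enter) is an assumption that the log alone does not prove, and the helper name is hypothetical.

```python
import re

# Register lines copied from the first oops in this thread.
OOPS = """\
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8f20e1ddb300 RSI: 0000000000000000 RDI: 00000000000006c8
RBP: ffffa647c0c17ba0 R08: ffff8f20da80cea0 R09: ffff8f20da80cea0
R10: 0000000040000000 R11: 0000000000000000 R12: 00000000000006c8
R13: 00000000000006e8 R14: 00000000000006c8 R15: 0000000000000000
CR2: 00000000000006c8 CR3: 000000012c0d0002 CR4: 00000000001706e0
"""

PAGE_SIZE = 4096  # x86-64 base page size


def parse_registers(text):
    """Map register names to integer values from 'NAME: 16-hex-digits' pairs."""
    pairs = re.findall(r"\b([A-Z0-9]+): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS)
fault = regs["CR2"]

# Registers that hold exactly the faulting address.
same_as_fault = sorted(n for n, v in regs.items() if v == fault and n != "CR2")

print(f"fault address: {fault:#x} (inside the first page: {fault < PAGE_SIZE})")
print("registers equal to CR2:", same_as_fault)
print(f"R13 - CR2 = {regs['R13'] - fault:#x}")
```

RDI is the first-argument register in the x86-64 calling convention, so a first argument of 0x6c8 looks like "NULL plus a field offset" rather than a random wild pointer. That reading fits a missing NULL check at least as well as it fits memory corruption, but it does not rule either out.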
@Avamander the crashes you observed occurred in unrelated areas of the code and both suggest kernel memory corruption. Is there anything else you changed on the system which might explain why this is no longer happening? It's hard to imagine how removing the cache file would have had any effect on this. It is written safely and rigorously validated.
@behlendorf
Kernel memory corruption sounds incredibly unlikely unless ZFS is somehow inflicting it on itself. The crashes persisted and recurred after reboots (in the same location across kernel and module versions), and they only happened when there was an attempt to mount the ZFS pool.
The only hardware change was to temporarily unplug the pool physically, just to see if ZFS remained stable without the pool. It didn't; the resulting crash is also visible above. That led me to that file, and deleting it made the crash disappear. Then I plugged the pool back in and I didn't encounter the first crash either.
Neither of the crashes has recurred for 1 day and 2 hours, and hopefully the scrub finishes successfully.
It finally imported after four days using an older txg via -T; it started scrubbing and then hung.
Now I'm seeing something very similar to this: https://github.com/openzfs/zfs/issues/7603#issuecomment-1128777521
Similar backtrace:
BUG: unable to handle page fault for address: 00000000000032b8
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 2 PID: 1050 Comm: txg_sync Tainted: P O 5.17.0-1014-oem #15-Ubuntu
Hardware name: MSI MS-7850/Z87-G41 PC Mate(MS-7850), BIOS V1.8 07/21/2014
RIP: 0010:mutex_lock+0x1e/0x40
Code: c3 cc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc e8 0d e7 ff ff 31 c0 65 48 8b 14 25 c0 fb 01 00 <f0> 49 0f b1 14 24 75 07 4c 8b 65 f8 c9 c3 cc 4c 89 e7 e8 ab ff ff
RSP: 0018:ffffaa5ec27935a8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000020 RCX: 802a070200070007
RDX: ffff9ef272190000 RSI: 0000000000ff949f RDI: 00000000000032b8
RBP: ffffaa5ec27935b0 R08: 0000000000028000 R09: 00000c04dbec2000
R10: ffff9ef2c9ff6760 R11: ffffaa5ec2793768 R12: 00000000000032b8
R13: ffff9ef322949738 R14: 0000000000000000 R15: 00000001fffffe00
FS: 0000000000000000(0000) GS:ffff9ef93fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000032b8 CR3: 0000000590c10002 CR4: 00000000001706e0
Call Trace:
<TASK>
dsl_scan_scrub_cb+0x4ae/0x940 [zfs]
? __kmalloc_node+0x1c4/0x3e0
? ktime_get_raw_ts64+0x47/0xd0
dsl_scan_visitbp.isra.0+0x739/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x634/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x888/0xce0 [zfs]
dsl_scan_visit_rootbp.isra.0+0x125/0x1b0 [zfs]
dsl_scan_sync+0x11c0/0x13b0 [zfs]
spa_sync+0x5c6/0x1010 [zfs]
? spa_txg_history_init_io+0x107/0x110 [zfs]
txg_sync_thread+0x2bf/0x450 [zfs]
? txg_register_callbacks+0xb0/0xb0 [zfs]
? __thread_exit+0x20/0x20 [spl]
thread_generic_wrapper+0x64/0x70 [spl]
kthread+0xee/0x120
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Modules linked in: nvme_fabrics overlay ip6t_REJECT nf_reject_ipv6 nft_chain_nat xt_nat lz4 lz4_compress zram xt_MASQUERADE nf_nat xt_addrtype nft_limit xt_LOG nf_log_syslog xt_limit xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 nft_compat nf_tables nfnetlink snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio zfs(PO) snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core zunicode(PO) snd_hwdep intel_rapl_msr nls_iso8859_1 intel_rapl_common snd_pcm zzstd(O) x86_pkg_temp_thermal snd_seq_midi zlua(O) intel_powerclamp snd_seq_midi_event zavl(PO) snd_rawmidi kvm_intel mei_hdcp icp(PO) mei_pxp snd_seq kvm snd_seq_device zcommon(PO) rapl snd_timer znvpair(PO) ch341 intel_cstate spl(O) snd usbserial input_leds mei_me at24 soundcore mei tpm_infineon tcp_bbr mac_hid sch_cake coretemp tcp_lp ip6_tables ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp ramoops parport mtd
reed_solomon pstore_blk efi_pstore pstore_zone ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c dm_crypt dm_mirror dm_region_hash dm_log mlx4_ib ib_uverbs mlx4_en ib_core hid_generic usbhid uas usb_storage hid i915 i2c_algo_bit ttm drm_kms_helper syscopyarea crct10dif_pclmul sysfillrect crc32_pclmul sysimgblt ghash_clmulni_intel fb_sys_fops cec aesni_intel rc_core nvme ahci i2c_i801 crypto_simd mxm_wmi r8169 xhci_pci drm mlx4_core nvme_core libahci i2c_smbus lpc_ich cryptd realtek xhci_pci_renesas wmi video
CR2: 00000000000032b8
---[ end trace 0000000000000000 ]---
RIP: 0010:mutex_lock+0x1e/0x40
Code: c3 cc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc e8 0d e7 ff ff 31 c0 65 48 8b 14 25 c0 fb 01 00 <f0> 49 0f b1 14 24 75 07 4c 8b 65 f8 c9 c3 cc 4c 89 e7 e8 ab ff ff
RSP: 0018:ffffaa5ec27935a8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000020 RCX: 802a070200070007
RDX: ffff9ef272190000 RSI: 0000000000ff949f RDI: 00000000000032b8
RBP: ffffaa5ec27935b0 R08: 0000000000028000 R09: 00000c04dbec2000
R10: ffff9ef2c9ff6760 R11: ffffaa5ec2793768 R12: 00000000000032b8
R13: ffff9ef322949738 R14: 0000000000000000 R15: 00000001fffffe00
FS: 0000000000000000(0000) GS:ffff9ef93fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000032b8 CR3: 0000000122108004 CR4: 00000000001706e0
Surprisingly fragile for a supposedly failure-resistant filesystem.
Could this be a stack overflow, given the 9 levels of dsl_scan_visitbp recursion?
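A rough way to sanity-check the stack overflow idea from the trace itself: count the recursive frames, then estimate how much of the kernel stack was in use from RSP. The estimate below is a sketch under two assumptions that are not confirmed for this kernel build: that the thread stack is 16 KiB and that it is aligned to its own size. The helper names are hypothetical.

```python
from itertools import groupby

# Call trace copied from the txg_sync oops in this thread.
TRACE = """\
dsl_scan_scrub_cb+0x4ae/0x940 [zfs]
? __kmalloc_node+0x1c4/0x3e0
? ktime_get_raw_ts64+0x47/0xd0
dsl_scan_visitbp.isra.0+0x739/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x634/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x3c2/0xce0 [zfs]
dsl_scan_visitbp.isra.0+0x888/0xce0 [zfs]
dsl_scan_visit_rootbp.isra.0+0x125/0x1b0 [zfs]
dsl_scan_sync+0x11c0/0x13b0 [zfs]
spa_sync+0x5c6/0x1010 [zfs]
? spa_txg_history_init_io+0x107/0x110 [zfs]
txg_sync_thread+0x2bf/0x450 [zfs]
? txg_register_callbacks+0xb0/0xb0 [zfs]
? __thread_exit+0x20/0x20 [spl]
thread_generic_wrapper+0x64/0x70 [spl]
kthread+0xee/0x120
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
"""

RSP = 0xFFFFAA5EC27935A8   # RSP from the same oops
THREAD_SIZE = 16 * 1024    # ASSUMPTION: 16 KiB kernel stack, size-aligned


def reliable_frames(trace):
    """Return function names, skipping '?' entries (marked unreliable by the kernel)."""
    names = []
    for line in trace.splitlines():
        line = line.strip()
        if line and not line.startswith("?"):
            names.append(line.split("+", 1)[0])
    return names


def longest_run(names):
    """Return (name, length) of the longest run of identical consecutive frames."""
    runs = ((name, len(list(group))) for name, group in groupby(names))
    return max(runs, key=lambda item: item[1])


def estimated_stack_used(rsp, thread_size=THREAD_SIZE):
    """Bytes between RSP and the top of the stack, under the alignment assumption."""
    return thread_size - (rsp % thread_size)


frames = reliable_frames(TRACE)
print(len(frames), "reliable frames; longest run:", longest_run(frames))
print("estimated stack in use:", estimated_stack_used(RSP), "bytes of", THREAD_SIZE)
```

If those assumptions hold, only a few KiB of the stack were in use at the time of the fault. The faulting address (CR2 = 0x32b8) is also a low address passed to mutex_lock rather than something near RSP, so the trace looks more like a bad pointer than an exhausted stack. Treat this as a hint, not a conclusion.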
[30094.728714] general protection fault, probably for non-canonical address 0xfbff895d6de93330: 0000 [#1] PREEMPT SMP NOPTI
[30094.728731] CPU: 4 PID: 3880 Comm: z_wr_int_2 Tainted: P OE 5.19.0-50-generic #50-Ubuntu
[30094.728736] Hardware name: ASUS System Product Name/ProArt X570-CREATOR WIFI, BIOS 1201 04/19/2023
[30094.728738] RIP: 0010:zio_done+0x4ab/0x1270 [zfs]
[30094.728849] Code: 48 89 45 b8 e9 fe 00 00 00 49 8b 8f 30 01 00 00 48 8b 1c 0a 48 39 5d c0 0f 84 12 01 00 00 48 29 cb 48 85 db 0f 84 7d 0b 00 00 <48> 8b 03 48 89 45 d0 4c 89 fe 4c 89 f7 e8 93 62 ff ff 45 8b 6f 74
[30094.728851] RSP: 0018:ffffa2de9ff53d40 EFLAGS: 00010286
[30094.728853] RAX: 0000000000000000 RBX: fbff895d6de93330 RCX: 0000000000000010
[30094.728854] RDX: ffff895d6de931b0 RSI: 0000000000000000 RDI: 0000000000000000
[30094.728855] RBP: ffffa2de9ff53da0 R08: 0000000000000000 R09: 0000000000000000
[30094.728856] R10: 0000000000000000 R11: 0000000000000000 R12: ffff89545325d1c8
[30094.728857] R13: 0000000000200000 R14: ffff895fbe739860 R15: ffff896192e49d40
[30094.728858] FS: 0000000000000000(0000) GS:ffff89722e100000(0000) knlGS:0000000000000000
[30094.728859] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30094.728860] CR2: 000000c001241000 CR3: 0000000115ed0000 CR4: 0000000000750ee0
[30094.728862] PKRU: 55555554
[30094.728862] Call Trace:
[30094.728864] <TASK>
[30094.728867] zio_execute+0x97/0x170 [zfs]
[30094.728913] taskq_thread+0x2aa/0x4d0 [spl]
[30094.728918] ? wake_up_q+0xa0/0xa0
[30094.728923] ? zio_gang_tree_free+0x70/0x70 [zfs]
[30094.728962] ? taskq_thread_spawn+0x60/0x60 [spl]
[30094.728966] kthread+0xee/0x120
[30094.728968] ? kthread_complete_and_exit+0x20/0x20
[30094.728970] ret_from_fork+0x22/0x30
[30094.728973] </TASK>
[30094.728974] Modules linked in: nvme_fabrics rfcomm cmac algif_hash algif_skcipher af_alg bnep binfmt_misc nvidia_uvm(POE) snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg nvidia_drm(POE) snd_intel_sdw_acpi snd_hda_codec nvidia_modeset(POE) zfs(POE) intel_rapl_msr snd_hda_core zunicode(POE) intel_rapl_common snd_hwdep zzstd(OE) snd_pcm iwlmvm edac_mce_amd snd_seq_midi zlua(OE) snd_seq_midi_event zavl(POE) nvidia(POE) nls_iso8859_1 snd_rawmidi asus_ec_sensors icp(POE) mac80211 btusb kvm snd_seq btrtl libarc4 crct10dif_pclmul ghash_clmulni_intel btbcm snd_seq_device zcommon(POE) drm_kms_helper snd_timer aesni_intel btintel ucsi_ccg crypto_simd btmtk fb_sys_fops znvpair(POE) cryptd iwlwifi snd typec_ucsi syscopyarea sysfillrect rapl wmi_bmof spl(OE) input_leds asus_nb_wmi eeepc_wmi intel_wmi_thunderbolt joydev k10temp ccp typec bluetooth soundcore sysimgblt cfg80211 ecdh_generic ecc mac_hid sch_fq_codel msr parport_pc ppdev lp
[30094.729011] parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore drm ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mfd_aaeon asus_wmi sparse_keymap ixgbe nvme xfrm_algo platform_profile crc32_pclmul atlantic i2c_nvidia_gpu xhci_pci i2c_piix4 i2c_ccgx_ucsi dca ahci igc macsec thunderbolt xhci_pci_renesas nvme_core libahci mdio wmi video
[30094.729066] ---[ end trace 0000000000000000 ]---
[30094.842553] RIP: 0010:zio_done+0x4ab/0x1270 [zfs]
[30094.842645] Code: 48 89 45 b8 e9 fe 00 00 00 49 8b 8f 30 01 00 00 48 8b 1c 0a 48 39 5d c0 0f 84 12 01 00 00 48 29 cb 48 85 db 0f 84 7d 0b 00 00 <48> 8b 03 48 89 45 d0 4c 89 fe 4c 89 f7 e8 93 62 ff ff 45 8b 6f 74
[30094.842647] RSP: 0018:ffffa2de9ff53d40 EFLAGS: 00010286
[30094.842650] RAX: 0000000000000000 RBX: fbff895d6de93330 RCX: 0000000000000010
[30094.842651] RDX: ffff895d6de931b0 RSI: 0000000000000000 RDI: 0000000000000000
[30094.842652] RBP: ffffa2de9ff53da0 R08: 0000000000000000 R09: 0000000000000000
[30094.842653] R10: 0000000000000000 R11: 0000000000000000 R12: ffff89545325d1c8
[30094.842654] R13: 0000000000200000 R14: ffff895fbe739860 R15: ffff896192e49d40
[30094.842656] FS: 0000000000000000(0000) GS:ffff89722e100000(0000) knlGS:0000000000000000
[30094.842657] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30094.842659] CR2: 000000c001241000 CR3: 0000000115ed0000 CR4: 0000000000750ee0
[30094.842660] PKRU: 55555554
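One detail in this third oops is worth spelling out: the non-canonical address in RBX differs from a plausible kernel pointer by very little. The sketch below, assuming 48-bit virtual addresses (4-level paging), compares RBX with its canonical (sign-extended) form and with the valid-looking pointer in RDX from the same dump. The function names are hypothetical.

```python
def canonical_48(addr):
    """Sign-extend bit 47 into bits 63:48, as x86-64 4-level paging requires."""
    low = addr & ((1 << 48) - 1)
    if low & (1 << 47):
        return low | (0xFFFF << 48)
    return low


def flipped_bits(addr):
    """Bit positions where addr differs from its canonical form."""
    diff = addr ^ canonical_48(addr)
    return [bit for bit in range(64) if (diff >> bit) & 1]


RBX = 0xFBFF895D6DE93330  # faulting, non-canonical address from the oops
RDX = 0xFFFF895D6DE931B0  # valid-looking kernel pointer from the same dump

print("canonical form of RBX:", hex(canonical_48(RBX)))
print("bits differing from canonical form:", flipped_bits(RBX))
print("distance from RDX after repair:", hex(canonical_48(RBX) - RDX))
```

A pointer that is exactly one bit away from a canonical address, and then lands within a few hundred bytes of another live pointer, is a pattern often associated with a single-bit flip, for example from faulty memory. It could also come from a software bug, and the log alone cannot distinguish the two.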
I seem to be getting a similar error. In my case, I'm downloading with 3 rclone processes, each downloading 8 files simultaneously.
System information
Kernel version: 5.17.0-1013-oem
Architecture: amd64
OpenZFS version: zfs-2.1.5-1ubuntu2
Also tested: kernel 5.15.0-41-generic + OpenZFS 2.1.2-1ubuntu3
Describe the problem you're observing
When trying to mount a ZFS pool, the kernel tries to dereference a NULL pointer and all reads stall.
Describe how to reproduce the problem
I can reliably reproduce this on my current setup, but I have no idea how to reproduce it on other machines.
Include any warning/errors/backtraces from the system logs
Older kernel, older ZFS: