koverstreet / bcachefs

IBT support [7a90ae891e4a] #436

Open FlyingWombat opened 2 years ago

FlyingWombat commented 2 years ago

The kernel aborts the mount with a "kernel BUG" traceback when IBT is enabled. Kernel support for Indirect Branch Tracking (IBT) was introduced in Linux 5.18 and takes effect on 11th-generation and newer Intel CPUs.

A workaround is to boot with the ibt=off kernel parameter. Compiling with -fcf-protection=branch did not fix the problem.
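For anyone who wants to confirm whether the workaround is active on their machine, here is a minimal sketch. It assumes the CPU feature shows up as `ibt` in the `flags` line of /proc/cpuinfo and that the parameter appears verbatim as `ibt=off` in /proc/cmdline. The function names are made up for illustration.

```python
def ibt_disabled_on_cmdline(cmdline):
    """Return True if the kernel command line contains the ibt=off parameter."""
    # Parameters are whitespace-separated, so compare whole tokens.
    return "ibt=off" in cmdline.split()


def cpu_advertises_ibt(cpuinfo):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'ibt'."""
    for line in cpuinfo.splitlines():
        if line.startswith("flags"):
            # Everything after the first ':' is the space-separated flag list.
            _, _, flags = line.partition(":")
            if "ibt" in flags.split():
                return True
    return False
```

Feed the functions the contents of /proc/cmdline and /proc/cpuinfo respectively. If the CPU advertises `ibt` and `ibt=off` is absent, IBT is presumably being enforced.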

If this is an upstream issue, feel free to close.


Original text:

I moved my hard drives to a new PC and rebuilt the kernel with the exact same config as before. Got this kernel bug when I tried to mount. All 3 drives are readable with bcachefs show-super and dd, so they're not broken.

Dmesg output:

[  295.327453] bcachefs: bch2_fs_open()
[  295.327456] bcachefs: bch2_read_super()
[  295.328878] bcachefs: bch2_read_super() ret 0
[  295.328880] bcachefs: bch2_read_super()
[  295.333755] bcachefs: bch2_read_super() ret 0
[  295.333764] bcachefs: bch2_read_super()
[  295.337459] bcachefs: bch2_read_super() ret 0
[  295.337468] bcachefs: bch2_fs_alloc()
[  295.391731] bcachefs: bch2_fs_journal_init()
[  295.392003] bcachefs: bch2_fs_journal_init() ret 0
[  295.392017] bcachefs: bch2_fs_btree_cache_init()
[  295.392458] bcachefs: bch2_fs_btree_cache_init() ret 0
[  295.392519] bcachefs: bch2_fs_encryption_init()
[  295.392526] bcachefs: bch2_fs_encryption_init() ret 0
[  295.392528] bcachefs: __bch2_fs_compress_init()
[  295.392528] bcachefs: __bch2_fs_compress_init() ret 0
[  295.392534] bcachefs: bch2_fs_fsio_init()
[  295.392558] bcachefs: bch2_fs_fsio_init() ret 0
[  295.392559] bcachefs: bch2_dev_alloc()
[  295.393141] bcachefs: bch2_dev_alloc() ret 0
[  295.393142] bcachefs: bch2_dev_alloc()
[  295.393675] bcachefs: bch2_dev_alloc() ret 0
[  295.393677] bcachefs: bch2_dev_alloc()
[  295.394217] bcachefs: bch2_dev_alloc() ret 0
[  295.394303] bcachefs: bch2_fs_alloc() ret 0
[  295.394368] bcachefs (50acd022-b147-4ac2-a47a-36c9f0a239fb): recovering from clean shutdown, journal seq 9868016
[  295.413759] traps: Missing ENDBR: 0xffffa0ea41109000
[  295.413766] ------------[ cut here ]------------
[  295.413768] kernel BUG at arch/x86/kernel/traps.c:252!
[  295.413774] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  295.413780] CPU: 14 PID: 3008 Comm: mount.bcachefs Not tainted 5.18.10-arch1-1-bcachefs-git-01448-g7a90ae891e4a #1 3c8770cc64cdfa21c17315a56a1134ce476a3d8e
[  295.413788] Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI DDR4(MS-7D25), BIOS 1.70 06/23/2022
[  295.413790] RIP: 0010:exc_control_protection+0xc2/0xd0
[  295.413796] Code: 8b 93 80 00 00 00 be f9 00 00 00 48 c7 c7 73 69 4a 86 e8 e1 23 43 ff e9 72 ff ff ff 48 c7 c7 5a 69 4a 86 e8 cb 51 fa ff 0f 0b <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 0f 1f 00 55 53 48 89
[  295.413799] RSP: 0018:ffffa0ea4596f4b8 EFLAGS: 00010002
[  295.413804] RAX: 0000000000000028 RBX: ffffa0ea4596f4d8 RCX: 0000000000000000
[  295.413806] RDX: 0000000000000000 RSI: ffff8a0c903a16a0 RDI: ffff8a0c903a16a0
[  295.413808] RBP: 0000000000000003 R08: 0000000000000000 R09: ffffa0ea4596f2d8
[  295.413809] R10: 0000000000000003 R11: ffff8a0cb07a9468 R12: 0000000000000000
[  295.413811] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  295.413813] FS:  00007f4152c55d00(0000) GS:ffff8a0c90380000(0000) knlGS:0000000000000000
[  295.413816] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  295.413818] CR2: 00007ff8d549fd37 CR3: 000000011375c001 CR4: 0000000000f70ee0
[  295.413820] PKRU: 55555554
[  295.413822] Call Trace:
[  295.413824]  <TASK>
[  295.413827]  asm_exc_control_protection+0x22/0x30
[  295.413831] RIP: 0010:0xffffa0ea41109000
[  295.413835] Code: Unable to access opcode bytes at RIP 0xffffa0ea41108fd6.
[  295.413837] RSP: 0018:ffffa0ea4596f588 EFLAGS: 00010297
[  295.413839] RAX: ffffa0ea41109000 RBX: 0000000000000000 RCX: 0000000000000000
[  295.413841] RDX: 0000000000000000 RSI: ffff8a05881000a0 RDI: ffffa0ea4596f5f0
[  295.413843] RBP: ffffa0ea4596f6a0 R08: 0000000000000000 R09: 0000000000000016
[  295.413845] R10: ffffa0ea4596f6c0 R11: ffff8a05881000a0 R12: ffff8a0588100088
[  295.413846] R13: ffff8a056d200e00 R14: ffff8a0588100088 R15: ffff8a05881000a0
[  295.413850]  ? krealloc+0x86/0xc0
[  295.413854]  ? validate_bset_keys.constprop.0+0x113/0x7c0
[  295.413860]  ? mempool_alloc+0x86/0x1b0
[  295.413864]  ? bch2_btree_node_read_done+0x4c4/0x1300
[  295.413867]  ? bch2_btree_node_read_done+0x4c4/0x1300
[  295.413872]  ? btree_node_read_work+0x215/0x2f0
[  295.413875]  ? btree_node_read_work+0x215/0x2f0
[  295.413878]  ? bch2_btree_node_read+0x24a/0x4b0
[  295.413881]  ? bch2_btree_node_hash_insert+0x4c/0xb0
[  295.413886]  ? bch2_btree_root_read+0xf8/0x1e0
[  295.413889]  ? bch2_fs_recovery.cold+0x833/0x1830
[  295.413894]  ? get_page_from_freelist+0x13d5/0x1500
[  295.413898]  ? idr_alloc_u32+0xa3/0xe0
[  295.413903]  ? string_nocheck+0xb8/0xf0
[  295.413907]  ? prt_vprintf+0x1f3/0xa10
[  295.413910]  ? __kernfs_new_node+0x17f/0x1e0
[  295.413914]  ? __bch2_sb_field_resize+0x6f/0x100
[  295.413919]  ? __copy_super+0x1dd/0x210
[  295.413923]  ? bch2_fs_start+0x3fb/0x430
[  295.413926]  ? bch2_fs_open+0x584/0x600
[  295.413930]  ? bch2_mount+0x534/0x6c0
[  295.413935]  ? legacy_get_tree+0x28/0x50
[  295.413938]  ? vfs_get_tree+0x26/0xc0
[  295.413941]  ? path_mount+0x46b/0xac0
[  295.413945]  ? __x64_sys_mount+0x117/0x150
[  295.413947]  ? do_syscall_64+0x5c/0x90
[  295.413950]  ? exc_page_fault+0x74/0x170
[  295.413953]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  295.413957]  </TASK>
[  295.413959] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq ccm intel_rapl_msr spi_nor iTCO_wdt intel_pmc_bxt ee1004 iTCO_vendor_support mtd mei_pxp mei_hdcp btusb btrtl btbcm btintel btmtk bluetooth ecdh_generic wmi_bmof mxm_wmi mousedev snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device mc intel_rapl_common intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl intel_cstate intel_uncore pcspkr iwlmvm snd_sof_pci_intel_tgl snd_sof_intel_hda_common mac80211 soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci libarc4 snd_sof_xtensa_dsp snd_sof snd_hda_codec_realtek snd_hda_codec_generic snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus ledtrig_audio snd_soc_core iwlwifi snd_compress i2c_i801 spi_intel_pci ac97_bus spi_intel iwlmei i2c_smbus snd_pcm_dmaengine igc cfg80211
[  295.414016]  snd_hda_codec_hdmi mei_me mei amdgpu i915 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep joydev snd_pcm gpu_sched snd_timer drm_ttm_helper drm_buddy ttm snd intel_gtt drm_dp_helper soundcore serial_multi_instantiate nct6683 rfkill coretemp tpm_crb wmi tpm_tis tpm_tis_core tpm video rng_core acpi_tad acpi_pad vfat fat mac_hid dm_multipath dm_mod fuse bpf_preload ip_tables x_tables ext4 crc16 mbcache jbd2 usbhid nvme crc32c_intel xhci_pci nvme_core xhci_pci_renesas
[  295.414056] ---[ end trace 0000000000000000 ]---
[  295.414058] RIP: 0010:exc_control_protection+0xc2/0xd0
[  295.414061] Code: 8b 93 80 00 00 00 be f9 00 00 00 48 c7 c7 73 69 4a 86 e8 e1 23 43 ff e9 72 ff ff ff 48 c7 c7 5a 69 4a 86 e8 cb 51 fa ff 0f 0b <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 0f 1f 00 55 53 48 89
[  295.414065] RSP: 0018:ffffa0ea4596f4b8 EFLAGS: 00010002
[  295.414066] RAX: 0000000000000028 RBX: ffffa0ea4596f4d8 RCX: 0000000000000000
[  295.414068] RDX: 0000000000000000 RSI: ffff8a0c903a16a0 RDI: ffff8a0c903a16a0
[  295.414069] RBP: 0000000000000003 R08: 0000000000000000 R09: ffffa0ea4596f2d8
[  295.414071] R10: 0000000000000003 R11: ffff8a0cb07a9468 R12: 0000000000000000
[  295.414072] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  295.414074] FS:  00007f4152c55d00(0000) GS:ffff8a0c90380000(0000) knlGS:0000000000000000
[  295.414076] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  295.414077] CR2: ffffa0ea41108fd6 CR3: 000000011375c001 CR4: 0000000000f70ee0
[  295.414079] PKRU: 55555554
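To make the key lines of the trace easier to spot, here is a rough sketch that pulls the "Missing ENDBR" target out of a dmesg dump and checks whether the same address shows up as a raw RIP with no symbol name. The sample lines are copied from the log above. The function names and regular expressions are my own and only match this log format.

```python
import re

# Sample lines copied from the dmesg output above.
SAMPLE_LOG = """\
[  295.413759] traps: Missing ENDBR: 0xffffa0ea41109000
[  295.413768] kernel BUG at arch/x86/kernel/traps.c:252!
[  295.413796] RIP: 0010:exc_control_protection+0xc2/0xd0
[  295.413831] RIP: 0010:0xffffa0ea41109000
"""

# Address reported by the control-protection trap handler.
ENDBR_RE = re.compile(r"traps: Missing ENDBR: (0x[0-9a-fA-F]+)")
# RIP lines that carry a bare address instead of symbol+offset.
RAW_RIP_RE = re.compile(r"RIP: 0010:(0x[0-9a-fA-F]+)")


def missing_endbr_targets(log):
    """Return the branch targets reported as lacking an ENDBR instruction."""
    return [int(m.group(1), 16) for m in ENDBR_RE.finditer(log)]


def unsymbolized_rips(log):
    """Return RIP values that were printed without a symbol name."""
    return [int(m.group(1), 16) for m in RAW_RIP_RE.finditer(log)]


if __name__ == "__main__":
    rips = unsymbolized_rips(SAMPLE_LOG)
    for addr in missing_endbr_targets(SAMPLE_LOG):
        note = "also appears as a raw RIP" if addr in rips else "no matching raw RIP"
        print(hex(addr), note)
```

In this trace the trap target and the unsymbolized RIP are the same address, so the indirect branch landed on code that the kernel could not map to any symbol.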

Version
Linux v5.18.10
bcachefs commit 7a90ae891e4ad96fe2d665c88088730c98c4e79e
bcachefs-tools commit bad0c8c50758b4447d529f61017c1a8c85976a3e

Generic info
superblock info: bcachefs-general-info.txt

system info (that's changed):
cpu: Intel i7-12700K (Alder Lake)
mb: MSI PRO Z690-A WIFI DDR4
boot drive: nvme-Samsung_SSD_970_EVO_Plus

Building kernel with bcachefs debug options yielded a slightly different call trace: bcachefs-5.18.10-g7a90ae891e4a-debug-trace.txt

YellowOnion commented 2 years ago

Looks similar to this: https://github.com/umlaeute/v4l2loopback/issues/476

FlyingWombat commented 2 years ago

Yeah, IBT is the culprit: booting with ibt=off worked. A similar problem is reported here too; maybe we can see how they deal with it: https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256