Open maartenvanmalland opened 2 months ago
Inspired by https://github.com/koverstreet/bcachefs/issues/672, here some extra info as the effect of this crash is indeed a hanging filesystem as well:
# cat /sys/fs/bcachefs/*/internal/journal_debug
dirty journal entries: 763/32768
seq: 6003
seq_ondisk: 5999
last_seq: 5241
last_seq_ondisk: 5241
flushed_seq_ondisk: 5890
watermark: stripe
each entry reserved: 401
nr flush writes: 284
nr noflush writes: 5699
average write size: 246 KiB
nr direct reclaim: 0
nr background reclaim: 544622
reclaim kicked: 0
reclaim runs in: 12 ms
blocked: 0
current entry sectors: 512
current entry error: ok
current entry: closed
unwritten entries:
seq: 6000
refcount: 0
size: 136 KiB
expires: -135360 jiffies
flags: must_flush separate_flush write_started write allocated
seq: 6001
refcount: 0
size: 255 KiB
expires: -135109 jiffies
flags: noflush separate_flush write_started write allocated write done
seq: 6002
refcount: 0
size: 255 KiB
expires: -135109 jiffies
flags: noflush separate_flush write_started write allocated write done
seq: 6003
refcount: 0
size: 255 KiB
expires: -135109 jiffies
flags: noflush separate_flush write_started write allocated write done
last buf closed
replay done: 1
space:
discarded 512:4193280
clean ondisk 512:4193280
clean 512:4193280
total 512:4194304
dev 0:
nr 1582
bucket size 512
available 818:0
discard_idx 307
dirty_ondisk 307 (seq 5241)
dirty_idx 307 (seq 5241)
cur_idx 1069 (seq 6003)
dev 1:
nr 8192
bucket size 512
available 8190:0
discard_idx 0
dirty_ondisk 0 (seq 0)
dirty_idx 0 (seq 0)
cur_idx 0 (seq 0)
# cat /sys/fs/bcachefs/caa656c1-a03a-4fda-a11d-dd1f0ac8d896/dev-0/alloc_debug
buckets sectors fragmented
free 129877 0 0
sb 13 6152 504
journal 1582 809984 0
btree 106 54272 0
user 5898 3013288 6488
cached 801 410912 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 64291 0 0
reserves:
stripe 6386
normal 3221
copygc 56
btree 28
btree_copygc 0
reclaim 0
interior_updates 0
freelist_wait empty
open buckets allocated 107
open buckets this dev 101
open buckets total 1024
open_buckets_wait empty
open_buckets_btree 28
open_buckets_user 78
buckets_to_invalidate 0
btree reserve cache 28`
# cat /sys/kernel/debug/bcachefs/caa656c1-a03a-4fda-a11d-dd1f0ac8d896/btree_transactions
2075159 bch2_moving_ctxt_init
backtrace:
[<0>] kthread+0xde/0x110
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1a/0x30
2075160 bch2_moving_ctxt_init
backtrace:
[<0>] bch2_rebalance_thread+0x66/0xb0 [bcachefs]
[<0>] kthread+0xde/0x110
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1a/0x30
2177571 bch2_do_discards_work
backtrace:
[<0>] bch2_discard_one_bucket+0x88b/0x8c0 [bcachefs]
[<0>] bch2_do_discards_work+0x1f5/0x3a0 [bcachefs]
[<0>] process_one_work+0x189/0x3b0
[<0>] worker_thread+0x277/0x390
[<0>] kthread+0xde/0x110
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1a/0x30
2182976 btree_node_write_work
backtrace:
[<0>] __bch2_trans_commit+0xd18/0x16f0 [bcachefs]
[<0>] __bch2_btree_node_update_key+0x28c/0x570 [bcachefs]
[<0>] bch2_btree_node_update_key+0x35f/0x3a0 [bcachefs]
[<0>] bch2_btree_node_update_key_get_iter+0x1d5/0x330 [bcachefs]
[<0>] btree_node_write_work+0x10a/0x460 [bcachefs]
[<0>] process_one_work+0x189/0x3b0
[<0>] worker_thread+0x277/0x390
[<0>] kthread+0xde/0x110
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1a/0x30
3914811 btree_node_write_work
backtrace:
[<0>] __bch2_trans_commit+0xd18/0x16f0 [bcachefs]
[<0>] __bch2_btree_node_update_key+0x28c/0x570 [bcachefs]
[<0>] bch2_btree_node_update_key+0x35f/0x3a0 [bcachefs]
[<0>] bch2_btree_node_update_key_get_iter+0x1d5/0x330 [bcachefs]
[<0>] btree_node_write_work+0x10a/0x460 [bcachefs]
[<0>] process_one_work+0x189/0x3b0
[<0>] worker_thread+0x277/0x390
[<0>] kthread+0xde/0x110
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1a/0x30
Full dmesg output: http://vpaste.net/FineR
Ok, so the journal got stuck. I just started seeing a similar bug in one of the tests - let me see if I can get that one figured out and maybe add some more debug info.
That would be great. I can test things for you as well if needed; I can let it crash on command :)
I think I had same problem? Nothing stuck, external hdd with bcachefs, deleting many small files
------------[ cut here ]------------
btree trans held srcu lock (delaying memory reclaim) for 18 seconds
WARNING: CPU: 1 PID: 7310 at fs/bcachefs/btree_iter.c:2827 bch2_trans_srcu_unlock+0x126/0x130 [bcachefs]
Modules linked in: wireguard ip6_udp_tunnel udp_tunnel libchacha20poly1305 poly1305_x86_64 chacha_x86_64 curve25519_x86_64 libcurve25519_generic libchacha bcachefs raid6_pq xor uas usb_storage uinput snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm fuse uhid bnep ccm algif_aead crypto_null des3_ede_x86_64 cbc des_generic libdes algif_skcipher cmac md4 algif_hash af_alg xfs vfat fat hid_logitech_hidpp hid_logitech_dj usbhid uvcvideo uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc amdgpu joydev mousedev snd_hda_codec_realtek snd_hda_codec_generic drm_buddy snd_hda_codec_hdmi drm_exec gpu_sched btusb amdxcp crct10dif_pclmul btrtl drm_suballoc_helper crc32_pclmul rtw88_8822ce drm_ttm_helper crc32c_intel btintel rtw88_8822c ttm polyval_clmulni snd_hda_intel snd_pci_ps i2c_algo_bit rtw88_pci polyval_generic snd_intel_dspcfg snd_rpl_pci_acp6x gf128mul drm_display_helper bluetooth ghash_clmulni_intel snd_acp_pci sha512_ssse3 snd_acp_legacy_common snd_hda_codec
hid_multitouch rtw88_core sha256_ssse3 r8169 cec snd_pci_acp6x sha1_ssse3 serio_raw snd_hwdep ecdh_generic mac80211 aesni_intel drm_kms_helper atkbd snd_hda_core snd_pci_acp5x realtek crypto_simd i2c_hid_acpi vivaldi_fmap libarc4 pkcs8_key_parser i2c_hid libps2 wmi_bmof cryptd cfg80211 snd_pcm mdio_devres drm sp5100_tco snd_timer snd_rn_pci_acp3x snd_acp_config ahci snd_soc_acpi libphy snd_pci_acp3x ideapad_laptop snd xhci_pci libahci platform_profile k10temp i2c_piix4 xhci_pci_renesas sparse_keymap rfkill i2c_designware_platform i8042 serio video tpm_crb ccp tpm_tis wmi tpm_tis_core rng_core tpm i2c_designware_core
CPU: 1 PID: 7310 Comm: bch-reclaim/sdb Not tainted 6.8.9-gentoo #1
Hardware name: LENOVO 82KD/LNVNB161216, BIOS GLCN50WW 07/04/2022
RIP: 0010:bch2_trans_srcu_unlock+0x126/0x130 [bcachefs]
Code: 80 97 b1 d1 48 c7 c7 55 6b be c1 48 29 c2 48 b8 cf f7 53 e3 a5 9b c4 20 48 c1 ea 03 c4 e2 cb f6 f0 48 c1 ee 04 e8 8a 2e 59 d0 <0f> 0b eb a0 00 00 00 00 00 00 66 0f 1f 00 53 48 c7 c0 40 a1 bb c1
RSP: 0018:ffffa42f42dc3b70 EFLAGS: 00010246
RAX: ac8defec8a9a4300 RBX: ffff8c5e4c204000 RCX: 0000000000000027
RDX: c0000000ffffdfff RSI: 0000000000000002 RDI: ffff8c5f3e65bb08
RBP: 0000000000000000 R08: 0000000000001fff R09: ffffffff9364aa70
R10: 0000000000005ffd R11: 0000000000000004 R12: ffff8c5e4c204000
R13: ffff8c5dfe4a6c40 R14: ffff8c5dfe480000 R15: ffff8c5dfe480000
FS: 0000000000000000(0000) GS:ffff8c5f3e640000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa15a7befe0 CR3: 0000000185610000 CR4: 0000000000350ef0
Call Trace:
<TASK>
? __warn+0xca/0x1c0
? bch2_trans_srcu_unlock+0x126/0x130 [bcachefs]
? report_bug+0x16a/0x210
? handle_bug+0x3d/0x80
? exc_invalid_op+0x1a/0x50
? asm_exc_invalid_op+0x1a/0x20
? bch2_trans_srcu_unlock+0x126/0x130 [bcachefs]
bch2_trans_begin+0x4aa/0x500 [bcachefs]
bch2_btree_write_buffer_flush_locked+0xbd/0xb00 [bcachefs]
? __schedule+0x5a8/0x12e0
? asm_sysvec_apic_timer_interrupt+0x1a/0x20
? schedule+0x6e/0xc0
? schedule_preempt_disabled+0x10/0x20
? __mutex_lock+0x1a8/0x650
btree_write_buffer_flush_seq+0x249/0x2c0 [bcachefs]
? local_clock_noinstr+0x3a/0xb0
? local_clock+0x10/0x30
bch2_btree_write_buffer_journal_flush+0x34/0x60 [bcachefs]
journal_flush_pins+0x1f8/0x310 [bcachefs]
? bch2_journal_key_to_wb_slowpath+0x230/0x230 [bcachefs]
__bch2_journal_reclaim+0x1f6/0x370 [bcachefs]
bch2_journal_reclaim_thread+0x84/0x170 [bcachefs]
? bch2_journal_reclaim_start+0xe0/0xe0 [bcachefs]
kthread+0xe3/0x100
? kthread_blkcg+0x30/0x30
ret_from_fork+0x32/0x40
? kthread_blkcg+0x30/0x30
ret_from_fork_asm+0x11/0x20
</TASK>
---[ end trace 0000000000000000 ]---```
Gentoo, kernel 6.8.9
I have had the same issue as well:
------------[ cut here ]------------
btree trans held srcu lock (delaying memory reclaim) for 10 seconds
WARNING: CPU: 14 PID: 70131 at fs/bcachefs/btree_iter.c:2825 bch2_trans_put+0x250/0x280>
Modules linked in: udf crc_itu_t isofs cdrom bcachefs xor raid6_pq libcrc32c crc32c_gen>
ip_tables x_tables f2fs crc32_generic lz4hc_compress lz4_compress hid_generic usbhid a>
CPU: 14 PID: 70131 Comm: mk-screens Tainted: G OE 6.8.9-arch1-1 #1 b1154>
Hardware name: Gigabyte Technology Co., Ltd. B550 AORUS ELITE/B550 AORUS ELITE, BIOS F1>
RIP: 0010:bch2_trans_put+0x250/0x280 [bcachefs]
Code: 48 8b 15 d3 88 42 c7 48 c7 c7 e8 84 12 c2 48 b8 07 3a 6d a0 d3 06 3a 6d 48 29 ca >
RSP: 0018:ffffbe69b14d7658 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff958661450000 RCX: 0000000000000027
RDX: ffff958d5eba19c8 RSI: 0000000000000001 RDI: ffff958d5eba19c0
RBP: ffff95878b580000 R08: 0000000000000000 R09: ffffbe69b14d74e8
R10: ffffbe69b14d74e0 R11: 0000000000000003 R12: ffff958abf1cbfc0
R13: ffff958661450000 R14: ffff95878b5a63a8 R15: ffff95878b580000
FS: 000071db176006c0(0000) GS:ffff958d5eb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000078c0b91ed7f0 CR3: 00000004ceb36000 CR4: 0000000000350ef0
Call Trace:
<TASK>
? bch2_trans_put+0x250/0x280 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
? __warn+0x81/0x130
? bch2_trans_put+0x250/0x280 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
? report_bug+0x171/0x1a0
? prb_read_valid+0x1b/0x30
? srso_return_thunk+0x5/0x5f
? handle_bug+0x3c/0x80
? exc_invalid_op+0x17/0x70
? asm_exc_invalid_op+0x1a/0x20
? bch2_trans_put+0x250/0x280 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
? bch2_trans_put+0x250/0x280 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
__bch2_create+0x5f8/0x6f0 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
? srso_return_thunk+0x5/0x5f
? bch2_create+0x2a/0x60 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
bch2_create+0x2a/0x60 [bcachefs 267e5ec8bb3d305ae65d058b667703fa6d81777c]
path_openat+0xe9f/0x1190
do_filp_open+0xb3/0x160
do_sys_openat2+0xab/0xe0
__x64_sys_openat+0x57/0xa0
do_syscall_64+0x83/0x170
? srso_return_thunk+0x5/0x5f
? get_page_from_freelist+0x18e5/0x1a80
? srso_return_thunk+0x5/0x5f
? file_tty_write.isra.0+0x219/0x330
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? __alloc_pages+0x1a3/0x350
? alloc_pages_mpol+0x95/0x1f0
? srso_return_thunk+0x5/0x5f
? __pte_offset_map+0x1b/0x180
? srso_return_thunk+0x5/0x5f
? __mod_memcg_lruvec_state+0x97/0x110
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
? set_ptes.isra.0+0x28/0x90
? srso_return_thunk+0x5/0x5f
? do_anonymous_page+0x357/0x6e0
? pmdp_collapse_flush+0x10/0x60
? srso_return_thunk+0x5/0x5f
? __handle_mm_fault+0xb4a/0xe40
? do_syscall_64+0x90/0x170
? srso_return_thunk+0x5/0x5f
? __count_memcg_events+0x4d/0xc0
? srso_return_thunk+0x5/0x5f
? count_memcg_events.constprop.0+0x1a/0x30
? srso_return_thunk+0x5/0x5f
? handle_mm_fault+0xa2/0x360
? srso_return_thunk+0x5/0x5f
? do_user_addr_fault+0x304/0x670
? srso_return_thunk+0x5/0x5f
? srso_return_thunk+0x5/0x5f
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x71db3280e705
Code: 8b 55 d0 48 89 45 b0 75 a0 44 89 55 9c e8 93 8a f8 ff 44 8b 55 9c 89 da 4c 89 e6 >
RSP: 002b:000071db175fd3e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000080241 RCX: 000071db3280e705
RDX: 0000000000080241 RSI: 000071db175fd4d8 RDI: 00000000ffffff9c
RBP: 000071db175fd450 R08: 0000000000000000 R09: 010f0f010f0f0f07
R10: 00000000000001b6 R11: 0000000000000293 R12: 000071db175fd4d8
R13: 000071db175fd880 R14: 000071db175fd4d8 R15: 00000000000001b6
</TASK>
---[ end trace 0000000000000000 ]---
On my NAS I have the following setup:
I guess this is one of those setups that you might be thinking, oh dear, what kind of users am I attracting here :). Anyways, this setup performs like a madman, up until the point it crashes/hangs:
This only happens when there is intense writing (and I'm guessing when the data begins to overflow to the hdd target).