koverstreet / bcachefs

corrupt btree node before write at btree extents level #716

Open Lykos153 opened 1 month ago

Lykos153 commented 1 month ago

I'm getting the feeling that I'm spamming issues right now. Sorry about that. If there's a more appropriate place to ask about this, I'd be happy to be pointed to it, but as searching the Internet for the string "corrupt btree node before write at btree extents level" yielded zero results, I decided to open another issue.

I just got

[63048.843482] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): corrupt btree node before write at btree extents level 0/2
[63048.843498] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): inconsistency detected - emergency read only at journal seq 2860801
[63048.843517]  validate_bset_for_write+0xc0/0x150 [bcachefs]
[63048.843600]  __bch2_btree_node_write+0xb84/0xd20 [bcachefs]
[63048.843653]  bch2_btree_node_write+0x5d/0x130 [bcachefs]
[63048.843693]  __btree_node_flush+0xf2/0x140 [bcachefs]
[63048.843736]  ? __pfx_bch2_btree_node_flush0+0x10/0x10 [bcachefs]
[63048.843779]  journal_flush_pins.constprop.0+0x1ad/0x2c0 [bcachefs]
[63048.843835]  __bch2_journal_reclaim+0x1db/0x370 [bcachefs]
[63048.843889]  bch2_journal_reclaim_thread+0x6e/0x160 [bcachefs]
[63048.843952]  ? __pfx_bch2_journal_reclaim_thread+0x10/0x10 [bcachefs]
[63048.923030] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): unshutdown complete, journal seq 2860801

while copying lots of stuff onto the file system.

bcachefs fs usage -h
Filesystem: 677cf0a7-1abe-4ce3-876c-2ca63301229d
Size: 8.80 TiB
Used: 5.98 TiB
Online reserved: 1.84 MiB

Data type  Required/total  Durability  Devices
reserved:  1/0                         []                     91.7 MiB
btree:     1/3             3           [sde1 nvme0n1p2 sdf1]  30.0 MiB
btree:     1/3             3           [sdc1 sde1 sdh1]       1.65 GiB
btree:     1/2             2           [sdc1 sdf1]            8.22 GiB
btree:     1/3             3           [sdc1 nvme0n1p2 sdf1]  30.0 MiB
btree:     1/2             2           [sdc1 nvme0n1p2]       8.42 GiB
btree:     1/3             3           [sdc1 sdd1 nvme0n1p2]  2.19 GiB
btree:     1/3             3           [sde1 sdd1 nvme0n1p2]  2.14 GiB
btree:     1/3             3           [sdd1 nvme0n1p2 sdf1]  48.9 GiB
btree:     1/2             2           [sdc1 sde1]            5.07 GiB
btree:     1/2             2           [sde1 nvme0n1p2]       7.92 GiB
btree:     1/2             2           [sde1 sdf1]            7.75 GiB
user:      1/1             1           [sdf1]                 3.29 GiB
user:      1/1             1           [sdh1]                 2.60 TiB
user:      1/1             1           [sdc1]                 91.4 GiB
user:      1/1             1           [nvme0n1p2]            413 GiB
user:      1/1             1           [sde1]                 85.4 GiB
user:      1/1             1           [sdd1]                 43.8 GiB
user:      1/1             1           [sdb1]                 2.64 TiB
cached:    1/1             1           [sdd1]                 179 GiB
cached:    1/1             1           [sde1]                 2.76 GiB
cached:    1/1             1           [sdc1]                 2.81 GiB
cached:    1/1             1           [sdh1]                 1.68 GiB
cached:    1/1             1           [nvme0n1p2]            229 GiB
cached:    1/1             1           [sdf1]                 196 GiB

hdd.hdd1 (device 2): sdh1 ro
               data      buckets   fragmented
free:          125 GiB   511131
sb:            3.00 MiB  13        252 KiB
journal:       2.00 GiB  8192
btree:         562 MiB   2247
user:          2.60 TiB  10913266
cached:        1.68 GiB  11500
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  0 B       0
capacity:      2.73 TiB  11446349

hdd.hdd2 (device 5): sdb1 rw
               data      buckets   fragmented
free:          86.5 GiB  88527
sb:            3.00 MiB  4         1020 KiB
journal:       8.00 GiB  8192
btree:         0 B       0
user:          2.64 TiB  2764864   3.11 MiB
cached:        0 B       0
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  0 B       0
capacity:      2.73 TiB  2861587

hdd.hdd3 (device 3): sdd1 rw
               data      buckets   fragmented
free:          2.48 TiB  2604193
sb:            3.00 MiB  4         1020 KiB
journal:       8.00 GiB  8192
btree:         17.8 GiB  18193     11.5 MiB
user:          43.8 GiB  44913     57.3 MiB
cached:        179 GiB   186081
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  11.0 MiB  11
capacity:      2.73 TiB  2861587

ssd.ssd1 (device 0): sdc1 rw
               data      buckets   fragmented
free:          12.0 GiB  49165
sb:            3.00 MiB  13        252 KiB
journal:       954 MiB   3815
btree:         12.1 GiB  49740
user:          91.4 GiB  374544    76.0 KiB
cached:        2.63 GiB  11136
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  0 B       0
capacity:      119 GiB   488413

ssd.ssd2 (device 1): sde1 rw
               data      buckets   fragmented
free:          11.2 GiB  46041
sb:            3.00 MiB  13        252 KiB
journal:       894 MiB   3577
btree:         11.6 GiB  47684
user:          85.4 GiB  349705    176 KiB
cached:        2.59 GiB  10869
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  0 B       0
capacity:      112 GiB   457889

ssd.ssd4 (device 4): nvme0n1p2 rw
               data      buckets   fragmented
free:          279 GiB   572371
sb:            3.00 MiB  7         508 KiB
journal:       3.91 GiB  8000
btree:         25.9 GiB  53152     9.25 MiB
user:          413 GiB   846468    5.83 MiB
cached:        229 GiB   472406
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  9.00 MiB  18
capacity:      953 GiB   1952422

ssd.ssd5 (device 6): sdf1 rw
               data      buckets   fragmented
free:          9.61 GiB  19688
sb:            3.00 MiB  7         508 KiB
journal:       1.82 GiB  3726
btree:         24.3 GiB  49815     8.75 MiB
user:          3.29 GiB  6741      2.29 MiB
cached:        193 GiB   396952
parity:        0 B       0
stripe:        0 B       0
need_gc_gens:  0 B       0
need_discard:  9.50 MiB  19
capacity:      233 GiB   476948

I also ran

bcachefs fsck -ny

```
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_alloc_info... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_lrus...incorrect lru entry: lru read time 9034497496
  u64s 5 type set 844433964629464:844424930554135:0 len 0 ver 0
  for u64s 5 type deleted 3:422167:0 len 0 ver 0, not fixing
incorrect lru entry: lru read time 10267875312
  u64s 5 type set 844435198007280:844424930553922:0 len 0 ver 0
  for u64s 5 type deleted 3:421954:0 len 0 ver 0, not fixing
done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_btree_backpointers... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_backpointers_to_extents... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_extents_to_backpointers...missing backpointer for btree=extents l=1 u64s 13 type btree_ptr_v2 5700043:665048:4294967294 len 0 ver 0: seq 39c8f0667d77f7c written 344 min_key 5700043:275480:U32_MAX durability: 3 ptr: 3:257318:512 gen 0 ptr: 4:1382542:512 gen 0 ptr: 6:241449:0 gen 1
  got: u64s 5 type deleted 4:1449700884480:0 len 0 ver 0
  want: u64s 9 type backpointer 4:1449700884480:0 len 0 ver 0: bucket=4:1382542:0 btree=extents l=1 offset=512:0 len=512 pos=5700043:665048:4294967294: fix? (y,n, or Y,N for all errors of this type)
done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_alloc_to_lru_refs... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_snapshot_trees... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_snapshots... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_subvols... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_subvol_children... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): delete_dead_snapshots... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_root... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_subvolume_structure... done
bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_directory_structure...unreachable inode u64s 16 type inode_v3 0:2903267:4294967293 len 0 ver 0: mode=100444 flags= (7300000) journal_seq=2666230 bi_size=260 bi_sectors=8 bi_version=0 bi_atime=1912211233956985 bi_ctime=2425523466637133 bi_mtime=16727570274002083825 bi_otime=1912189288589882 bi_uid=0 bi_gid=0 bi_nlink=1 bi_generation=0 bi_dev=0 bi_data_checksum=0 bi_compression=0 bi_project=0 bi_background_compression=0 bi_data_replicas=0 bi_promote_target=0 bi_foreground_target=0 bi_background_target=0 bi_erasure_code=0 bi_fields_set=0 bi_dir=0 bi_dir_offset=0 bi_subvol=0 bi_parent_subvol=0 bi_nocow=0 : fix? (y,n, or Y,N for all errors of this type)
done
```

though I'm unsure what it tells me and, coming from btrfs, I'm very hesitant to let fsck touch anything. Should I just do fix_errors?

koverstreet commented 1 month ago

fsck is quite safe on bcachefs

sorry I'm slow getting to these, and I'm about to be offline for a week - but I'll try to get to these soon :)

it does look like you trimmed the important part of the log though - i.e. why the btree node was corrupt

Lykos153 commented 1 month ago

Dammit :/ Apparently dmesg | grep bcachefs is not a good idea. Well the log is gone now, with the fs read-only and all. But after a reboot it mounted successfully read-write again (without -o fsck,fix_errors). Not sure if it's interesting, but this is what I got during boot:

```
[ 14.832562] stage-1-init: [Mon Jul 22 17:14:24 UTC 2024] enter passphrase for /bcachefs: unlocking successful.
[ 14.843870] stage-1-init: [Mon Jul 22 17:14:24 UTC 2024] mounting none on /...
[ 14.856720] stage-1-init: [Mon Jul 22 17:14:24 UTC 2024] mounting UUID=677cf0a7-1abe-4ce3-876c-2ca63301229d on /bcachefs...
[ 14.912293] raid6: avx2x4 gen() 33908 MB/s
[ 14.929293] raid6: avx2x2 gen() 34186 MB/s
[ 14.946293] raid6: avx2x1 gen() 26595 MB/s
[ 14.946294] raid6: using algorithm avx2x2 gen() 34186 MB/s
[ 14.963293] raid6: .... xor() 20548 MB/s, rmw enabled
[ 14.963294] raid6: using avx2x2 recovery algorithm
[ 14.964045] xor: automatically using best checksumming function avx
[ 15.096377] bcachefs (UUID=677cf0a7-1abe-4ce3-876c-2ca63301229d): error reading superblock: error opening UUID=677cf0a7-1abe-4ce3-876c-2ca63301229d: ENOENT
[ 15.144256] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): mounting version 1.7: mi_btree_bitmap opts=metadata_replicas=3,compression=zstd,foreground_target=ssd,background_target=hdd,promote_target=ssd
[ 15.144262] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): recovering from unclean shutdown
[ 72.636741] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): journal read done, replaying entries 2860195-2860797
[ 72.636746] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): dropped unflushed entries 2860798-2860801
[ 74.887888] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): alloc_read... done
[ 75.033780] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): stripes_read... done
[ 75.033791] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): snapshots_read... done
[ 75.220299] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): going read-write
[ 75.221666] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): journal_replay... done
[ 85.567513] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): resume_logged_ops... done
[ 85.567521] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): delete_dead_inodes...
done [ 85.605782] stage-1-init: [Mon Jul 22 17:15:35 UTC 2024] mounting /mnt-root/bcachefs/nix on /nix... [...] [ 97.198653] ------------[ cut here ]------------ [ 97.198657] btree trans held srcu lock (delaying memory reclaim) for 10 seconds [ 97.198679] WARNING: CPU: 2 PID: 505 at fs/bcachefs/btree_iter.c:2871 bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 97.198795] Modules linked in: poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs libcrc32c crc32c_generic lz4_compress lz4hc_compress xor raid6_pq hid_generic usbhid hid usb_storage sd_mod i915 xhci_pci ahci xhci_pci_renesas libahci nvme nvme_core libata e1000e xhci_hcd ehci_pci i2c_algo_bit drm_buddy ttm intel_gtt nvme_auth t10_pi ehci_hcd drm_display_helper scsi_mod crc32c_intel sha256_ssse3 firmware_class crc64_rocksoft crc_t10dif ptp crct10dif_generic crct10dif_pclmul cec crc64 crct10dif_common pps_core scsi_common rtc_cmos video wmi backlight dm_snapshot dm_bufio dm_mod dax [ 97.198854] CPU: 2 PID: 505 Comm: perl Not tainted 6.9.9 #1-NixOS [ 97.198858] Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./Z87 Extreme4, BIOS P3.50 03/11/2018 [ 97.198860] RIP: 0010:bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 97.198959] Code: 5a de 48 c7 c7 d0 65 db c0 48 b8 cf f7 53 e3 a5 9b c4 20 48 29 ca 48 c1 ea 03 48 f7 e2 48 89 d6 48 c1 ee 04 e8 c6 5c a5 dc 90 <0f> 0b 90 90 e9 5f ff ff ff 90 0f 0b 90 e9 6c ff ff ff 0f 1f 00 90 [ 97.198962] RSP: 0018:ffffbe8880bf39d0 EFLAGS: 00010282 [ 97.198966] RAX: 0000000000000000 RBX: ffffa0368ed58000 RCX: c0000000ffffdfff [ 97.198968] RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: 0000000000000001 [ 97.198970] RBP: ffffa0368d9c0000 R08: 0000000000000000 R09: 0000000000000003 [ 97.198972] R10: ffffbe8880bf3878 R11: ffffffff9f33a128 R12: ffffa0368ed58478 [ 97.198974] R13: ffffa0368ed58000 R14: 0000000000000006 R15: ffffa0368ed58478 [ 97.198976] FS: 00007f2b72b9b740(0000) GS:ffffa03b97700000(0000) knlGS:0000000000000000 [ 97.198979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 97.198981] CR2: 00007f2b731050e8 CR3: 00000001178dc001 CR4: 00000000001706f0 [ 97.198984] Call Trace: [ 97.198988] [ 97.198991] ? __warn+0x80/0x120 [ 97.198997] ? bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 97.199095] ? report_bug+0x164/0x190 [ 97.199100] ? handle_bug+0x3d/0x80 [ 97.199106] ? exc_invalid_op+0x17/0x70 [ 97.199112] ? asm_exc_invalid_op+0x1a/0x20 [ 97.199118] ? bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 97.199212] ? bch2_trans_begin+0xf8/0x600 [bcachefs] [ 97.199312] bch2_trans_begin+0x5a5/0x600 [bcachefs] [ 97.199408] ? __bch2_create+0x354/0x5c0 [bcachefs] [ 97.199527] __bch2_create+0x197/0x5c0 [bcachefs] [ 97.199648] ? 
bch2_create+0x2a/0x60 [bcachefs] [ 97.199764] bch2_create+0x2a/0x60 [bcachefs] [ 97.199879] path_openat+0xe8d/0x1150 [ 97.199886] do_filp_open+0xc4/0x170 [ 97.199893] do_sys_openat2+0xab/0xe0 [ 97.199899] __x64_sys_openat+0x57/0xa0 [ 97.199903] do_syscall_64+0xb8/0x200 [ 97.199908] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 97.199912] RIP: 0033:0x7f2b72c9a2b2 [ 97.199924] Code: 83 e2 40 75 53 89 f0 f7 d0 a9 00 00 41 00 74 48 80 3d a1 9d 0e 00 00 74 6c 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 92 00 00 00 48 8b 54 24 28 64 48 2b 14 25 [ 97.199927] RSP: 002b:00007ffdaabee860 EFLAGS: 00000202 ORIG_RAX: 0000000000000101 [ 97.199931] RAX: ffffffffffffffda RBX: 0000000000080041 RCX: 00007f2b72c9a2b2 [ 97.199933] RDX: 0000000000080041 RSI: 000000000a73b100 RDI: 00000000ffffff9c [ 97.199935] RBP: 000000000a73b100 R08: 00007ffdaabeeae0 R09: 00000000ffffffff [ 97.199937] R10: 00000000000001a4 R11: 0000000000000202 R12: 0000000000000000 [ 97.199939] R13: 000000000a73b100 R14: 000000000a727ec0 R15: 0000000000000000 [ 97.199943] [ 97.199944] ---[ end trace 0000000000000000 ]--- [ 115.726021] ------------[ cut here ]------------ [ 115.726026] btree trans held srcu lock (delaying memory reclaim) for 10 seconds [ 115.726043] WARNING: CPU: 0 PID: 505 at fs/bcachefs/btree_iter.c:2871 bch2_trans_put+0x23e/0x270 [bcachefs] [ 115.726137] Modules linked in: poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs libcrc32c crc32c_generic lz4_compress lz4hc_compress xor raid6_pq hid_generic usbhid hid usb_storage sd_mod i915 xhci_pci ahci xhci_pci_renesas libahci nvme nvme_core libata e1000e xhci_hcd ehci_pci i2c_algo_bit drm_buddy ttm intel_gtt nvme_auth t10_pi ehci_hcd drm_display_helper scsi_mod crc32c_intel sha256_ssse3 firmware_class crc64_rocksoft crc_t10dif ptp crct10dif_generic crct10dif_pclmul cec crc64 crct10dif_common pps_core scsi_common rtc_cmos video wmi backlight dm_snapshot dm_bufio dm_mod dax [ 
115.726186] CPU: 0 PID: 505 Comm: perl Tainted: G W 6.9.9 #1-NixOS [ 115.726189] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z87 Extreme4, BIOS P3.50 03/11/2018 [ 115.726191] RIP: 0010:bch2_trans_put+0x23e/0x270 [bcachefs] [ 115.726281] Code: 5a de 48 c7 c7 d0 65 db c0 48 b8 cf f7 53 e3 a5 9b c4 20 48 29 ca 48 c1 ea 03 48 f7 e2 48 89 d6 48 c1 ee 04 e8 f3 4f a5 dc 90 <0f> 0b 90 90 8b b5 a8 00 00 00 49 8d be 68 36 00 00 83 fe 01 77 0a [ 115.726284] RSP: 0018:ffffbe8880bf3a40 EFLAGS: 00010282 [ 115.726287] RAX: 0000000000000000 RBX: ffffa0368e8d5270 RCX: c0000000ffffdfff [ 115.726289] RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: 0000000000000001 [ 115.726291] RBP: ffffa0368ed58000 R08: 0000000000000000 R09: 0000000000000003 [ 115.726293] R10: ffffbe8880bf38e8 R11: ffffffff9f33a128 R12: ffffa0368ed58000 [ 115.726294] R13: ffffbe8880bf3b80 R14: ffffa0368d9c0000 R15: ffffa0368e8d5270 [ 115.726296] FS: 00007f2b72b9b740(0000) GS:ffffa03b97600000(0000) knlGS:0000000000000000 [ 115.726299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 115.726301] CR2: 000000000aeba058 CR3: 00000001178dc003 CR4: 00000000001706f0 [ 115.726303] Call Trace: [ 115.726305] [ 115.726308] ? __warn+0x80/0x120 [ 115.726313] ? bch2_trans_put+0x23e/0x270 [bcachefs] [ 115.726402] ? report_bug+0x164/0x190 [ 115.726406] ? handle_bug+0x3d/0x80 [ 115.726412] ? exc_invalid_op+0x17/0x70 [ 115.726416] ? asm_exc_invalid_op+0x1a/0x20 [ 115.726421] ? bch2_trans_put+0x23e/0x270 [bcachefs] [ 115.726509] __bch2_create+0x4d5/0x5c0 [bcachefs] [ 115.726618] ? 
bch2_create+0x2a/0x60 [bcachefs] [ 115.726723] bch2_create+0x2a/0x60 [bcachefs] [ 115.726828] path_openat+0xe8d/0x1150 [ 115.726834] do_filp_open+0xc4/0x170 [ 115.726840] do_sys_openat2+0xab/0xe0 [ 115.726845] __x64_sys_openat+0x57/0xa0 [ 115.726848] do_syscall_64+0xb8/0x200 [ 115.726853] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 115.726856] RIP: 0033:0x7f2b72c9a2b2 [ 115.726866] Code: 83 e2 40 75 53 89 f0 f7 d0 a9 00 00 41 00 74 48 80 3d a1 9d 0e 00 00 74 6c 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 92 00 00 00 48 8b 54 24 28 64 48 2b 14 25 [ 115.726868] RSP: 002b:00007ffdaabee860 EFLAGS: 00000202 ORIG_RAX: 0000000000000101 [ 115.726872] RAX: ffffffffffffffda RBX: 0000000000080041 RCX: 00007f2b72c9a2b2 [ 115.726874] RDX: 0000000000080041 RSI: 000000000aeb88f0 RDI: 00000000ffffff9c [ 115.726875] RBP: 000000000aeb88f0 R08: 00007ffdaabeeae0 R09: 00000000ffffffff [ 115.726877] R10: 00000000000001a4 R11: 0000000000000202 R12: 0000000000000000 [ 115.726879] R13: 000000000aeb88f0 R14: 000000000a727ec0 R15: 0000000000000000 [ 115.726882] [ 115.726883] ---[ end trace 0000000000000000 ]--- ```
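
On the trimmed log from the first report: a plain `dmesg | grep bcachefs` drops every kernel line that does not itself contain the string "bcachefs", and in this thread the lines explaining an error (for example "about to insert invalid key in data update path") carry no such prefix. A minimal sketch of a context-preserving filter follows. The function name `grep_with_context`, its parameters, and the abbreviated sample lines are illustrative assumptions, not part of bcachefs or any existing tool.

```python
from typing import Iterable, List


def grep_with_context(lines: Iterable[str], needle: str,
                      before: int = 20, after: int = 20) -> List[str]:
    """Keep each line containing `needle` plus `before`/`after` neighbours."""
    buf = list(lines)
    keep = set()
    for i, line in enumerate(buf):
        if needle in line:
            start = max(0, i - before)
            stop = min(len(buf), i + after + 1)
            keep.update(range(start, stop))
    # Emit the surviving lines in their original order, without duplicates.
    return [buf[i] for i in sorted(keep)]


# Lines abbreviated from the second log in this thread ("..." marks cuts
# made for this example only).
sample = [
    "[18344.587199] about to insert invalid key in data update path",
    "[18344.587201] old: u64s 9 type extent ...",
    "[18344.587216] bcachefs (677cf0a7-...): fatal error - emergency read only",
    "[18344.587274] ------------[ cut here ]------------",
]

# With no context, only the line that mentions "bcachefs" survives.
plain = grep_with_context(sample, "bcachefs", before=0, after=0)
# With two lines of leading context, the reason for the error is kept too.
with_context = grep_with_context(sample, "bcachefs", before=2, after=0)
print(len(plain), len(with_context))
```

On the command line, grep's `-B`, `-A` and `-C` options give the same kind of context; saving the full, unfiltered `dmesg` output to a file before filtering is the safer habit.
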
Lykos153 commented 1 month ago

Ok, maybe it came back. But it looks different, so maybe it's unrelated... It happened again during mass-copying stuff onto the fs. I hope this time I didn't accidentally crop the log.

```
[18344.587199] about to insert invalid key in data update path
[18344.587201] old: u64s 9 type extent 1079454783:3496:4294967294 len 24 ver 159243733: durability: 1 crc: c_size 16 size 24 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 4:532946:624 gen 13 rebalance: target hdd compression zstd
[18344.587202] k: u64s 9 type extent 1079454783:3496:4294967294 len 8 ver 159243734: durability: 1 crc: c_size 8 size 8 offset 0 nonce 0 csum chacha20_poly1305_80 compress incompressible ptr: 4:532946:640 gen 13 rebalance: target hdd compression zstd
[18344.587203] new: u64s 12 type extent 1079454783:3496:4294967294 len 8 ver 159243734: durability: 1 crc: c_size 8 size 8 offset 0 nonce 0 csum chacha20_poly1305_80 compress incompressible ptr: 4:532946:640 gen 13 cached rebalance: target hdd compression zstd crc: c_size 16 size 24 offset 16 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:266223:416 gen 1
[18344.587216] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): fatal error - emergency read only
[18344.587274] ------------[ cut here ]------------
[18344.587275] kernel BUG at fs/bcachefs/io_write.c:532!
[18344.587281] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[18344.587283] CPU: 2 PID: 54886 Comm: kworker/u17:4 Tainted: G W 6.9.9 #1-NixOS
[18344.587285] Hardware name: To Be Filled By O.E.M.
To Be Filled By O.E.M./Z87 Extreme4, BIOS P3.50 03/11/2018 [18344.587287] Workqueue: bcachefs bch2_write_point_do_index_updates [bcachefs] [18344.587362] RIP: 0010:__bch2_write_index+0x281/0x290 [bcachefs] [18344.587418] Code: 03 00 5a e9 77 ff ff ff be 1e 00 00 00 44 89 e7 e8 44 42 fc ff 84 c0 0f 85 62 ff ff ff 4d 8b b5 38 01 00 00 e9 3f ff ff ff 90 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 [18344.587420] RSP: 0018:ffffbe888a1abe10 EFLAGS: 00010246 [18344.587422] RAX: ffffa0389742fc20 RBX: ffffa0368d9c0000 RCX: 0000000000000018 [18344.587423] RDX: ffffa0389742fc20 RSI: 0000000000000008 RDI: 0048638a39a0ffff [18344.587424] RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000003 [18344.587425] R10: ffffbe888a1abdf8 R11: ffffffff9f33a128 R12: 0000000000000000 [18344.587426] R13: ffffa0389742fa98 R14: ffffa0389742fbe0 R15: dead000000000100 [18344.587428] FS: 0000000000000000(0000) GS:ffffa03b97700000(0000) knlGS:0000000000000000 [18344.587429] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [18344.587430] CR2: 00000042005d6000 CR3: 0000000021820003 CR4: 00000000001706f0 [18344.587432] Call Trace: [18344.587434] [18344.587437] ? die+0x36/0x90 [18344.587441] ? do_trap+0xdd/0x100 [18344.587444] ? __bch2_write_index+0x281/0x290 [bcachefs] [18344.587498] ? do_error_trap+0x6a/0x90 [18344.587500] ? __bch2_write_index+0x281/0x290 [bcachefs] [18344.587553] ? exc_invalid_op+0x51/0x70 [18344.587557] ? __bch2_write_index+0x281/0x290 [bcachefs] [18344.587609] ? asm_exc_invalid_op+0x1a/0x20 [18344.587613] ? __bch2_write_index+0x281/0x290 [bcachefs] [18344.587694] bch2_write_point_do_index_updates+0xb1/0x160 [bcachefs] [18344.587757] process_one_work+0x183/0x3a0 [18344.587761] worker_thread+0x245/0x350 [18344.587765] ? __pfx_worker_thread+0x10/0x10 [18344.587767] kthread+0xd0/0x100 [18344.587771] ? __pfx_kthread+0x10/0x10 [18344.587773] ret_from_fork+0x34/0x50 [18344.587776] ? 
__pfx_kthread+0x10/0x10 [18344.587778] ret_from_fork_asm+0x1a/0x30 [18344.587783] [18344.587784] Modules linked in: ext4 mbcache jbd2 rfcomm nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype overlay ccm xt_CHECKSUM xt_MASQUERADE ipt_REJECT nf_reject_ipv4 nft_chain_nat uhid cmac algif_hash algif_skcipher af_alg bnep msr btrfs blake2b_generic dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core af_packet cmdlinepart spi_nor mtd intel_rapl_msr at24 mei_hdcp iTCO_wdt intel_pmc_bxt spi_intel_platform spi_intel mei_pxp watchdog xt_conntrack iwlmvm ip6t_rpfilter ipt_rpfilter mac80211 intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crc32_pclmul polyval_clmulni xt_pkttype libarc4 polyval_generic gf128mul snd_hda_codec_realtek ghash_clmulni_intel xt_LOG nf_log_syslog snd_hda_codec_generic sha512_ssse3 sha1_ssse3 snd_hda_scodec_component snd_hda_codec_hdmi aesni_intel snd_hda_intel xt_tcpudp crypto_simd nft_compat snd_intel_dspcfg cryptd iwlwifi snd_intel_sdw_acpi uvcvideo rapl snd_hda_codec intel_cstate nf_tables btusb snd_usb_audio intel_uncore cfg80211 btrtl btintel [18344.587822] snd_usbmidi_lib i2c_i801 btbcm btmtk mxm_wmi i2c_smbus snd_hda_core snd_ump bluetooth snd_rawmidi snd_seq_device mei_me lpc_ich snd_hwdep sch_fq_codel snd_pcm uvc uinput atkbd libps2 serio gspca_vc032x mei vivaldi_fmap ledtrig_audio gspca_main ecdh_generic videobuf2_vmalloc videobuf2_memops nls_iso8859_1 videobuf2_v4l2 nls_cp437 vfat videodev rfkill fat snd_timer ecc videobuf2_common mc loop crc16 pl2303 cpufreq_ondemand hid_jabra snd xt_nat nf_nat nf_conntrack soundcore nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter veth bridge tiny_power_button button stp llc uas input_leds tun led_class joydev mousedev evdev mac_hid kvm_intel kvm vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd fuse efi_pstore configfs nfnetlink zram efivarfs dmi_sysfs ip_tables x_tables autofs4 poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs 
libcrc32c crc32c_generic lz4_compress lz4hc_compress xor raid6_pq hid_generic usbhid hid usb_storage sd_mod i915 xhci_pci ahci xhci_pci_renesas libahci [18344.587866] nvme nvme_core libata e1000e xhci_hcd ehci_pci i2c_algo_bit drm_buddy ttm intel_gtt nvme_auth t10_pi ehci_hcd drm_display_helper scsi_mod crc32c_intel sha256_ssse3 firmware_class crc64_rocksoft crc_t10dif ptp crct10dif_generic crct10dif_pclmul cec crc64 crct10dif_common pps_core scsi_common rtc_cmos video wmi backlight dm_snapshot dm_bufio dm_mod dax [18344.587903] ---[ end trace 0000000000000000 ]--- [18345.166682] RIP: 0010:__bch2_write_index+0x281/0x290 [bcachefs] [18345.166764] Code: 03 00 5a e9 77 ff ff ff be 1e 00 00 00 44 89 e7 e8 44 42 fc ff 84 c0 0f 85 62 ff ff ff 4d 8b b5 38 01 00 00 e9 3f ff ff ff 90 <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 [18345.166766] RSP: 0018:ffffbe888a1abe10 EFLAGS: 00010246 [18345.166769] RAX: ffffa0389742fc20 RBX: ffffa0368d9c0000 RCX: 0000000000000018 [18345.166770] RDX: ffffa0389742fc20 RSI: 0000000000000008 RDI: 0048638a39a0ffff [18345.166772] RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000003 [18345.166773] R10: ffffbe888a1abdf8 R11: ffffffff9f33a128 R12: 0000000000000000 [18345.166774] R13: ffffa0389742fa98 R14: ffffa0389742fbe0 R15: dead000000000100 [18345.166775] FS: 0000000000000000(0000) GS:ffffa03b97700000(0000) knlGS:0000000000000000 [18345.166777] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [18345.166778] CR2: 00000042005d6000 CR3: 000000038cc02002 CR4: 00000000001706f0 ```

EDIT: Someone opened #717 with the same error, so I guess this one is separate from the "corrupt btree node before write at btree extents level" error.

Lykos153 commented 1 month ago

I'm beginning to suspect a faulty disk, though I don't know how I would find out which one it is. S.M.A.R.T. looks fine (except for the one I labeled ro as described in #715, but with it being read-only I guess it won't cause issues when writing).

Lykos153 commented 1 month ago

This time I mounted with fsck,fix_errors. FWIW, here's the log:

```
[ 8.981541] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): mounting version 1.7: mi_btree_bitmap opts=metadata_replicas=3,compression=zstd,foreground_target=ssd,background_target=hdd,promote_target=ssd,fsck,fix_errors=yes
[ 8.981570] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): recovering from clean shutdown, journal seq 3020590
[ 66.488259] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): journal read done, replaying entries 3020590-3020590
[ 66.775046] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): alloc_read... done
[ 66.932810] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): stripes_read... done
[ 66.932817] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): snapshots_read... done
[ 66.932832] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_allocations...
[ 245.258324] fs has wrong cached: got 1342220432, should be 1335535621, fixing
[ 245.258352] fs has wrong nr_inodes: got 5986459, should be 5986353, fixing
[ 245.258371] fs has wrong cached: 1/1 [1]: got 7031589, should be 6668573, fixing
[ 245.258389] fs has wrong cached: 1/1 [0]: got 7195848, should be 6832840, fixing
[ 245.258403] fs has wrong cached: 1/1 [6]: got 416549076, should be 410590289, fixing
[ 269.458788] done
[ 269.522581] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): going read-write
[ 269.755203] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): journal_replay... done
[ 269.755209] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_alloc_info... done
[ 284.718036] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_lrus...
[ 284.789993] incorrect lru entry: lru read time 9034497496 [ 284.789995] u64s 5 type set 844433964629464:844424930554135:0 len 0 ver 0 [ 284.789996] for u64s 5 type deleted 3:422167:0 len 0 ver 0, fixing [ 284.790077] incorrect lru entry: lru read time 10267875312 [ 284.790077] u64s 5 type set 844435198007280:844424930553922:0 len 0 ver 0 [ 284.790078] for u64s 5 type deleted 3:421954:0 len 0 ver 0, fixing [ 286.000966] done [ 286.000969] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_btree_backpointers... done [ 497.511763] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_backpointers_to_extents... [ 512.596897] ------------[ cut here ]------------ [ 512.596900] btree trans held srcu lock (delaying memory reclaim) for 15 seconds [ 512.596914] WARNING: CPU: 3 PID: 261 at fs/bcachefs/btree_iter.c:2871 bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 512.596973] Modules linked in: poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs libcrc32c crc32c_generic lz4_compress lz4hc_compress xor raid6_pq hid_generic usbhid hid usb_storage sd_mod i915 nvme i2c_algo_bit drm_buddy ahci e1000e libahci nvme_core ttm xhci_pci nvme_auth libata intel_gtt xhci_pci_renesas t10_pi drm_display_helper xhci_hcd scsi_mod ehci_pci ehci_hcd firmware_class ptp cec pps_core crc64_rocksoft crc_t10dif crct10dif_generic crc64 crct10dif_pclmul crct10dif_common crc32c_intel sha256_ssse3 scsi_common rtc_cmos video wmi backlight dm_snapshot dm_bufio dm_mod dax [ 512.597001] CPU: 3 PID: 261 Comm: mount.bcachefs Not tainted 6.9.9 #1-NixOS [ 512.597002] Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./Z87 Extreme4, BIOS P3.50 03/11/2018 [ 512.597003] RIP: 0010:bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 512.597044] Code: 6d ea 48 c7 c7 d0 a5 c8 c0 48 b8 cf f7 53 e3 a5 9b c4 20 48 29 ca 48 c1 ea 03 48 f7 e2 48 89 d6 48 c1 ee 04 e8 c6 1c b8 e8 90 <0f> 0b 90 90 e9 5f ff ff ff 90 0f 0b 90 e9 6c ff ff ff 0f 1f 00 90 [ 512.597045] RSP: 0018:ffff9fdc00807780 EFLAGS: 00010286 [ 512.597047] RAX: 0000000000000000 RBX: ffff957ec8e80000 RCX: c0000000ffffdfff [ 512.597048] RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: 0000000000000001 [ 512.597049] RBP: ffff957ecf7c0000 R08: 0000000000000000 R09: 0000000000000003 [ 512.597050] R10: ffff9fdc00807628 R11: ffffffffab33a128 R12: 0000000000000001 [ 512.597051] R13: ffff957ec8e80000 R14: 0000000000000000 R15: ffff957ec8e80000 [ 512.597052] FS: 00007f1292426080(0000) GS:ffff9583d7780000(0000) knlGS:0000000000000000 [ 512.597053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 512.597054] CR2: 000055d6cbe9d000 CR3: 000000010d530004 CR4: 00000000001706f0 [ 512.597055] Call Trace: [ 512.597057] [ 512.597059] ? __warn+0x80/0x120 [ 512.597062] ? bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 512.597102] ? report_bug+0x164/0x190 [ 512.597105] ? handle_bug+0x3d/0x80 [ 512.597108] ? exc_invalid_op+0x17/0x70 [ 512.597111] ? asm_exc_invalid_op+0x1a/0x20 [ 512.597113] ? bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 512.597153] ? bch2_trans_begin+0xf8/0x600 [bcachefs] [ 512.597192] bch2_trans_begin+0x5a5/0x600 [bcachefs] [ 512.597232] bch2_btree_iter_peek_node_and_restart+0x44/0x50 [bcachefs] [ 512.597276] bch2_get_btree_in_memory_pos+0x1d0/0x2e0 [bcachefs] [ 512.597312] bch2_check_backpointers_to_extents+0xa8/0x5d0 [bcachefs] [ 512.597348] ? __bch2_print+0x87/0xe0 [bcachefs] [ 512.597398] bch2_run_recovery_pass+0x38/0xa0 [bcachefs] [ 512.597449] bch2_run_recovery_passes+0xb6/0x180 [bcachefs] [ 512.597499] bch2_fs_recovery+0xc00/0x1330 [bcachefs] [ 512.597548] ? vprintk_emit+0xca/0x270 [ 512.597551] ? 
__bch2_print+0x87/0xe0 [bcachefs] [ 512.597597] ? bch2_printbuf_exit+0x20/0x30 [bcachefs] [ 512.597647] ? print_mount_opts+0x131/0x180 [bcachefs] [ 512.597692] ? bch2_recalc_capacity+0x106/0x370 [bcachefs] [ 512.597727] bch2_fs_start+0x2f6/0x460 [bcachefs] [ 512.597773] bch2_fs_open+0x6b7/0x6d0 [bcachefs] [ 512.597819] bch2_mount+0x5bd/0x790 [bcachefs] [ 512.597869] legacy_get_tree+0x2b/0x50 [ 512.597873] vfs_get_tree+0x29/0xe0 [ 512.597876] path_mount+0x4ca/0xb10 [ 512.597878] __x64_sys_mount+0x11a/0x150 [ 512.597880] do_syscall_64+0xb8/0x200 [ 512.597883] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 512.597885] RIP: 0033:0x7f129254d11e [ 512.597891] Code: 48 8b 0d fd 2c 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ca 2c 0d 00 f7 d8 64 89 01 48 [ 512.597892] RSP: 002b:00007fff1a971168 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [ 512.597894] RAX: ffffffffffffffda RBX: 0000559a5e7dbca0 RCX: 00007f129254d11e [ 512.597895] RDX: 0000559a5e7dbca0 RSI: 0000559a5e827210 RDI: 0000559a5e7e0f80 [ 512.597896] RBP: 8000000000000000 R08: 0000559a5e7e5680 R09: 0000000000000000 [ 512.597897] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000020 [ 512.597898] R13: 0000559a5e7e5680 R14: 0000559a5e827210 R15: 0000000000000009 [ 512.597899] [ 512.597900] ---[ end trace 0000000000000000 ]--- [ 746.496455] ------------[ cut here ]------------ [ 746.496469] btree trans held srcu lock (delaying memory reclaim) for 11 seconds [ 746.496496] WARNING: CPU: 3 PID: 261 at fs/bcachefs/btree_iter.c:2871 bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 746.496553] Modules linked in: poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs libcrc32c crc32c_generic lz4_compress lz4hc_compress xor raid6_pq hid_generic usbhid hid usb_storage sd_mod i915 nvme i2c_algo_bit drm_buddy ahci e1000e libahci nvme_core ttm xhci_pci nvme_auth libata intel_gtt 
xhci_pci_renesas t10_pi drm_display_helper xhci_hcd scsi_mod ehci_pci ehci_hcd firmware_class ptp cec pps_core crc64_rocksoft crc_t10dif crct10dif_generic crc64 crct10dif_pclmul crct10dif_common crc32c_intel sha256_ssse3 scsi_common rtc_cmos video wmi backlight dm_snapshot dm_bufio dm_mod dax [ 746.496582] CPU: 3 PID: 261 Comm: mount.bcachefs Tainted: G W 6.9.9 #1-NixOS [ 746.496583] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z87 Extreme4, BIOS P3.50 03/11/2018 [ 746.496585] RIP: 0010:bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 746.496627] Code: 6d ea 48 c7 c7 d0 a5 c8 c0 48 b8 cf f7 53 e3 a5 9b c4 20 48 29 ca 48 c1 ea 03 48 f7 e2 48 89 d6 48 c1 ee 04 e8 c6 1c b8 e8 90 <0f> 0b 90 90 e9 5f ff ff ff 90 0f 0b 90 e9 6c ff ff ff 0f 1f 00 90 [ 746.496628] RSP: 0018:ffff9fdc00807780 EFLAGS: 00010286 [ 746.496630] RAX: 0000000000000000 RBX: ffff957ec8e80000 RCX: c0000000ffffdfff [ 746.496631] RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: 0000000000000001 [ 746.496632] RBP: ffff957ecf7c0000 R08: 0000000000000000 R09: 0000000000000003 [ 746.496633] R10: ffff9fdc00807628 R11: ffffffffab33a128 R12: ffff957ec8e804f8 [ 746.496634] R13: ffff957ec8e80000 R14: 0000000000000007 R15: ffff957ec8e804f8 [ 746.496635] FS: 00007f1292426080(0000) GS:ffff9583d7780000(0000) knlGS:0000000000000000 [ 746.496636] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 746.496637] CR2: 000055d6cbe9d000 CR3: 000000010d530003 CR4: 00000000001706f0 [ 746.496638] Call Trace: [ 746.496640] [ 746.496642] ? __warn+0x80/0x120 [ 746.496646] ? bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 746.496686] ? report_bug+0x164/0x190 [ 746.496689] ? handle_bug+0x3d/0x80 [ 746.496693] ? exc_invalid_op+0x17/0x70 [ 746.496695] ? asm_exc_invalid_op+0x1a/0x20 [ 746.496698] ? bch2_trans_srcu_unlock+0x11b/0x130 [bcachefs] [ 746.496737] ? 
bch2_trans_begin+0xf8/0x600 [bcachefs] [ 746.496777] bch2_trans_begin+0x5a5/0x600 [bcachefs] [ 746.496817] bch2_btree_iter_peek_node_and_restart+0x44/0x50 [bcachefs] [ 746.496857] bch2_get_btree_in_memory_pos+0x1d0/0x2e0 [bcachefs] [ 746.496893] bch2_check_backpointers_to_extents+0xa8/0x5d0 [bcachefs] [ 746.496929] bch2_run_recovery_pass+0x38/0xa0 [bcachefs] [ 746.496982] bch2_run_recovery_passes+0xb6/0x180 [bcachefs] [ 746.497034] bch2_fs_recovery+0xc00/0x1330 [bcachefs] [ 746.497084] ? vprintk_emit+0xca/0x270 [ 746.497086] ? __bch2_print+0x87/0xe0 [bcachefs] [ 746.497133] ? bch2_printbuf_exit+0x20/0x30 [bcachefs] [ 746.497183] ? print_mount_opts+0x131/0x180 [bcachefs] [ 746.497229] ? bch2_recalc_capacity+0x106/0x370 [bcachefs] [ 746.497268] bch2_fs_start+0x2f6/0x460 [bcachefs] [ 746.497315] bch2_fs_open+0x6b7/0x6d0 [bcachefs] [ 746.497361] bch2_mount+0x5bd/0x790 [bcachefs] [ 746.497411] legacy_get_tree+0x2b/0x50 [ 746.497415] vfs_get_tree+0x29/0xe0 [ 746.497417] path_mount+0x4ca/0xb10 [ 746.497420] __x64_sys_mount+0x11a/0x150 [ 746.497422] do_syscall_64+0xb8/0x200 [ 746.497425] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 746.497426] RIP: 0033:0x7f129254d11e [ 746.497433] Code: 48 8b 0d fd 2c 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ca 2c 0d 00 f7 d8 64 89 01 48 [ 746.497434] RSP: 002b:00007fff1a971168 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [ 746.497436] RAX: ffffffffffffffda RBX: 0000559a5e7dbca0 RCX: 00007f129254d11e [ 746.497437] RDX: 0000559a5e7dbca0 RSI: 0000559a5e827210 RDI: 0000559a5e7e0f80 [ 746.497438] RBP: 8000000000000000 R08: 0000559a5e7e5680 R09: 0000000000000000 [ 746.497439] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000020 [ 746.497440] R13: 0000559a5e7e5680 R14: 0000559a5e827210 R15: 0000000000000009 [ 746.497441] [ 746.497442] ---[ end trace 0000000000000000 ]--- [ 953.994534] done [ 953.994538] bcachefs 
(677cf0a7-1abe-4ce3-876c-2ca63301229d): check_extents_to_backpointers... done [ 1109.278600] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_alloc_to_lru_refs... done [ 1115.947327] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_snapshot_trees... done [ 1115.947341] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_snapshots... done [ 1115.947351] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_subvols... done [ 1115.955987] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_subvol_children... done [ 1115.955994] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): delete_dead_snapshots... done [ 1115.955995] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_inodes... done [ 1138.115207] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_extents... done [ 1190.959703] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_indirect_extents... done [ 1213.514006] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_dirents... done [ 1247.130850] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_xattrs... done [ 1247.131092] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_root... done [ 1247.131100] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_subvolume_structure... done [ 1247.131115] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_directory_structure... done [ 1282.683028] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): check_nlinks... done [ 1298.018687] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): resume_logged_ops... done [ 1298.021373] bcachefs (677cf0a7-1abe-4ce3-876c-2ca63301229d): delete_dead_inodes... done ```

EDIT: As fscking after about to insert invalid key in data update path: fatal error recurred didn't find anything (see https://github.com/koverstreet/bcachefs/issues/717#issuecomment-2246058819 ), I suspect that the errors found here are related to the original "corrupt btree node before write at btree extents level" error.

koverstreet commented 4 weeks ago

Hey, are you on IRC? You've been turning up a bunch of good bugs; we could work through them quicker if you want to hop on there.

There should've been more in that error message. Did something get dropped?

Lykos153 commented 1 week ago

:/ I always try to catch the whole message, but apparently something got lost. Let's see if it comes back.
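For checking a saved log paste before posting, here is a minimal sketch that splits a flattened kernel log back into one entry per timestamp and keeps a generous window around the error line. It is not part of bcachefs or bcachefs-tools; the function names and the window sizes are hypothetical, and it assumes every entry starts with a bracketed timestamp like `[ 746.496455]`.

```python
import re

# Assumption: each kernel log entry begins with a bracketed timestamp,
# e.g. "[ 746.496455]" or "[63048.843482]".
TIMESTAMP = re.compile(r"\[\s*\d+\.\d+\]")


def split_entries(text):
    """Split a flattened kernel log into one entry per timestamp.

    Any text before the first timestamp is discarded.
    """
    starts = [m.start() for m in TIMESTAMP.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def with_context(entries, needle, before=5, after=30):
    """Return the entries around every match of `needle`, in log order."""
    keep = set()
    for i, entry in enumerate(entries):
        if needle in entry:
            lo = max(0, i - before)
            hi = min(len(entries), i + after + 1)
            keep.update(range(lo, hi))
    return [entries[i] for i in sorted(keep)]
```

Calling `with_context(split_entries(log_text), "corrupt btree node")` on a saved log would then return the error line plus the entries printed just before and after it, which makes it easier to see whether anything was cut off.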

I just joined #bcache as lykos153, so we can discuss issues there.