koverstreet / bcachefs

MongoDB files are corrupted, and the FS cannot be mounted during snapshot deletion and creation. #586

Closed bhzhu203 closed 1 year ago

bhzhu203 commented 1 year ago

Version: 117cd823d2d77dd8db1ec35b5819e1668c736b31
FS dump file: https://1drv.ms/u/s!Ao8p2C5olADnhivwvJgQfVczA9AI?e=xKEEPf

[ 2871.030246] bcachefs: loading out-of-tree module taints kernel.
[ 2871.111921] bcachefs (vdb): mounting version 1.2: deleted_inodes opts=compression=lz4
[ 2871.112443] bcachefs (vdb): recovering from unclean shutdown
[ 2871.112772] bcachefs (vdb): starting journal read
[ 2885.674784] bcachefs (vdb): journal read done on device vdb, ret 0
[ 2885.675292] bcachefs (vdb): journal read done, replaying entries 4052220-4052220
[ 2885.727978] bcachefs (vdb): alloc_read... done
[ 2885.732216] bcachefs (vdb): stripes_read... done
[ 2885.732494] bcachefs (vdb): snapshots_read... done
[ 2885.733202] bcachefs (vdb): journal_replay... done
[ 2885.733480] bcachefs (vdb): delete_dead_snapshots...
[ 2885.733481] bcachefs (vdb): going read-write
[ 2894.316227]  done
[ 2894.316431] bcachefs (vdb): delete_dead_inodes... done
[ 2894.318752] bcachefs (vdb): bch2_inode_peek(): error looking up inum 1:4096: ENOENT_inode
[ 2894.319346] bcachefs (vdb): error mounting: error getting root inode: ENOENT
[ 2894.319976] bcachefs (vdb): shutting down
[ 2894.337835] bcachefs (vdb): flushing journal and stopping allocators, journal seq 4052228
[ 2894.338699] bcachefs (vdb): flushing journal and stopping allocators complete, journal seq 4052228
[ 2894.339961] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 2894.340373] #PF: supervisor write access in kernel mode
[ 2894.340687] #PF: error_code(0x0002) - not-present page
[ 2894.340973] PGD 0 P4D 0 
[ 2894.341125] Oops: 0002 [#1] PREEMPT SMP PTI
[ 2894.341382] CPU: 0 PID: 24392 Comm: mount Kdump: loaded Tainted: G           O       6.4.0-uksm+ #1 8822af7ab4de9d8141e69cf85255c8ee5c8572a9
[ 2894.342067] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 449e491 04/01/2014
[ 2894.342484] RIP: 0010:bch2_recalc_capacity+0xee/0x2a0 [bcachefs]
[ 2894.342865] Code: 05 00 00 83 c3 01 39 d3 0f 82 66 ff ff ff e8 39 f0 78 e0 49 8b 87 98 01 00 00 48 85 c0 74 0f 48 8b 80 d0 00 00 00 41 83 e4 ff <4c> 89 60 30 31 db 31 ed 45 31 ed 45 31 e4 e8 4f b4 78 e0 41 0f b6
[ 2894.343884] RSP: 0018:ffffc90000e73c18 EFLAGS: 00010206
[ 2894.344177] RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff8881069c3738
[ 2894.344571] RDX: 0000000000000001 RSI: ffff888178723000 RDI: ffff888107802e00
[ 2894.344968] RBP: ffff888178723000 R08: 0000000000000000 R09: 0002a1e41ae4ad00
[ 2894.345359] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000400
[ 2894.345769] R13: ffff888178723000 R14: ffff8881069c01f8 R15: ffff8881069c0000
[ 2894.346167] FS:  00007f68f0cbb800(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
[ 2894.346614] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2894.346935] CR2: 0000000000000030 CR3: 0000000109eb6002 CR4: 00000000003706f0
[ 2894.347350] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2894.347746] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2894.348139] Call Trace:
[ 2894.348286]  <TASK>
[ 2894.348416]  ? __die+0x1f/0x60
[ 2894.348606]  ? page_fault_oops+0x141/0x450
[ 2894.349105]  ? do_user_addr_fault+0x61/0x720
[ 2894.349551]  ? _raw_spin_unlock+0x12/0x30
[ 2894.349983]  ? exc_page_fault+0x67/0x140
[ 2894.350394]  ? asm_exc_page_fault+0x22/0x30
[ 2894.350832]  ? bch2_recalc_capacity+0xee/0x2a0 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.351530]  ? bch2_recalc_capacity+0xd7/0x2a0 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.352216]  bch2_dev_allocator_remove+0x45/0x1b0 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.352915]  ? __cancel_work_timer+0xca/0x150
[ 2894.353348]  ? _raw_spin_unlock_irq+0x13/0x30
[ 2894.353782]  __bch2_fs_read_only+0x192/0x1e0 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.354461]  bch2_fs_read_only+0xc6/0x2e0 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.355134]  ? __cancel_work_timer+0xca/0x150
[ 2894.355569]  __bch2_fs_stop+0x44/0x270 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.356256]  bch2_fs_stop+0xe/0x20 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.356919]  bch2_mount+0x611/0x680 [bcachefs 08819be1bc4f2440ce2e1bee493a113ef8d423ac]
[ 2894.357603]  legacy_get_tree+0x24/0x40
[ 2894.358020]  vfs_get_tree+0x1f/0xc0
[ 2894.358418]  path_mount+0x2b0/0xa70
[ 2894.358821]  __x64_sys_mount+0xe1/0x120
[ 2894.359234]  do_syscall_64+0x35/0x80
[ 2894.359655]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 2894.360127] RIP: 0033:0x7f68f0b26eae
[ 2894.360524] Code: 48 8b 0d 85 1f 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 52 1f 0f 00 f7 d8 64 89 01 48
[ 2894.361922] RSP: 002b:00007ffc225ac598 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[ 2894.362554] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f68f0b26eae
[ 2894.363154] RDX: 00005605fba88d80 RSI: 00005605fba88e00 RDI: 00005605fba88d60
[ 2894.363749] RBP: 00005605fba88b30 R08: 00005605fba88dc0 R09: 00005605fba89af0
[ 2894.364335] R10: 0000000000000400 R11: 0000000000000246 R12: 0000000000000000
[ 2894.364922] R13: 00005605fba88d80 R14: 00005605fba88d60 R15: 00005605fba88b30
[ 2894.365505]  </TASK>
[ 2894.365833] Modules linked in: bcachefs(O) mean_and_variance netconsole rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs tcp_diag inet_diag sunrpc binfmt_misc nls_utf8 nls_cp437 intel_rapl_msr intel_rapl_common joydev serio_raw virtio_console virtio_balloon evdev squashfs loop dm_multipath dm_mod msr fuse efi_pstore ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear md_mod nvme_tcp nvme_rdma rdma_cm iw_cm ib_cm ib_core configfs nvme_fc nvme_fabrics crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 virtio_net net_failover failover virtio_blk aesni_intel cirrus drm_shmem_helper crypto_simd drm_kms_helper cryptd psmouse i2c_piix4 virtio_pci drm virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring i2c_core floppy pata_acpi button
[ 2894.371265] CR2: 0000000000000030
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko bch2_recalc_capacity+0xee/0x2a0
bch2_recalc_capacity+0xee/0x2a0:
bch2_set_ra_pages at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/bcachefs.h:1083
(inlined by) bch2_recalc_capacity at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/alloc_background.c:2039
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko bch2_recalc_capacity+0xd7/0x2a0
bch2_recalc_capacity+0xd7/0x2a0:
bch2_set_ra_pages at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/bcachefs.h:1082
(inlined by) bch2_recalc_capacity at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/alloc_background.c:2039
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko bch2_dev_allocator_remove+0x45/0x1b0
bch2_dev_allocator_remove+0x45/0x1b0:
bch2_dev_allocator_remove at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/alloc_background.c:2128
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko __bch2_fs_read_only+0x192/0x1e0
__bch2_fs_read_only+0x192/0x1e0:
rcu_read_lock at /home/bhzhu/source/bcachfs-github/bcachefs/./include/linux/rcupdate.h:771
(inlined by) percpu_ref_put_many at /home/bhzhu/source/bcachfs-github/bcachefs/./include/linux/percpu-refcount.h:330
(inlined by) percpu_ref_put at /home/bhzhu/source/bcachfs-github/bcachefs/./include/linux/percpu-refcount.h:351
(inlined by) __bch2_fs_read_only at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/super.c:246
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko bch2_fs_read_only+0xc6/0x2e0
bch2_fs_read_only+0xc6/0x2e0:
might_resched at /home/bhzhu/source/bcachfs-github/bcachefs/./include/linux/kernel.h:111
(inlined by) bch2_fs_read_only at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/super.c:298
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko __bch2_fs_stop+0x44/0x270
__bch2_fs_stop+0x44/0x270:
__bch2_fs_stop at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/super.c:553
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko bch2_fs_stop+0xe/0x20
bch2_fs_stop+0xe/0x20:
bch2_fs_stop at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/super.c:610
bhzhu ~/source/bcachfs-github/bcachefs (git:master) $ scripts/faddr2line fs/bcachefs/bcachefs.ko bch2_mount+0x611/0x680
bch2_mount+0x611/0x680:
bch2_err_class at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/errcode.h:249
(inlined by) bch2_mount at /home/bhzhu/source/bcachfs-github/bcachefs/fs/bcachefs/fs.c:1912
koverstreet commented 1 year ago

There was a skiplist pointer that didn't point to an actual ancestor, and the fsck code wasn't checking skiplist pointers correctly.

Fixed in e7f6215768. Run fsck with that commit or the latest tools and your filesystem should work again :)