koverstreet / bcachefs

Can't mount fs, nonexistent inode ... in deleted_inodes btree [e7b654a58af9] #610

Closed ticpu closed 8 months ago

ticpu commented 8 months ago

Generic info show-super:

External UUID:                              d0337b38-cbe7-4c3c-90f7-4305fbe05171
Internal UUID:                              3b2af93d-5b21-47e1-8abd-15d2a46ff64a
Device index:                               2
Label:                                      
Version:                                    1.3: rebalance_work
Version upgrade complete:                   1.3: rebalance_work
Oldest version on disk:                     1.1: snapshot_skiplists
Created:                                    Sat Aug 26 16:58:08 2023

Sequence number:                            328
Superblock size:                            7808
Clean:                                      0
Devices:                                    5
Sections:                                   members_v1,crypt,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors
Features:                                   lz4,zstd,journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                            alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size:                               4.00 KiB
  btree_node_size:                          256 KiB
  errors:                                   continue [ro] panic 
  metadata_replicas:                        2
  data_replicas:                            2
  metadata_replicas_required:               2
  data_replicas_required:                   1
  encoded_extent_max:                       64.0 KiB
  metadata_checksum:                        none [crc32c] crc64 xxhash 
  data_checksum:                            none [crc32c] crc64 xxhash 
  compression:                              none
  background_compression:                   zstd
  str_hash:                                 crc32c crc64 [siphash] 
  metadata_target:                          nvme
  foreground_target:                        nvme
  background_target:                        none
  promote_target:                           nvme
  erasure_code:                             0
  inodes_32bit:                             1
  shard_inode_numbers:                      1
  inodes_use_key_cache:                     1
  gc_reserve_percent:                       8
  gc_reserve_bytes:                         0 B
  root_reserve_percent:                     0
  wide_macs:                                0
  acl:                                      1
  usrquota:                                 0
  grpquota:                                 0
  prjquota:                                 0
  journal_flush_delay:                      1000
  journal_flush_disabled:                   0
  journal_reclaim_delay:                    100
  journal_transaction_names:                1
  version_upgrade:                          [compatible] incompatible none 
  nocow:                                    0

crypt (size 64):
  KFD:               0
  scrypt n:          0
  scrypt r:          0
  scrypt p:          0

replicas_v0 (size 160):
  user: 1 [4] btree: 2 [0 1] user: 2 [2 3] journal: 3 [0 3 4] user: 1 [0] user: 2 [0 4] cached: 1 [1] journal: 3 [0 1 3] journal: 3 [1 3 4] btree: 3 [0 1 3] user: 1 [2] user: 2 [0 2] user: 2 [1 3] user: 2 [3 4] cached: 1 [3] journal: 2 [0 1] journal: 3 [0 2 3] journal: 3 [1 2 4] journal: 3 [2 3 4] btree: 3 [0 1 2] btree: 3 [0 1 4] user: 1 [1] user: 1 [3] user: 2 [0 1] user: 2 [0 3] user: 2 [1 2] user: 2 [1 4] user: 2 [2 4] cached: 1 [0] cached: 1 [2] cached: 1 [4] journal: 1 [0] journal: 3 [0 1 2] journal: 3 [0 1 4] journal: 3 [0 2 4] journal: 3 [1 2 3]

members_v2 (size 616):
  Device:                                   0
    Label:                                  1 (1)
    UUID:                                   600a073b-417a-4f27-a8ed-e8959a3cb13c
    Size:                                   200 GiB
    read errors:                            0
    write errors:                           0
    checksum errors:                        0
    seqread iops:                           0
    seqwrite iops:                          0
    randread iops:                          0
    randwrite iops:                         0
    Bucket size:                            256 KiB
    First bucket:                           0
    Buckets:                                819200
    Last mount:                             Sat Nov 11 12:55:51 2023

    State:                                  rw
    Data allowed:                           journal,btree,user
    Has data:                               journal,btree,user,cached
    Discard:                                1
    Freespace initialized:                  1
  Device:                                   1
    Label:                                  2 (2)
    UUID:                                   f106c284-7a5c-4828-8111-fdf0f2d63228
    Size:                                   100 GiB
    read errors:                            0
    write errors:                           0
    checksum errors:                        0
    seqread iops:                           0
    seqwrite iops:                          0
    randread iops:                          0
    randwrite iops:                         0
    Bucket size:                            256 KiB
    First bucket:                           0
    Buckets:                                409600
    Last mount:                             Sat Nov 11 12:55:51 2023

    State:                                  rw
    Data allowed:                           journal,btree,user
    Has data:                               journal,btree,user,cached
    Discard:                                1
    Freespace initialized:                  1
  Device:                                   2
    Label:                                  1 (4)
    UUID:                                   97b5abdf-9563-4ab8-9fff-a1e86a6ae4a6
    Size:                                   800 GiB
    read errors:                            0
    write errors:                           0
    checksum errors:                        0
    seqread iops:                           0
    seqwrite iops:                          0
    randread iops:                          0
    randwrite iops:                         0
    Bucket size:                            256 KiB
    First bucket:                           0
    Buckets:                                3276800
    Last mount:                             Sat Nov 11 12:55:51 2023

    State:                                  rw
    Data allowed:                           journal,btree,user
    Has data:                               journal,btree,user,cached
    Discard:                                0
    Freespace initialized:                  1
  Device:                                   3
    Label:                                  2 (5)
    UUID:                                   14a64f04-883f-4cc9-8fa2-2c4b1a131aff
    Size:                                   800 GiB
    read errors:                            0
    write errors:                           0
    checksum errors:                        0
    seqread iops:                           0
    seqwrite iops:                          0
    randread iops:                          0
    randwrite iops:                         0
    Bucket size:                            256 KiB
    First bucket:                           0
    Buckets:                                3276800
    Last mount:                             Sat Nov 11 12:55:51 2023

    State:                                  rw
    Data allowed:                           journal,btree,user
    Has data:                               journal,btree,user,cached
    Discard:                                0
    Freespace initialized:                  1
  Device:                                   4
    Label:                                  3 (6)
    UUID:                                   72182ee5-f08e-4a03-8c35-2143d249b56d
    Size:                                   500 GiB
    read errors:                            0
    write errors:                           0
    checksum errors:                        0
    seqread iops:                           0
    seqwrite iops:                          0
    randread iops:                          0
    randwrite iops:                         0
    Bucket size:                            256 KiB
    First bucket:                           0
    Buckets:                                2048000
    Last mount:                             Sat Nov 11 12:55:51 2023

    State:                                  rw
    Data allowed:                           journal,btree,user
    Has data:                               journal,btree,user,cached
    Discard:                                0
    Freespace initialized:                  1
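
As a quick sanity check on the members_v2 section, each device's bucket count should equal its size divided by the bucket size. The sketch below is only an illustration: the values are copied by hand from the show-super output above, and the helper name `expected_buckets` is made up for this example, not part of bcachefs-tools.

```python
KIB = 1024
GIB = 1024 ** 3

# Bucket size reported for every device in members_v2.
BUCKET_SIZE = 256 * KIB

# (device index, size in GiB, reported bucket count), copied from members_v2.
DEVICES = [
    (0, 200, 819200),
    (1, 100, 409600),
    (2, 800, 3276800),
    (3, 800, 3276800),
    (4, 500, 2048000),
]


def expected_buckets(size_gib, bucket_size=BUCKET_SIZE):
    """Number of whole buckets that fit in a device of size_gib GiB."""
    return size_gib * GIB // bucket_size


for idx, size_gib, reported in DEVICES:
    status = "ok" if expected_buckets(size_gib) == reported else "MISMATCH"
    print(f"device {idx}: expected {expected_buckets(size_gib)}, reported {reported}: {status}")
```

All five devices come out consistent, so the superblock geometry itself does not look suspicious.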

Tools bugs: screenshots attached (Screenshot_20231111-084354_RVNC_Viewer, Screenshot_20231111-084223_RVNC_Viewer)

If the tools lock up: bcachefs tools version v1.3.3-4-g73da05d

Output of dmesg

[54031.589707] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): mounting version 1.3: rebalance_work opts=metadata_replicas=2,data_replicas=2,metadata_replicas_required=2,background_compression=zstd,metadata_target=nvme,foreground_target=nvme,promote_target=nvme,fix_errors=yes
[54031.589711] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): recovering from unclean shutdown
[54038.042131] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): ja->sectors_free == ca->mi.bucket_size
[54038.042134] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): cur_idx 0/3200
[54038.042136] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[3199] = 3441734
[54038.042137] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[0] = 3441737
[54038.042138] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[1] = 3441740
[54043.025333] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): ja->sectors_free == ca->mi.bucket_size
[54043.025336] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): cur_idx 0/8192
[54043.025338] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[8191] = 3357069
[54043.025339] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[0] = 3357072
[54043.025340] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[1] = 3357075
[54043.223241] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): ja->sectors_free == ca->mi.bucket_size
[54043.223243] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): cur_idx 0/8192
[54043.223245] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[8191] = 3399365
[54043.223246] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[0] = 3399444
[54043.223247] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): bucket_seq[1] = 3399520
[54043.223258] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): journal read done, replaying entries 3568568-3568777
[54043.269468] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): alloc_read... done
[54043.409405] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): stripes_read... done
[54043.409409] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): snapshots_read... done
[54043.521708] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): journal_replay...
[54043.522911] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): going read-write
[54046.647508]  done
[54046.647516] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): resume_logged_ops... done
[54046.656048] bcachefs (d0337b38-cbe7-4c3c-90f7-4305fbe05171): delete_dead_inodes...
[54046.658025] nonexistent inode 660746:4294967292 in deleted_inodes btree, fixing
*lock up and spin*
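
For anyone comparing logs, here is a small sketch that pulls the two numbers out of the last message. It assumes the format is `<inode number>:<second field>` and that the second field is a 32-bit value (presumably a snapshot ID, which is my assumption and is not confirmed by the log itself). The function name `parse_deleted_inode` is hypothetical.

```python
import re

# Last kernel message before the lockup, copied from the dmesg output above.
LINE = "[54046.658025] nonexistent inode 660746:4294967292 in deleted_inodes btree, fixing"

PATTERN = re.compile(r"nonexistent inode (\d+):(\d+) in deleted_inodes btree")

U32_MAX = 2**32 - 1


def parse_deleted_inode(line):
    """Return (inode number, second field) from the message, or None if it does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


inum, second = parse_deleted_inode(LINE)
# Distance of the second field from the largest 32-bit value.
print(inum, second, U32_MAX - second)
```

The second field, 4294967292, is 2^32 - 4, i.e. three below the largest 32-bit value, which is at least consistent with the snapshot-related cause mentioned at the end of this thread.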

Optional Advanced: saving this report now; will reboot into a new kernel to get perf dumps.

Logs of fsck, in order of execution:

  1. bcfs-3.log, then stuck for 1 hour
  2. bcfs-2.log, then stuck for 8 hours
  3. bcfs.log, then stuck for 3 hours at "nonexistent inode 660746:4294967292 in deleted_inodes btree", as in dmesg

ticpu commented 8 months ago

Added latest perf, gdb and log from tools version v1.3.3-4-g73da05d

ticpu commented 8 months ago

After adding tracepoints and workarounds, it was noted that most of these issues stemmed from snapshots being taken while there were unlinked inodes. The error is now fixed in commit 897d32dffe59196112c185e9e692f41836eef2f9; the commit message has all the details of the fix.