koverstreet / bcachefs

Not able to mount with -o degraded when a disk is missing after hardware failure #703

Open richardbrodie opened 1 week ago

richardbrodie commented 1 week ago

I have an 8-disk array. One of the disks died suddenly, and since /dev/sdh no longer exists I can no longer mount the filesystem:

❯ sudo bcachefs mount -v -o degraded,errors=remount-ro /dev/sda:/dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf:/dev/sdg /mnt/storage
DEBUG - bcachefs::commands::mount: Walking udev db!
INFO - bcachefs::commands::mount: mounting with params: device: /dev/sda:/dev/sdb:/dev/sdc:/dev/sdd:/dev/sde:/dev/sdf:/dev/sdg, target: /mnt/storage, options: degraded,errors=remount-ro
DEBUG - bcachefs::commands::mount: parsing mount options: degraded,errors=remount-ro
INFO - bcachefs::commands::mount: mounting filesystem
ERROR - bcachefs::commands::mount: Fatal error: Invalid argument

And in dmesg:

[ 3569.290085] bcachefs: bch2_fs_open() bch_fs_open err opening /dev/sda: insufficient_devices_to_start

Mounting with -o very_degraded gives the same output, as do mount.bcachefs, mount -t bcachefs, and specifying UUID=55cfeccc-d8b2-4813-b1a4-9ff9212962e7 instead of the device list.
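For anyone reproducing this, the colon-separated device list can be built from whichever disks are still present instead of being typed by hand. This is only a minimal sketch: `join_present` is a hypothetical helper, not part of bcachefs-tools, and the glob in the usage comment assumes the members are /dev/sda through /dev/sdh as in this report.

```shell
#!/bin/sh
# join_present: print a colon-separated list of the given paths that exist.
# Hypothetical helper for illustration; not part of bcachefs-tools.
join_present() {
    list=""
    for dev in "$@"; do
        # -e is used so the function can be tried on ordinary files;
        # switch to -b to insist on block devices.
        if [ -e "$dev" ]; then
            list="${list:+$list:}$dev"
        fi
    done
    printf '%s\n' "$list"
}

# Usage (assumes the member disks are /dev/sda../dev/sdh, as in this report):
#   devs=$(join_present /dev/sd[a-h])
#   sudo bcachefs mount -v -o degraded,errors=remount-ro "$devs" /mnt/storage
```

This only saves typing; it does not change how the mount itself behaves, so the same insufficient_devices_to_start error would be expected with the affected tools version.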

I saw that you can remove a disk by ID so I also tried:

❯ sudo bcachefs device remove 4
Filesystem path required when specifying device by id

So it seems that would only work if I could mount the array first, which is exactly the problem.

❯ sudo bcachefs show-super /dev/sda
Device:                                     ST14000NM001G-2K
External UUID:                             55cfeccc-d8b2-4813-b1a4-9ff9212962e7
Internal UUID:                             fb8e9660-7eb6-45b4-a62a-bfbe3458974e
Magic number:                              c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index:                              0
Label:                                     
Version:                                   1.7: mi_btree_bitmap
Version upgrade complete:                  1.7: mi_btree_bitmap
Oldest version on disk:                    1.3: rebalance_work
Created:                                   Sun Jan 21 14:27:32 2024
Sequence number:                           669
Time of last write:                        Sun Jun 30 03:01:25 2024
Superblock size:                           10.6 KiB/1.00 MiB
Clean:                                     0
Devices:                                   8
Sections:                                  members_v1,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features:                                  journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                           alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size:                              4.00 KiB
  btree_node_size:                         256 KiB
  errors:                                  continue [fix_safe] panic ro 
  metadata_replicas:                       2
  data_replicas:                           1
  metadata_replicas_required:              1
  data_replicas_required:                  1
  encoded_extent_max:                      64.0 KiB
  metadata_checksum:                       none [crc32c] crc64 xxhash 
  data_checksum:                           none [crc32c] crc64 xxhash 
  compression:                             none
  background_compression:                  none
  str_hash:                                crc32c crc64 [siphash] 
  metadata_target:                         none
  foreground_target:                       none
  background_target:                       none
  promote_target:                          none
  erasure_code:                            0
  inodes_32bit:                            1
  shard_inode_numbers:                     1
  inodes_use_key_cache:                    1
  gc_reserve_percent:                      8
  gc_reserve_bytes:                        0 B
  root_reserve_percent:                    0
  wide_macs:                               0
  acl:                                     1
  usrquota:                                0
  grpquota:                                0
  prjquota:                                0
  journal_flush_delay:                     1000
  journal_flush_disabled:                  0
  journal_reclaim_delay:                   100
  journal_transaction_names:               1
  version_upgrade:                         [compatible] incompatible none 
  nocow:                                   0

members_v2 (size 1104):
Device:                                    0
  Label:                                   hd1 (1)
  UUID:                                    4334f09b-b198-4957-8e13-0bdfc3eb8c42
  Size:                                    12.7 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             512 KiB
  First bucket:                            0
  Buckets:                                 26703872
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        512 MiB
  Btree allocated bitmap:                  0000000000000000000000000000000111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    1
  Label:                                   hd2 (2)
  UUID:                                    97050b01-c590-479e-a3d8-f7a1c1337c54
  Size:                                    2.73 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             512 KiB
  First bucket:                            0
  Buckets:                                 5723176
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        64.0 MiB
  Btree allocated bitmap:                  0000000011111111111111111111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    2
  Label:                                   hd3 (3)
  UUID:                                    49f40e0e-fa15-4015-be19-49584b2cf1e4
  Size:                                    2.73 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             512 KiB
  First bucket:                            0
  Buckets:                                 5723176
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        64.0 MiB
  Btree allocated bitmap:                  0000000011111111111111111111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    3
  Label:                                   hd4 (4)
  UUID:                                    95a6b8ee-8de9-438b-afb0-3f08b3d2d253
  Size:                                    2.73 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             512 KiB
  First bucket:                            0
  Buckets:                                 5723176
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        64.0 MiB
  Btree allocated bitmap:                  0000000011111111111111111111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    4
  Label:                                   hd5 (5)
  UUID:                                    c5eacabf-9858-4897-b4b3-d7aa6bd16b45
  Size:                                    12.7 TiB
  read errors:                             24
  write errors:                            49
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             1.00 MiB
  First bucket:                            0
  Buckets:                                 13351936
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        512 MiB
  Btree allocated bitmap:                  0000000000000000000000000000000111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    5
  Label:                                   hd6 (6)
  UUID:                                    2eeb7acc-7e82-4a16-a18c-f32ca665346c
  Size:                                    7.28 TiB
  read errors:                             0
  write errors:                            1
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             1.00 MiB
  First bucket:                            0
  Buckets:                                 7630885
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        128 MiB
  Btree allocated bitmap:                  1111111111111111111111111111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    6
  Label:                                   hd7 (7)
  UUID:                                    42798165-be77-49a2-a5c9-b4c998a740c4
  Size:                                    12.7 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             1.00 MiB
  First bucket:                            0
  Buckets:                                 13351936
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        256 MiB
  Btree allocated bitmap:                  0000000000000000000000001111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    7
  Label:                                   hd8 (8)
  UUID:                                    92c555b5-284b-4dba-8424-640ecb812315
  Size:                                    7.28 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             1.00 MiB
  First bucket:                            0
  Buckets:                                 7630885
  Last mount:                              Fri Jun 21 08:10:49 2024
  Last superblock write:                   669
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user
  Btree allocated bitmap blocksize:        64.0 MiB
  Btree allocated bitmap:                  0011111111111111111111111111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1

errors (size 8):
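
The members_v2 listing above is long, so a small filter helps to pick out the devices with non-zero error counters. The following is a sketch that assumes the show-super layout shown in this report (a `Device:` line followed by `read errors:`, `write errors:` and `checksum errors:` lines); `member_errors` is a made-up name, and the layout may differ in other bcachefs-tools versions.

```shell
#!/bin/sh
# member_errors: read `bcachefs show-super` output on stdin and print every
# non-zero read/write/checksum error counter with its device index.
# Assumes the field layout shown in this report.
member_errors() {
    awk '
        # Remember the most recent "Device:" value (the member index).
        $1 == "Device:" { dev = $2 }
        # Counter lines look like "  read errors:   24".
        $2 == "errors:" && ($1 == "read" || $1 == "write" || $1 == "checksum") && $3 + 0 > 0 {
            printf "device %s: %s errors = %d\n", dev, $1, $3
        }
    '
}

# Usage:
#   sudo bcachefs show-super /dev/sda | member_errors
```

Run against the output above, this would list device 4 (24 read errors, 49 write errors) and device 5 (1 write error).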

Some extra info:

❯ uname -r
6.9.7-arch1-1

❯ bcachefs version
1.9.1

❯ paru -Qs bcachefs
local/bcachefs-tools 3:1.9.2-1

richardbrodie commented 2 days ago

Alright, so I tried downgrading bcachefs-tools to 1.7.0 and, lo and behold, the mount worked! So this looks like it might just be a tools bug in 1.9.x.

The command I ended up running:

sudo bcachefs mount UUID=55cfeccc-d8b2-4813-b1a4-9ff9212962e7 /mnt/storage \
-o fsck,fix_errors,very_degraded,nochanges,read_only,opts=ro,errors=ro

It took several hours, and spat out a lot of

btree trans held srcu lock (delaying memory reclaim) by more than 31 seconds

warnings, but otherwise seems to have had no further trouble mounting the 7 remaining disks.

I'm backing up the important data before trying anything else. As expected with a disk missing, dmesg is currently full of messages like these:

[111019.630521] bcachefs (55cfeccc-d8b2-4813-b1a4-9ff9212962e7 inum 335745235 offset 1835008): no device to read from
[111019.717764] bcachefs (sde inum 134673851 offset 40): data read error: I/O
[111019.717841] bcachefs (55cfeccc-d8b2-4813-b1a4-9ff9212962e7 inum 134673851 offset 20480): read error 3 from btree lookup
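
To get a feel for how many of each kind of message there are, the log can be tallied by message text. This is a rough sketch that assumes one message per line in the `bcachefs (<device or UUID> inum N offset N): <message>` shape seen here; `bcachefs_msg_tally` is a hypothetical name, and other bcachefs log formats are simply ignored.

```shell
#!/bin/sh
# bcachefs_msg_tally: read kernel log lines on stdin and count bcachefs
# per-inode messages by their text, most frequent first.
# Assumes lines shaped like "[ts] bcachefs (dev inum N offset N): message".
bcachefs_msg_tally() {
    awk '
        /bcachefs \(/ {
            msg = $0
            # Drop everything up to and including the last "): ".
            sub(/^.*\): /, "", msg)
            count[msg]++
        }
        END { for (m in count) printf "%d %s\n", count[m], m }
    ' | sort -rn
}

# Usage:
#   sudo dmesg | bcachefs_msg_tally
```

The counts are only a summary; the inum and offset values in the original lines are still what identify the affected files.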