koverstreet / bcachefs-tools

http://bcachefs.org
GNU General Public License v2.0

Unmount hang when unable to satisfy replicas #214

Open boomshroom opened 8 months ago

boomshroom commented 8 months ago

Linux 6.7.0, NixOS unstable 24.05, with bcachefs 1.3 in the kernel and bcachefs-tools alternating between 1.3.6 and 1.4.1.

While copying files over from a previous filesystem, in hopes of adding its drive to my bcachefs pool, the foreground and promote targets filled up almost completely. Both targets are shared between the same 2 SSDs, which, while not set as metadata targets, held all the metadata. At that point the filesystem, and anything trying to access it, hung. The bcachefs rebalance thread seemed idle despite having plenty of work to do moving files to the background target (which had plenty of space available). I rebooted in hopes of restarting clean.

Shutting down took longer than expected, with various sd processes failing to stop and the bcachefs filesystem failing to unmount. After rebooting, I was greeted with this: [image: PXL_20240118_043921638]

Booting into a recovery mode to run fsck, I got ENOENT_bkey_type_mismatch errors when checking the directory structure. (Worth noting that, while they weren't from the files I was copying, there are plenty of hardlinks in the file system, each of which appears to lack a backpointer, which could interfere with reconstructing the directory tree.) I got this same error when running with either bcachefs-tools 1.3.6 or 1.4.1.

Including reconstruct_alloc in the fsck (only tested with bcachefs-tools 1.4.1 currently) gave

btree node read error: no device to read from
 at extents level 0/2
  u64s 104 type {sometimes `error`, sometimes `226`} 10971744:140709067911984:13 len 3154 ver 47123264480247801 {or similar}
Unreadable btree node at btree extents level 0:
  {same entry as before}

If I say yes to fixing, it gives

Halting mark and sweep to start topology repair pass
running explicit recovery pass check_topology (4), currently at check_allocations (5)
corrupted size vs. prev_size while consolidating
Aborted

If I say no, it instead immediately segmentation faults.

I haven't tried mounting it in the recovery mode yet. I also apologize if I was too verbose or unprofessional in describing each step. It all seemed relevant to me as it was less a single error and more a long chain of errors. The fact that this was my root filesystem makes it difficult to extract logs, but I will do my best to provide any additional information you require.

boomshroom commented 8 months ago

I was able to replicate the first issue in a more controlled manner: 3 devices, with two marked as foreground and metadata targets and one marked as background target; replicas set to 2 but replicas_required set to 1. Write a file that would fit with replicas=1 but not with replicas=2. Hopefully you get an out-of-space error. Then unmount the filesystem: the unmount hangs for several minutes with seemingly no activity.
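
For anyone trying to reproduce this, here is a rough way to pick a file size that "fits with replicas=1 but not replicas=2". It is only a back-of-the-envelope sketch in Python, not how bcachefs actually accounts for space: it assumes each copy must land on a distinct device, that any device may hold a copy (targets are ignored), and it ignores metadata, reserves and other overhead. The function names and the example capacities are made up for illustration.

```python
def max_replicated_size(capacities, replicas):
    """Largest payload size storable with `replicas` copies, idealized.

    Assumptions (not bcachefs's real accounting): every copy of a block
    must sit on a different device, any device may be used, and there is
    no metadata or reserve overhead. Units are whatever `capacities` uses.
    """
    if replicas > len(capacities):
        return 0
    # A payload of size s fits if the devices can absorb replicas * s in
    # total while no device holds more than s (one copy per device).
    lo, hi = 0, sum(capacities) // replicas
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if sum(min(c, mid) for c in capacities) >= replicas * mid:
            lo = mid
        else:
            hi = mid - 1
    return lo


def fits_once_but_not_twice(file_size, capacities):
    """True if the file fits with replicas=1 but not with replicas=2."""
    limit_two = max_replicated_size(capacities, 2)
    limit_one = max_replicated_size(capacities, 1)
    return limit_two < file_size <= limit_one


# Hypothetical pool: two small foreground devices and one large
# background device, sizes in GiB.
pool = [10, 10, 100]
print(max_replicated_size(pool, 1), max_replicated_size(pool, 2))
print(fits_once_but_not_twice(50, pool))
```

Under these assumptions, any file size strictly above the replicas=2 limit and at or below the replicas=1 limit should be a candidate for the test. Real limits will be somewhat lower because of metadata and reserved space.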