koverstreet / bcachefs

Can't mount LVM snapshot of mounted bcachefs filesystem due to UUID conflict #589

Open clipcarl opened 1 year ago

clipcarl commented 1 year ago

I'm running bcachefs 6.4.3.arch1.r1189903.32564d78eb6d from the Arch Linux aur/linux-bcachefs-git package.

If LVM (or dd, etc.) is used to take a snapshot of a running bcachefs filesystem, the bcachefs driver does not appear to allow the snapshot to be mounted concurrently with the original filesystem because both have the same UUID. Since nothing done to the snapshot can affect the original filesystem, there should be a way to bypass this check. For example, XFS has a nouuid mount option that bypasses its duplicate-UUID check.

Additionally, there should be a way to assign a new UUID to a bcachefs filesystem. Perhaps something like: bcachefs set-option --new-uuid[=<UUID>] <device>
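To make the two requests concrete, here is a minimal Python sketch of the behaviour being asked for. It is a toy model, not bcachefs code: `MountTable`, `DuplicateUUIDError`, and `new_uuid` are hypothetical names invented for illustration, the `nouuid` flag only imitates the XFS mount option mentioned above, and `new_uuid` only mirrors the shape of the proposed `--new-uuid[=<UUID>]` option.

```python
import uuid
from typing import Dict, List, Optional


class DuplicateUUIDError(Exception):
    """Raised when a filesystem with the same UUID is already mounted."""


class MountTable:
    """Toy stand-in for a driver's table of mounted filesystems, keyed by UUID."""

    def __init__(self) -> None:
        self._by_uuid: Dict[uuid.UUID, str] = {}
        self._unchecked: List[str] = []

    def mount(self, device: str, fs_uuid: uuid.UUID, nouuid: bool = False) -> None:
        if nouuid:
            # Hypothetical bypass, modelled on the XFS nouuid mount option:
            # the mount is accepted without consulting the UUID table.
            self._unchecked.append(device)
            return
        if fs_uuid in self._by_uuid:
            raise DuplicateUUIDError(
                f"{device}: UUID {fs_uuid} already mounted from {self._by_uuid[fs_uuid]}"
            )
        self._by_uuid[fs_uuid] = device

    def mounted_devices(self) -> List[str]:
        return sorted(list(self._by_uuid.values()) + self._unchecked)


def new_uuid(requested: Optional[str] = None) -> uuid.UUID:
    """Parse the requested UUID if one is given, otherwise generate a random one."""
    return uuid.UUID(requested) if requested else uuid.uuid4()
```

In this model, mounting a snapshot that carries the original's UUID raises `DuplicateUUIDError` unless `nouuid=True` is passed, or unless the snapshot is first given a fresh UUID from `new_uuid()`.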

Use case: Mounting a snapshot of a running filesystem is pretty much the only safe, atomic way to take a backup while a system is running.

ticpu commented 11 months ago

By the look of it, that seems more like a feature request than a bug.

clipcarl commented 11 months ago

> By the look of it, that seems more like a feature request than a bug.

Mounting LVM snapshots on running systems and changing filesystem UUIDs are part of extremely basic workflows, especially for storage administrators. As such, in my opinion, the inability to do those things is a bug. For example, it means bcachefs can't be used in many environments where atomic backups of running filesystems are taken. (At least not without major retooling to use bcachefs native snapshots instead, which may or may not be trustworthy.)

ticpu commented 11 months ago

It is not a bug in the sense that this has never been implemented before; it is not yet a supported use case (bcachefs over a block device snapshot). It does not prevent a user from using bcachefs on a normal block device. It seems to be very important to you, and to me as well for debugging (trying to fsck a broken bcachefs, for example). In those cases, you can always unmount the old FS and hide the original block devices, or export the snapshots via iSCSI or in a VM for another kernel to mount.

It also raises the point that all the devices must be frozen atomically before snapshotting, which raises even more concerns about the reliability of the resulting snapshots. This is a problem of integrating bcachefs with block device snapshot tools and still requires tooling to enable, even when support for changing the UUID is added, which is why I say it is a feature request. Don't forget that bcachefs exposes the UUID in /sys/fs/bcachefs; I don't know how that would work with something like nouuid.

clipcarl commented 11 months ago

> It is not a bug in the sense that this has never been implemented before; it is not yet a supported use case (bcachefs over a block device snapshot). It does not prevent a user from using bcachefs on a normal block device.

You're arguing semantics. Putting a filesystem on top of LVM is extremely common on Linux, and bcachefs needs to be usable in these common use cases. Whether or not you want to admit that being unable to mount an LVM snapshot is a bug is irrelevant. Either way, it needs to be fixed.

> In those cases, you can always unmount the old FS and hide the original block devices, or export the snapshots via iSCSI or in a VM for another kernel to mount.

No, in many cases it is not possible to unmount the old FS nor is spinning up a whole new server or VM just to mount a snapshot reasonable.

> It also raises the point that all the devices must be frozen atomically before snapshotting, which raises even more concerns about the reliability of the resulting snapshots.

Yes, LVM gives an atomic snapshot. That's why I used the word "atomic" in my original post.

> This is a problem of integrating bcachefs with block device snapshot tools and still requires tooling to enable ...

If what you're saying is that bcachefs does not make a best effort to ensure crash consistency the way every other modern filesystem does, then that's a problem. However, I don't believe you are correct.

Your replies in this report are not at all helpful and I would appreciate it if you would refrain from commenting further. Thanks.

Zygo commented 11 months ago

LVM snapshots are not atomic across multiple LVs, so a multi-device filesystem would have to either suspend writes to all devices while the snapshot LVs are created, or resynchronize the devices when the snapshots are mounted (as if the LVs corresponded to physical drives that were disconnected at different times, since that is what happens to the data as the snapshots are created). You are correct that this is not an issue for a filesystem using only a single LV.
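The point about multiple LVs can be illustrated with a small Python simulation. This is a toy model under stated assumptions, not how LVM or bcachefs actually track state: `Device`, `write_txn`, `frozen_snapshots`, and `unsynchronised_snapshots` are hypothetical names, and "consistency" is reduced to every snapshot having seen the same latest transaction number.

```python
from typing import Dict, List


class Device:
    """Toy block device: maps a block number to the last transaction written there."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[int, int] = {}

    def write(self, block: int, txn: int) -> None:
        self.blocks[block] = txn

    def snapshot(self) -> Dict[int, int]:
        # A point-in-time copy of this one device only.
        return dict(self.blocks)


def write_txn(devices: List[Device], txn: int) -> None:
    """A multi-device filesystem commits each transaction to every device."""
    for dev in devices:
        dev.write(0, txn)


def snapshots_consistent(snaps: List[Dict[int, int]]) -> bool:
    """Snapshots agree if every device saw the same latest transaction."""
    return len({max(s.values(), default=0) for s in snaps}) == 1


def frozen_snapshots(devices: List[Device]) -> List[Dict[int, int]]:
    """Writes are suspended, so nothing changes between the per-device snapshots."""
    return [dev.snapshot() for dev in devices]


def unsynchronised_snapshots(devices: List[Device], next_txn: int) -> List[Dict[int, int]]:
    """The filesystem keeps committing while the devices are snapshotted one by one."""
    snaps = []
    for i, dev in enumerate(devices):
        snaps.append(dev.snapshot())
        if i < len(devices) - 1:
            write_txn(devices, next_txn + i)
    return snaps
```

With two devices, `frozen_snapshots` yields snapshots that agree, while `unsynchronised_snapshots` leaves the first snapshot one transaction behind the second, which is the "drives disconnected at different times" situation described above. With a single device both approaches trivially agree.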

clipcarl commented 11 months ago

> LVM snapshots are not atomic across multiple LVs, so a multi-device filesystem ...

I did not say anything about multiple LVs or multi-device filesystems. In my opinion, building a multi-device filesystem on top of LVM would not make a lot of sense outside of testing scenarios.

But I should have pointed out that I'm trying to actually use bcachefs, not simply test it.