koverstreet / bcachefs-tools

http://bcachefs.org
GNU General Public License v2.0
120 stars 89 forks source link

What's really the maximum number of replicas? (`BCH_REPLICAS_MAX`) #238

Open kristianlm opened 7 months ago

kristianlm commented 7 months ago

Hi, and thanks for creating bcachefs! I've been watching/supporting this project for a while and I'm excited it now finally lives in mainline kernel.

I want to experiment with bcachefs on 4 very old disks that I don't trust. So I wanted to have 4 copies of the metadata, but the error messages I'm getting are a little confusing and inconsistent. Let's have a look:

> sudo mkfs.bcachefs --metadata_replicas=4 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3                                                                                                             
Invalid option metadata_replicas: too big (max 4)

I would expect that to work, since maximum values usually are inclusive. If I issue this instead:

> sudo mkfs.bcachefs --replicas=4 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 >/dev/null && echo okay
okay

That works. But it probably shouldn't have, because when I try to mount that, I get this error:

> sudo mount -t bcachefs -o verbose,fsck /dev/loop0:/dev/loop1:/dev/loop2:/dev/loop3 /mnt
mount: /mnt: mount(2) system call failed: Numerical result out of range.

With dmesg showing the same error message as --metadata_repliacas=4:

[244418.757806] bcachefs (/dev/loop0): error validating superblock: Invalid option metadata_replicas: too big (max 4)

The different behaviour of --replicas and --{meta}data-replicas is probably straight-forward to fix. However, I think this goes deeper. Aren't maximum values normally inclusive? The replica limit seems to be defined here:

#define BCH_REPLICAS_MAX        4U

Based on its wording and its use in this codebase, I would expect 4 replicas to be supported. Here's an example from this struct:

struct btree_node_read_all {
    struct closure      cl;
    struct bch_fs       *c;
    struct btree        *b;
    unsigned        nr;
    void            *buf[BCH_REPLICAS_MAX];
    struct bio      *bio[BCH_REPLICAS_MAX];
    blk_status_t        err[BCH_REPLICAS_MAX];
};

Which allocates a total of 4 elements - one for each of the replicas, no? I'm guessing that BCH_REPLICAS_MAX really is inclusive and that 4 replicas should be supported. A very naîve fix here would be to replace > with <= like this (in bcachefs-tools and kernel):

diff --git a/libbcachefs/opts.c b/libbcachefs/opts.c
index b1ed0b9..42bbae1 100644
--- a/libbcachefs/opts.c
+++ b/libbcachefs/opts.c
@@ -268,7 +268,7 @@ int bch2_opt_validate(const struct bch_option *opt, u64 v, struct printbuf *err)
                return -BCH_ERR_ERANGE_option_too_small;
        }

-       if (opt->max && v >= opt->max) {
+       if (opt->max && v > opt->max) {
                if (err)
                        prt_printf(err, "%s: too big (max %llu)",
                               opt->attr.name, opt->max);

But I don't know where that would take us. Please consider looking into this. Thank you!

koverstreet commented 6 months ago

I'd like to just get rid of BCH_REPLICAS_MAX, if we can. Now that we've got DARRAY_PREALLOCATED(), this out to be feasible.