trapexit / mergerfs-tools

Optional tools to help manage data in a mergerfs pool
ISC License

mergerfs.balance is not compatible with zfs pools #118

Closed by joakimlemb 3 years ago

joakimlemb commented 3 years ago

I'm running mergerfs on top of 3 zfs pools, and mergerfs.balance reports the wrong free space values when I use it to balance them:

mergerfs.balance -s 500M /mnt/mergerfs-storage/
Branches within 2.0% range:
 * /mnt/zpool_1: 100.00% free
 * /mnt/zpool_2: 100.00% free
 * /mnt/zpool_3: 100.00% free

zpool list:
NAME     SIZE   ALLOC  FREE   CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
zpool_1  9.09T  641G   8.47T  -        -         0%    6%   1.00x  ONLINE  -
zpool_2  7.27T  6.93T  342G   -        -         0%    95%  1.00x  ONLINE  -
zpool_3  7.27T  7.00T  275G   -        -         0%    96%  1.00x  ONLINE  -

trapexit commented 3 years ago

It uses statfs and reports what the kernel tells it.

https://github.com/trapexit/mergerfs-tools/blob/master/src/mergerfs.balance#L144

What does df -h say?
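
For readers following along, the statfs-based calculation can be sketched in a few lines of Python. This is a minimal illustration and not the code from mergerfs.balance itself; the function name `percent_free` is hypothetical, and it assumes free space is taken as the blocks available to unprivileged users (`f_bavail`) over the total block count (`f_blocks`).

```python
import os


def percent_free(path):
    """Return the free-space percentage the kernel reports for `path`.

    Hypothetical helper for illustration only. It trusts os.statvfs(),
    so it can only be as accurate as the filesystem's own answer.
    """
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize  # filesystem size in bytes
    avail = st.f_bavail * st.f_frsize  # bytes available to unprivileged users
    if total == 0:
        return 0.0  # pseudo-filesystems can report zero blocks
    return 100.0 * avail / total


if __name__ == "__main__":
    for branch in ("/",):  # replace with the branch mount points of the pool
        print(f" * {branch}: {percent_free(branch):.2f}% free")
```

If the filesystem mounted at a branch reports almost nothing used, as in the output above, a calculation like this will show roughly 100% free regardless of how full the underlying pool is.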

joakimlemb commented 3 years ago
df -h | grep zpool
zpool_1  8.9T  1.8T  7.1T  20%  /mnt/zpool_1
zpool_2  34G   256K  34G   1%   /mnt/zpool_2
zpool_3  1.2T  256K  1.2T  1%   /mnt/zpool_3

This seems to be a "problem" when using datasets in zfs. There is a similar issue with Samba disk space reporting: https://stanislavs.org/reporting-correct-space-usage-for-samba-shared-zfs-volumes/ Some more details: https://oshogbo.vexillium.org/blog/65/

So when datasets are used, ZFS reports disk usage to the kernel differently than other filesystems do, and the only way to get the correct numbers is to ask ZFS with "zfs get". That basically makes any non-zfs-aware script or application report incorrect disk usage. I have not been able to find any option in ZFS to change this either.
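
As a rough sketch of what asking ZFS directly could look like, the Python below shells out to `zfs get` in scripted, parsable mode (`-H -p`) and derives a free percentage from the `used` and `available` properties. The helper names (`parse_zfs_space`, `percent_free_from_space`, `zfs_percent_free`) and the dataset name are hypothetical, and it assumes that the `used` property of a pool's root dataset includes the space consumed by its child datasets.

```python
import subprocess


def parse_zfs_space(output):
    """Parse tab-separated `property<TAB>value` lines into a dict of ints."""
    values = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        prop, value = line.split("\t")
        values[prop] = int(value)
    return values


def percent_free_from_space(space):
    """Free percentage from a dict holding 'used' and 'available' in bytes."""
    total = space["used"] + space["available"]
    if total == 0:
        return 0.0
    return 100.0 * space["available"] / total


def zfs_percent_free(dataset):
    """Ask ZFS for the space of `dataset` (hypothetical name, e.g. 'zpool_2')."""
    result = subprocess.run(
        ["zfs", "get", "-H", "-p", "-o", "property,value",
         "used,available", dataset],
        check=True,
        capture_output=True,
        text=True,
    )
    return percent_free_from_space(parse_zfs_space(result.stdout))
```

Spawning a process per query is tolerable for an occasional run of a balancing script, but it is much heavier than a single statfs call.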

Not sure if you want this open as an issue or maybe convert it to a feature request to make it "zfs aware"?

trapexit commented 3 years ago

It's pretty unreasonable for the zfs devs to expect the world to use bespoke tooling to do something that has been pretty standard in POSIX filesystems for decades. Adding zfs awareness to this tool is plausible, but there is no way I can do what is suggested in mergerfs proper, which is where it would matter most. If there were an efficient ioctl call or something similar then maybe, but what is proposed in those links is a complete no-go.

joakimlemb commented 3 years ago

I completely agree with that sentiment (it reminds me of the "don't break userspace" rule of kernel development). There is probably no point in making mergerfs.balance ZFS aware when mergerfs itself is not.

It might be a good FAQ entry in the readme though. After all, it works fine if you don't use any datasets in the pool, but then you lose a lot of the flexibility that datasets provide.