trapexit / mergerfs

a featureful union filesystem
http://spawn.link

MergerFS mount randomly disappears, only displays ??? when listed #1290

Closed gogo199432 closed 8 months ago

gogo199432 commented 10 months ago

Describe the bug

The MergerFS mount seems to disappear at random and returns "cannot access '/Storage': Input/output error" when I try to ls the filesystem root. At that point I need to restart the mergerfs service for it to reappear. However, this means I have to re-export my NFS share, which in turn means I have to remount or restart the services that use it.

I'm having a really hard time narrowing down the cause, to the point that I still have no idea why it happens. It has been happening since I set up MergerFS 1-2 months ago. For context, my storage is on a Proxmox box that runs one LXC container with my PostgreSQL server, one VM for Jellyfin, and several VMs that act as K3s nodes. The MergerFS mount is accessed through NFS in the Jellyfin VM and in all K3s nodes.

I have 4 disks, all with ext4 filesystems, mounted under /mnt as disk1-4. These are then merged and mounted under /Storage.

To Reproduce

As mentioned, it is really random; however, a scheduled backup that runs at midnight in Proxmox seems to be the most reliable trigger. Weirdly, even that fails at random points: sometimes the backup completes and the mount dies only after the backup has finished and sent my notification email, but I have also had instances where it disappeared mid-process.

I have also had it disappear while using Radarr or Sonarr to import media, but I have found those are not a reliable way to reproduce it.

Expected behavior

Function as expected. Shouldn't disappear and break NFS

System information:

[Service]
Type=simple
KillMode=control-group
ExecStart=/usr/bin/mergerfs \
  -f \
  -o cache.files=partial,moveonenospc=true,category.create=mfs,dropcacheonclose=true,posix_acl=true,noforget,inodecalc=path-hash,fsname=mergerfs \
  /mnt/disk* \
  /Storage
ExecStop=/bin/fusermount -uz /Storage
Restart=on-failure

[Install]
WantedBy=default.target
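
For readers comparing their own setup against this one, the `-o` string on the ExecStart line can be split into individual options. This is a minimal sketch in Python; the helper name `parse_mount_options` is made up for illustration, and it only splits the string without validating anything against mergerfs itself.

```python
def parse_mount_options(option_string):
    """Split a comma-separated -o string into a dict.

    Bare flags such as "noforget" map to True; "key=value" pairs map to
    the value as a string. Nothing is checked against mergerfs.
    """
    options = {}
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


# Option string copied from the ExecStart line in the unit file above.
OPTS = ("cache.files=partial,moveonenospc=true,category.create=mfs,"
        "dropcacheonclose=true,posix_acl=true,noforget,"
        "inodecalc=path-hash,fsname=mergerfs")

if __name__ == "__main__":
    # Print one option per line for easy side-by-side comparison.
    for key, value in parse_mount_options(OPTS).items():
        print(f"{key:20} {value}")
```

Printing the options one per line makes it easier to diff two setups, for example this one against the OMV6 options posted later in the thread.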

 - List of drives, filesystems, & sizes:
   - `df -h`

Filesystem            Size  Used Avail Use% Mounted on
udev                   16G     0   16G   0% /dev
tmpfs                 3.2G  2.9M  3.2G   1% /run
/dev/mapper/pve-root   28G   12G   15G  45% /
tmpfs                  16G   34M   16G   1% /dev/shm
tmpfs                 5.0M     0  5.0M   0% /run/lock
efivarfs              128K   50K   74K  41% /sys/firmware/efi/efivars
/dev/sdf              3.6T   28K  3.4T   1% /mnt/disk4
/dev/sde              3.6T   28K  3.4T   1% /mnt/disk3
/dev/sda               19T  2.2T   16T  13% /mnt/disk1
/dev/sdb               19T  1.3T   16T   8% /mnt/disk2
/dev/fuse             128M   20K  128M   1% /etc/pve
tmpfs                 3.2G     0  3.2G   0% /run/user/0
mergerfs               44T  3.4T   38T   9% /Storage

   - `lsblk -f`

NAME FSTYPE FSVER LABEL UUID FSAVAIL FSUSE% MOUNTPOINTS
sda ext4 1.0 20T_disk_1 3bea15fe-0c62-42ad-bc73-727c7e6ed147 15.1T 12% /mnt/disk1
sdb ext4 1.0 20T_disk_2 9306d268-2f54-42f3-958b-d8555b470bf0 15.9T 7% /mnt/disk2
sdc
├─sdc1
├─sdc2 vfat FAT32 EDFC-8E51
└─sdc3 LVM2_member LVM2 001 a9c81x-tUS5-CcN9-3w5u-ZF84-ODxd-r21cM9
  ├─pve-swap swap 1 3a923fc4-0c8f-4ba3-92a8-b3515283e669 [SWAP]
  ├─pve-root ext4 1.0 ecd841a9-5d7b-4d70-a575-448fb85d8f51 14.2G 43% /
  ├─pve-data_tmeta
  │ └─pve-data-tpool
  │   └─pve-data
  └─pve-data_tdata
    └─pve-data-tpool
      └─pve-data
sdd
├─sdd1 ext4 1.0 BigBackup 29c4fd80-9fc5-4d1d-a783-cba4372cffc0
└─sdd2 LVM2_member LVM2 001 pB5Bes-ARIU-XsLl-ryLc-Nw1A-Ofnj-OEzODe
  ├─vmdata-bigthin_tmeta
  │ └─vmdata-bigthin-tpool
  │   ├─vmdata-bigthin
  │   ├─vmdata-vm--101--disk--0
  │   ├─vmdata-vm--102--disk--0
  │   ├─vmdata-vm--103--disk--0
  │   ├─vmdata-vm--104--disk--0
  │   ├─vmdata-vm--105--disk--0
  │   ├─vmdata-vm--111--disk--0
  │   ├─vmdata-vm--107--disk--0
  │   └─vmdata-vm--100--disk--1 ext4 1.0 a2234f63-38da-43fb-877a-a3e836f4004e
  └─vmdata-bigthin_tdata
    └─vmdata-bigthin-tpool
      ├─vmdata-bigthin
      ├─vmdata-vm--101--disk--0
      ├─vmdata-vm--102--disk--0
      ├─vmdata-vm--103--disk--0
      ├─vmdata-vm--104--disk--0
      ├─vmdata-vm--105--disk--0
      ├─vmdata-vm--111--disk--0
      ├─vmdata-vm--107--disk--0
      └─vmdata-vm--100--disk--1 ext4 1.0 a2234f63-38da-43fb-877a-a3e836f4004e
sde ext4 1.0 4T_disk_1 06207dd1-fc54-4faf-805d-a880dc432bc4 3.4T 0% /mnt/disk3
sdf ext4 1.0 4T_disk_2 2b06a9fa-901c-4b66-bfdc-8c7e4a09f21f 3.4T 0% /mnt/disk4

 - A strace of the application having a problem:
   Unable to provide due to how the command was run (scheduler)
 - strace of mergerfs while the app tried to do its thing:
 (logfile was too large, had to zip it)
[mergerfs.trace.zip](https://github.com/trapexit/mergerfs/files/13847594/mergerfs.trace.zip)

**Additional context**

My NFS export:
`/Storage *(rw,sync,fsid=0,no_root_squash,no_subtree_check,crossmnt)`

All disks have gone through a long selftest using smartctl and report no problems. Example output of the first 20TB disk:

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
1 Extended offline Completed without error 00% 726 -

shocker2 commented 9 months ago

It seems the version is no longer shown:

 # mergerfs -V
mergerfs vunknown

trapexit commented 9 months ago

How are you building it?

shocker2 commented 9 months ago

make uninstall; make; sudo make install

trapexit commented 9 months ago

I mean... what are you downloading? You shouldn't be using the automatic "source code" download from github. You need to pull from git or use https://github.com/trapexit/mergerfs/releases/download/2.40.1/mergerfs-2.40.1.tar.gz

shocker2 commented 9 months ago

wget https://github.com/trapexit/mergerfs/archive/refs/tags/2.40.1.zip
unzip 2.40.1.zip
cd mergerfs-2.40.1;
make uninstall
make
sudo make install

shocker2 commented 9 months ago

~/mergerfs-2.40.1 # cat VERSION
unknown

trapexit commented 9 months ago

As I said, don't use the auto-generated source zip from GitHub.

https://github.com/trapexit/mergerfs?tab=readme-ov-file#build

Use https://github.com/trapexit/mergerfs/releases/download/2.40.1/mergerfs-2.40.1.tar.gz

What OS are you on? If redhat or debian/ubuntu you can just use my packages.
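
A build made from the wrong archive can be caught by checking what `mergerfs -V` reports. The sketch below is a hypothetical helper, not part of mergerfs. It assumes a proper build prints `mergerfs v<version>` in the same shape as the `mergerfs vunknown` line shown earlier in the thread.

```python
import re


def parse_mergerfs_version(output):
    """Return the version string from `mergerfs -V` output.

    Returns None when no version line is found or when the build
    reports "unknown", as happens with the auto-generated source zip.
    The "mergerfs v<version>" shape is an assumption based on the
    "mergerfs vunknown" output quoted in the thread.
    """
    match = re.search(r"mergerfs v(\S+)", output)
    if match is None or match.group(1) == "unknown":
        return None
    return match.group(1)
```

In practice the input would be the captured standard output of `mergerfs -V`. A `None` result suggests rebuilding from the release tarball or a git checkout.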

shocker2 commented 9 months ago

Sorry, I missed that. I have installed it from your link and it's working fine now. I'm running it on openSUSE 15.4.

shocker2 commented 8 months ago

Just some feedback after 3 days of usage. During this time I had two transfers running with a total write rate of about 300MB/s on mergerfs, and it's working great. No stale file handles, no crashes, no performance degradation. Thank you @trapexit for this awesome fix!

Janbong commented 8 months ago

Same here. Problem is completely gone after updating and configuring the new setting. Thanks @trapexit for the quick handling of this issue.

trapexit commented 8 months ago

What "new setting"? export-support should not be set to anything but true (the default) for usage with NFS. As the docs now say, that option is purely for debugging.

shocker2 commented 8 months ago

Just to give some final feedback: this seems to be solved and stable. I've been monitoring the NFS mounts every minute, and I've been migrating data 24/7 at ~300MB/s ever since (I have ~700TB to transfer). I believe this can be closed. Thank you once again for this awesome fix!
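
A per-minute mount check like the one mentioned here could be sketched as follows. This is not the poster's actual monitoring, only an illustration: the function name `check_mount` and its injectable `listdir`/`ismount` parameters are made up for this example. It treats `EIO` (the "Input/output error" from the original report), `ENOTCONN`, and `ESTALE` as signs of a dead or stale mount.

```python
import errno
import os

# errno values treated as "the mount is dead or stale" in this sketch.
DEAD_MOUNT_ERRNOS = {errno.EIO, errno.ENOTCONN, errno.ESTALE}


def check_mount(path, listdir=os.listdir, ismount=os.path.ismount):
    """Return (healthy, reason) for a mount point.

    The root is listed first, because a dead FUSE or NFS mount usually
    fails right there. Only afterwards is ismount() consulted, to catch
    the case where the filesystem is simply not mounted and the path is
    a plain directory.
    """
    try:
        listdir(path)
    except OSError as exc:
        if exc.errno in DEAD_MOUNT_ERRNOS:
            return False, f"dead mount: {exc.strerror}"
        return False, f"other error: {exc.strerror}"
    if not ismount(path):
        return False, "not a mount point"
    return True, "ok"
```

Run from cron or a systemd timer, a `False` result could trigger an alert or a service restart. Note that listing a hard-mounted NFS path can block indefinitely when the server is unreachable, so a real check would likely need a timeout around the call.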

One more off-topic question: I've seen the announcement of the new kernel v6.9 with FUSE passthrough. I know MergerFS will support it and that it has been requested for a while, but will it affect the NFS patch applied in v2.40.1?

trapexit commented 8 months ago

Not sure what you're asking. Will the passthrough feature impact the NFS issue? No. Why would it?

shocker2 commented 8 months ago

> Not sure what you're asking. Will the passthrough feature impact the NFS issue? No. Why would it?

Not the feature itself but the new kernel changes. I know that this issue was related to a kernel change in 5.14, and I was wondering if the new kernel will impact the workaround that you have applied :).

trapexit commented 8 months ago

No, it should have zero relation to one another.

bennydiamond commented 7 months ago

I just want to report that updating mergerfs from version 2.35.x to 2.40.2 fixed my issue. I too have an OMV6 VM running in Proxmox and exporting NFS shares on an internal Linux bridge. I was having tons of issues with out-of-band changes and stale file handles. My system would collapse in under 18 hours, sometimes a lot less. After the update, no such issue for 2 days now.

Here are my extra options for my mergerfs volume in OMV6's UI: `defaults,cache.files=auto-full,ignorepponrename=true,dropcacheonclose=true,inodecalc=path-hash,noforget`

Here are my extra options for my mergerfs NFS export: `subtree_check,insecure,anonuid=999,anongid=100,no_root_squash,fsid=a5fc31254ce8448783212d2b077d2190`

I will report further if things change but so far everything seems rock solid.