trapexit / mergerfs

a featureful union filesystem
http://spawn.link

Option to disable merger mount if one branch is unavailable #691

Open RomLecat opened 4 years ago

RomLecat commented 4 years ago

General description

An option to disable a mergerfs mount when a specific branch is unavailable would be great.

Expected behavior

For example, see this command: mergerfs -o async_read=false,use_ino,allow_other,auto_cache,func.getattr=newest,category.action=all,category.create=ff,threads=12 /mnt/one:/mnt/two /mnt/merger

It would be great to have an option, so if /mnt/one disappears (for example, if it's a network share and the server goes down), then access to /mnt/merger would throw an error.

Why? Because, for example, I have a Plex library bound to a MergerFS mount that merges a CIFS share and an rclone mount. If my CIFS share goes down for whatever reason and Plex refreshes its metadata, then it'll delete everything except what's inside my rclone mount. This is an issue with big libraries, as you have to refresh all of the content (which takes time), then remap if you need to in Plex, etc. I'd rather have my merger volume unavailable if my samba share goes down than have my library flushed.

Actual behavior

Using the above command, if /mnt/one goes down, /mnt/merger keeps working and shows only the content of /mnt/two.

trapexit commented 4 years ago

Need to define what "disappear" means. mergerfs is primarily just a proxy. It doesn't monitor the underlying branches. In what way does the CIFS mount respond when it fails? If it just disappears that's difficult to catch. Being empty is legit but recognizing it is difficult. It'd require a lookup on every failed file lookup (which happens all the time). Adding something like inotify would be platform specific but also change the relationship with mergerfs and the branches. Currently they aren't in use if not in use by an app through mergerfs. It'd force mergerfs to take a watch out on them which means they couldn't be umounted without removing them from mergerfs first.

What do you mean by "disable"? Return some specific error? ESTALE? ENOTCONN?

RomLecat commented 4 years ago

CIFS on Linux still has some pretty weird behaviour. When it's unavailable, I've seen two things happening while performing an ls, for instance (in those examples, /path is the CIFS mount point):

It could also be empty, not because the remote is unavailable, but because it failed to mount (for example, if the network on a secondary NIC did not come up in time, systemd will attempt to mount it but will fail, when using autofs/systemd automount).

Also, inotify is probably not an option here since it's not supported on CIFS shares anyway.

The best thing would be to handle those three cases. I'm not sure of which error would be the best, but the general idea would be to prevent any access to /path if some branch failed. I'm not sure which error returns "Resource temporarily unavailable" on ls, but this one could be nice, I guess (you probably know better than me which error is the most appropriate for this kind of situation).

trapexit commented 4 years ago

I'm not sure there is a good way to catch a generic ENOENT ("No such file or directory") as that's the error returned any time something is not there... and mergerfs is just looking for the file asked for. If you have 2 branches and 1 has a file X... when searching for the file naturally the OS will return ENOENT for the branch that doesn't have it. And if it's blocking or acting funny otherwise it's going to block mergerfs. It's at the whim of the OS. Right now it ignores errors on branch scanning unless it affects everything but I could make it optionally return an error if it got a non standard error like resource temporarily unavailable or ENOTCONN or ESTALE which are common for other network filesystems.
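To make the distinction above concrete, here is a minimal Python sketch (not mergerfs code) of how per-branch errors could be sorted into "expected" and "suspicious". The function names and the exact errno sets are assumptions made for illustration; `EAGAIN` is the errno whose message on Linux is "Resource temporarily unavailable".

```python
import errno
import os

# Normal when a branch simply does not hold the requested file.
EXPECTED_ERRNOS = {errno.ENOENT}

# Assumed list of errors that more plausibly point at an unhealthy branch,
# taken from the errors named in this thread.
SUSPICIOUS_ERRNOS = {errno.ENOTCONN, errno.ESTALE, errno.EAGAIN}


def classify_branch_error(err: OSError) -> str:
    """Return 'expected', 'suspicious', or 'other' for an error from one branch."""
    if err.errno in EXPECTED_ERRNOS:
        return "expected"
    if err.errno in SUSPICIOUS_ERRNOS:
        return "suspicious"
    return "other"


def probe_branch(path: str) -> str:
    """stat() a path on a branch and classify the outcome."""
    try:
        os.stat(path)
    except OSError as err:
        return classify_branch_error(err)
    return "ok"
```

Note that this only classifies errors that are actually returned. As described above, a branch that blocks instead of failing never reaches the `except` clause.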

As for mounting something not there... how will I know it's not mounted? The branch filesystem ID is the same as the directory above it? How should it act? When would it check? At mount time is possible but then what would it do? Fail to mount. This is tricky stuff to make generic. Especially since some of the behavior is perfectly legit in other scenarios.
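The "same filesystem ID as the directory above it" idea can be sketched in Python as follows. This is an illustration only, assuming that "filesystem ID" means the `st_dev` field returned by `stat`; the function name `looks_mounted` is made up for this example. It is the same heuristic that Python's `os.path.ismount` applies.

```python
import os


def looks_mounted(branch: str) -> bool:
    """Heuristic: a mount point usually has a different st_dev than its parent."""
    branch = os.path.realpath(branch)
    parent = os.path.dirname(branch)
    branch_stat = os.stat(branch)
    parent_stat = os.stat(parent)
    if branch_stat.st_dev != parent_stat.st_dev:
        return True
    # Same device and same inode means the path is the filesystem root.
    return branch_stat.st_ino == parent_stat.st_ino
```

The heuristic has known gaps: a bind mount from the same filesystem shares its parent's device ID, and the `stat` calls themselves can block on an unresponsive network mount. It also says nothing about when to run the check or what to do on failure, which are the open questions raised above.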

RomLecat commented 4 years ago

To be clear, the ENOENT error does not only happen to files/directories belonging to the branch, but also to the mount point of the branch. For example, if I have a /mnt/branch folder with a mounted CIFS share on it, then trying to ls /mnt/branch will also return ENOENT. Would it be possible to create a check for that?

The optional error when encountering ENOTCONN/ESTALE would also be great.

For the last case, I'd imagine it's pretty much the same case as being empty. Ideally, an option to disable the mergerfs mount if one of the branches is empty would be nice, but you said it's difficult to recognize it.

trapexit commented 4 years ago

As in /mnt/branch exists (it would have to to mount) but it returns ENOENT even though it's there?

Regardless... noticing that it's gone means it must be checked at some point. To check needs some trigger and if the trigger is ENOENT somewhere else (say /mnt/branch/foo/bar) then that would lead to many additional stat calls (the way you determine if something exists). It's possible to do; it just will increase the cost to all policies.

Empty is a legitimate situation and can only be known by actively checking. That check would need to be like above... every function that returns an error has to check /mnt/branch for link counts. Complicating that further is that the underlying directory can have anything in it. That's totally fine. And the files could change while mergerfs is mounted on top. Yes, it's uncommon but it's possible and legit. So I'd need to check for the filesystem ID for change which will need active checking.

I'll have to investigate exactly what cifs does. This is non-trivial.

RomLecat commented 4 years ago

As in /mnt/branch exists (it would have to to mount) but it returns ENOENT even though it's there?

Yes, the folder does exist and is where the mount is done, but if the mount is unavailable, then trying to ls does return ENOENT (I know, weird). Pretty much as if it checked the mount but not the folder itself.

Thanks a lot for your fast answers and your assistance on this, I really appreciate it! MergerFS is truly a great piece of software.

grokskookum commented 4 years ago

I tried asking for this feature on reddit and was rebuffed with the standard: "I don't follow, and when I do, this isn't an error and when it is, then it has nothing to do with mergerfs, etc."

I too enjoy this program immensely, but it is tiring getting brushed off like this... I am like, do you even use your own program? ... anyways.

https://www.reddit.com/r/DataHoarder/comments/dhsyu2/mergerfs_notification_when_a_member_is_missing/

"It doesn't. 1) There isn't a way AFAIK to find that information out without polling. 2) There is no reason for me to make it poll because the data is always queried when needed or the information is irrelevant to its function. It's completely valid for a mount to be unmounted or be remounted read only or run out of space. Besides respond as the API would in the same situation without mergerfs I'm not really sure what you're looking for." ~trapexit

anyways, as with most of the issues I run into on mergerfs I just deal with them outside of the program. in this instance, I just run a script to check the existence of specific files that only exist on each branch through the mount point and if one is missing then I un-mount the whole thing and barf across the screen.
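An external check of the kind described above might look roughly like this in Python. It is a sketch under assumptions, not the script the commenter actually runs: the function names are invented, the sentinel file names are whatever you place on each branch, and the default unmount step assumes the pool is a FUSE mount that `fusermount -u` can detach.

```python
import os
import subprocess
from typing import Callable, List, Sequence


def missing_sentinels(mountpoint: str, sentinels: Sequence[str]) -> List[str]:
    """Return the sentinel names that are not visible through the pool."""
    return [
        name for name in sentinels
        if not os.path.exists(os.path.join(mountpoint, name))
    ]


def default_unmount(mountpoint: str) -> None:
    # Assumption: the pool is a FUSE mount that `fusermount -u` can detach.
    subprocess.run(["fusermount", "-u", mountpoint], check=False)


def enforce(
    mountpoint: str,
    sentinels: Sequence[str],
    unmount: Callable[[str], None] = default_unmount,
) -> List[str]:
    """Unmount the pool if any sentinel is missing; return the missing names."""
    missing = missing_sentinels(mountpoint, sentinels)
    if missing:
        unmount(mountpoint)
    return missing
```

With one uniquely named file per branch (for example hypothetical names such as `.branch-one` and `.branch-two`), `enforce("/mnt/merger", [".branch-one", ".branch-two"])` could be run from cron or a systemd timer. The existence check goes through the pool, so it inherits whatever blocking behaviour the underlying branches have.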

trapexit commented 4 years ago

I wasn't brushing you off. You asked about visually notifying when something disappears. That's different from the above. In this case there are errors which can be caught but something not existing, as I mentioned in the thread you mentioned as well as above, is not easy to really know, can be costly to determine, and isn't precise. Filesystems are not consistent in how they fail. Managing that failure in a predictable way is difficult if not impossible in some cases. Some filesystem fails can freeze up the whole system. Network filesystems act differently from local.

If I'm to add anything to mergerfs in this space it needs to be generic enough to not cause false negatives. Watching files could be done but I'd prefer to add a way to easily cause mergerfs to go into a broken mode and monitor outside.

You asked for a specific feature rather than explaining the general problem space. This muddies the situation. I thought you literally just wanted a notification of a mount disappearing which isn't a thing that IMO should exist in mergerfs as it's a generic thing. Making mergerfs "break" when something broken is noticed is very different.

grokskookum commented 4 years ago

quote from my reddit post:

"I am just wanting a runtime status of mergerfs, which paths are actively being merged, if any of them are read-only or out of space. I understand that mergerfs is a best effort union solution, but I would like for it to fail loudly instead of trudging on when there is a problem. failing that, I'd at least like to get notified if a branch drops or goes read-only."

trapexit commented 4 years ago

Yes. read-only, out of space are generic things. I don't see why that is a mergerfs thing. It's not about being "best effort". It's about what is appropriate to be put into the product. I'm not aware of any filesystem that actively monitors such things. If ext4 finds an error it will optionally turn read-only and at best you get a log message in the kernel. I'm unaware of any other filesystem, userland or kernel space, that actively reports out of space or turning readonly.

Read-only is a perfectly valid behavior and it's not possible to be determined without writing to the drive. A statvfs won't return that it is readonly unless explicitly mounted readonly. If a drive flips readonly the only way to check is to write to it.
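The two checks contrasted above can be sketched in Python like this. It is an illustration only; the function names are made up, and whether the mount flag reflects a drive that flipped read-only after an error may depend on the kernel and filesystem, which is why the comment above treats an actual write as the reliable test.

```python
import os
import tempfile


def mounted_read_only(path: str) -> bool:
    """True if statvfs reports the read-only mount flag for this filesystem."""
    return bool(os.statvfs(path).f_flag & os.ST_RDONLY)


def write_probe(directory: str) -> bool:
    """Try to really write a small temporary file; False if any step fails."""
    try:
        # The temporary file gets a unique name and is removed on close.
        with tempfile.NamedTemporaryFile(dir=directory, prefix=".rw-probe-") as probe:
            probe.write(b"x")
            probe.flush()
            os.fsync(probe.fileno())
        return True
    except OSError:
        return False
```

The write probe has a cost the flag check does not: it touches the drive on every call, which is part of the argument above for leaving this to external monitoring.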

grokskookum commented 4 years ago

There are a lot of tunables in mergerfs. This is a case where there is actual harm done when one branch fails to mount or becomes unavailable. It's like talking to a fence post with you... just add a --die-quickly-when-branch-encounters-issue, how is that so incomprehensible to you? I mean honestly I don't care if you add the feature, I just wanted to keep track of the times people ask for this as a personal hobby.

for your edification:

basics of unix philosophy: Rule of Repair: When you must fail, fail noisily and as soon as possible. Rule of Economy: Programmer time is expensive; conserve it in preference to machine time. Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.

trapexit commented 4 years ago

how is that so incomprehensible to you?

You make this out to be trivial. It is not. I at no time said I was opposed to a feature in this space. You didn't ask for the above feature. You very explicitly asked about notification of things like read-only and out of space which are things better left to system monitoring tools as they are generic features.

Rule of Repair: When you must fail, fail noisily and as soon as possible.

Define failure. Explicitly. Your examples, as I pointed out, are not so simple or consistent. You ignoring that fact doesn't make it false.

grokskookum commented 4 years ago

I give up, peace.

trapexit commented 4 years ago

Given your attitude I assume you know more about filesystems and/or a way to accomplish your requests in a way that I am missing. Therefore I invite you to explain to me how to discover these situations in a clean and foolproof way and I'm happy to add it to mergerfs. Or feel free to submit a PR with said changes.

trapexit commented 4 years ago

@Hakujou I'll need to see the precise errors that CIFS, NFS, etc. throw in these situations to see how practical it is and where it could be managed. As I tried to explain some of these "errors" are common and valid and not indicative of a mount wide issue. It's also not been said what the "fix" is. Manual? Continue to try every function when requested and if no errors occur stop erroring? What if you umount cifs... the error will go away but now you're back to having nothing in that part of the pool.

Even if some of these are practical it feels like it'd need to be a number of individual settings. Some people simply don't want a bad branch to cause full breakage. That's a feature. Some people mount over top existing paths with content so "empty" isn't having nothing. There are a lot of permutations.

RomLecat commented 4 years ago

Okay, let me know if you need my help on anything, I'll try to help the best I can. For the fix, I'd assume the best way would be to "try every function when requested and if no errors occur stop erroring" IMHO, but as it's a design decision your opinion is probably better than mine :)

Sure, this behaviour is probably not wanted by everyone, keeping it as an option (disabled by default) would be perfect for those users.

trapexit commented 4 years ago

My point was that this might need to be multiple features. Each with their own setting. "failure" isn't just 1 thing and some might want one thing and not the other. It depends on everything I mentioned above. Remember that the filesystem is made up of all those functions listed in the docs. Each one behaves differently. Has different errors, has different error conditions, etc. Some use policy, some don't. A write failure isn't the same as a read failure. Failure is not always consistent. Etc.

RomLecat commented 4 years ago

Oh yes, that makes sense, indeed.

trapexit commented 4 years ago

Did you collect specific issues that you think would be used here? Or how mergerfs would respond? Are you positive Plex short circuits deletes if the whole mount is gone or returns an error? I just turned off deletes and it works fine. You just have to run the cleanup on occasion to manage actual deletes.

I think the most practical solution is to create a watchdog behavior: you could create any logic you want externally, and if you don't update the watchdog value, mergerfs returns an error for certain functions. But that needs to be explicitly defined. I don't want to implement something that doesn't work because the problem and possible responses haven't been fleshed out.

RomLecat commented 4 years ago

I'm using Plex in Docker, which is kinda special for this specific case, as it won't start at all if the mount is gone (as the mountpoint cannot be found). If the mountpoint is gone while Docker is already started, Plex doesn't remove elements from the catalog. Delete is also turned off on my side but it's a different issue. Plex does not delete any file by itself in any case regarding this issue, the problem is that when the next catalog scan is performed, Plex removes all deleted files from its catalog (which makes sense, since the files are gone, so this is normal behaviour for Plex, it won't keep catalog entries for removed files).

I agree on the watchdog, it looks like an ideal solution. I tried to work on something based on a witness file (like stop the container immediately if a file named .witness is gone from the MergerFS mount), but having a dead CIFS path causes a lot of issues on systemd automount (and probably regular mounts too): the process freezes anytime it tries to access the CIFS mountpoint, probably waiting for a (long) timeout.
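One way to keep a witness-file check from freezing along with a dead mount is to do the `stat` in a child process and give up after a timeout. The Python sketch below is an assumption-laden illustration, not something from this thread: the function name and the default timeout are invented, and a child stuck inside the kernel may linger even after `kill()`.

```python
import subprocess
import sys


def witness_visible(path: str, timeout: float = 5.0) -> bool:
    """Check for the witness file in a child process so a hung mount
    cannot freeze the caller."""
    # The child only calls os.stat(); exit status 0 means the file exists.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Treat a hang as a failure. kill() may not take effect while the
        # child is blocked in the kernel, so do not wait for it here.
        child.kill()
        return False
```

A caller could then stop the container when `witness_visible("/mnt/merger/.witness")` returns `False`. This only protects the checker; anything else touching the dead mount still blocks.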

trapexit commented 4 years ago

the problem is that when the next catalog scan is performed, Plex removes all deleted files from its catalog (which makes sense, since the files are gone, so this is normal behaviour for Plex, it won't keep catalog entries for removed files).

I understand the situation but Plex has the option for it not to remove the metadata for the files it sees missing. I've used that feature for years for this very purpose.

I tried to work on something based on a witness file (like stop the container immediately if a file named .witness is gone from the MergerFS mount), but having a dead CIFS path causes a lot of issues on systemd automount (and probably regular mounts too): the process freezes anytime it tries to access the CIFS mountpoint, probably waiting for a (long) timeout.

That would happen to mergerfs too if it was doing the check. This is why I was saying that this whole topic is more complicated than people often make it out to be. There are many error conditions and some of them are blocking. While the error wouldn't make much sense... today (and for the past few years the policy has been available) you could change any function (such as open) to erofs when you detect a problem. But that may or may not be good enough to manage any particular situation. Again... knowing exactly how to behave when a watchdog was triggered is critical. Most of the conversation prior was far too high level. I don't really care about what "unavailable" means if I'm not doing it but I do need to know what "disable" means.
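Switching a function's policy to erofs from an external script could be sketched in Python as below. This assumes the runtime configuration interface described in the mergerfs documentation, where settings are exposed as extended attributes on a `.mergerfs` control file inside the mount; the control file name and the `user.mergerfs.func.` key prefix are assumptions to verify against the docs for your version, and the helper names are invented.

```python
import os
from typing import Callable, Iterable, List, Optional, Tuple

CONTROL_FILE = ".mergerfs"           # assumed name of the runtime control file
KEY_PREFIX = "user.mergerfs.func."   # assumed xattr key prefix for function policies


def policy_xattrs(functions: Iterable[str], policy: str) -> List[Tuple[str, bytes]]:
    """Build the (xattr key, value) pairs that would set `policy` on each function."""
    return [(KEY_PREFIX + name, policy.encode()) for name in functions]


def set_policies(
    mountpoint: str,
    functions: Iterable[str],
    policy: str,
    setxattr: Optional[Callable[[str, str, bytes], None]] = None,
) -> None:
    """Apply `policy` to each function through the control file's xattrs."""
    setxattr = setxattr or os.setxattr  # os.setxattr is Linux-only
    control = os.path.join(mountpoint, CONTROL_FILE)
    for key, value in policy_xattrs(functions, policy):
        setxattr(control, key, value)
```

A watchdog script could call `set_policies("/mnt/merger", ["open"], "erofs")` when it detects a problem. To undo the change it would need to have read and stored the previous policy first, which this sketch does not do.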

RomLecat commented 4 years ago

the problem is that when the next catalog scan is performed, Plex removes all deleted files from its catalog (which makes sense, since the files are gone, so this is normal behaviour for Plex, it won't keep catalog entries for removed files).

I understand the situation but Plex has the option for it not to remove the metadata for the files it sees missing. I've used that feature for years for this very purpose.

Wow, I never actually understood what this option meant... thanks a lot for that, I just disabled it!

I tried to work on something based on a witness file (like stop the container immediately if a file named .witness is gone from the MergerFS mount), but having a dead CIFS path causes a lot of issues on systemd automount (and probably regular mounts too): the process freezes anytime it tries to access the CIFS mountpoint, probably waiting for a (long) timeout.

That would happen to mergerfs too if it was doing the check. This is why I was saying that this whole topic is more complicated than people often make it out to be. There are many error conditions and some of them are blocking. While the error wouldn't make much sense... today (and for the past few years the policy has been available) you could change any function (such as open) to erofs when you detect a problem. But that may or may not be good enough to manage any particular situation. Again... knowing exactly how to behave when a watchdog was triggered is critical. Most of the conversation prior was far too high level. I don't really care about what "unavailable" means if I'm not doing it but I do need to know what "disable" means.

I agree, this issue might happen to MergerFS too, there's probably a parameter to configure timeout for samba shares, maybe in /proc. I never really took time to check it. By "disable", I think the best option would be either ENOENT on the mountpoint itself, or EIO if any file within the mountpoint is accessed. EROFS would still make any software able to read within the mountpoint and therefore see any file removals.

trapexit commented 4 years ago

By "disable", I think the best option would be either ENOENT on the mountpoint itself, or EIO if any file within the mountpoint is accessed.

What do you mean exactly? For a stat? A readdir? Both? Everything? mergerfs can't make the mount disappear. It can only respond to requests from the kernel. Those largely align with all the functions defined in the docs. There are a few ways to interact with the filesystem. Software can use readdir to scan for files and then act on them. It can call functions like open, unlink, stat which take full paths that it already knows of. Every piece of software interacts in its own way and responds to different errors in its own way. For all I know one project like Plex may ignore everything if it gets an error but it's just as possible that something else might consider an error (or a specific error) the same as nothing being there. And having something that would accommodate multiple pieces of software would be unfun. Doing something that is generic and useful might not be practical. If Plex is pointing at the directory and gets an ENOENT when it looks... I would think it'd just consider it missing... cause that's what ENOENT means.

RomLecat commented 4 years ago

I meant for everything, but indeed some software might not react properly to FS errors. ENOENT would have worked for my case because I'm using Docker, and Docker doesn't start any container mapped to a non-existing folder, but indeed it only fits my use-case, not everything.

TBH I don't really have any better idea right now. EIO is the best case IMO, because if some software takes an EIO as non-existing file, that's on them, a FS error should not be treated as non-existing file, honestly. Unfortunately I don't really have a more versatile and convenient option to offer.

trapexit commented 4 years ago

Docker doesn't start any container mapped to a non-existing folder, but indeed it only fits my use-case, not everything.

Yes, but you can have errors while Plex is running. I've never had a situation where a remote filesystem happens to be unavailable when the software starts but it happens regularly for many different reasons when it is running. It's really no different from a drive dying.

I think if this is still a feature of interest then the best I could do is make an externally controlled watchdog API where someone can do whatever they want. mergerfs would check whether the watchdog is triggered and, on certain functions, return a user-defined errno value. My fear though is that it could lead to data corruption. If it returns EIO on stats... what happens to software that is in the middle of writing a file and stats it? Or renames it? Or whatever? I really would not want to try to keep track of every file in use just for this very niche feature. I don't need to do it today.
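The watchdog idea above could be modelled roughly as follows. This is a hypothetical Python sketch, not an existing mergerfs feature or API: the class, its parameters, and the set of guarded function names are all invented to illustrate "external heartbeat, user-defined errno on certain functions".

```python
import errno
import os
import time
from typing import Callable, Iterable


class Watchdog:
    """Hypothetical heartbeat watchdog; not part of mergerfs."""

    def __init__(
        self,
        timeout: float,
        fail_errno: int = errno.EIO,
        guarded: Iterable[str] = ("open", "getattr", "readdir"),
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout = timeout
        self.fail_errno = fail_errno
        self.guarded = frozenset(guarded)
        self._clock = clock
        self._last_beat = clock()

    def beat(self) -> None:
        """Called by the external monitor while it considers the pool healthy."""
        self._last_beat = self._clock()

    def triggered(self) -> bool:
        """True once the monitor has stayed silent for longer than the timeout."""
        return self._clock() - self._last_beat > self.timeout

    def check(self, function: str) -> None:
        """Raise the configured errno for guarded functions while triggered."""
        if function in self.guarded and self.triggered():
            raise OSError(self.fail_errno, os.strerror(self.fail_errno))
```

Even in this toy form the concern raised above is visible: a program that opened a file before the heartbeat stopped keeps writing, because writes are unguarded here, but its next `getattr` fails. Which functions to guard and which errno to return are exactly the parts that would need an explicit definition.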

RomLecat commented 4 years ago

Plex does seem to handle it nicely, but that might be Docker doing something in the middle. I'm not quite sure.

That would allow each user to tweak it according to their need. Regarding data corruption, I don't see this being really different from a dead drive or network mount going down? Shouldn't this be handled by the FS ?

trapexit commented 4 years ago

Regarding data corruption, I don't see this being really different from a dead drive or network mount going down? Shouldn't this be handled by the FS ?

But this isn't a drive dying or mount going down. If a bittorrent client or downloader of some sort or a copy is happening to other branches and the watchdog suddenly starts returning errors when the underlying drives are absolutely fine then you could get all kinds of weird or broken states. Well written software generally should be mostly stateless and atomic but that's definitely not always the case. mergerfs could just block till "fixed" but that has its own risks and complications.

trapexit commented 4 years ago

Docker doesn't do anything special. It's just bind mounting paths. Nothing you can't do on your own. It just fails if the source or target path don't exist. That's just how mounting works. If Plex is running and does a scan and the data is missing... then your data will be removed from the DB.

RomLecat commented 4 years ago

Fair point, I don't see any solution to poorly written software. Blocking until fixed would mitigate but not solve the issue. What happens, for example, if the server goes down while MergerFS is waiting? However, it sounds acceptable to me if the option is not on by default, don't you think ? If the user decides to tweak MergerFS behaviour, he should know bad things could happen.

Plex doesn't remove the data when the network share is directly mounted, probably because it doesn't get ENOENT, but ENOTCONN or ESTALE (as I'm using a CIFS share). Not sure how the bind mount reacts to that on top of it, tho.

trapexit commented 4 years ago

Fair point, I don't see any solution to poorly written software.

What I don't want to do is make it easier for people to corrupt their data. It's why I've been resistant to adding writing to multiple files for "striping". Error handling is non-trivial and some of the usecases people have described to me are more risky than alternative methods in accomplishing the same thing. Having the software itself manage these situations, like the Plex "Empty trash" thing, works a lot better.

What happens, for example, if the server goes down while MergerFS is waiting?

The kernel would ignore any response regarding the process or interrupt the request but mergerfs would be blocked and can only read so many requests. This is one thing I mean when saying it has its own complications.

bind mount doesn't impact it at all. it's just an alias to another path.

I have some other things that are more important to look into but I can look into the watchdog. Will need to see how exactly some of this software reads and responds to the system.

maximuskowalski commented 4 years ago

@Hakujou I am looking for information on mergerfs with inotifywait and just thought I would let you know I made a very simple script to stop my plex / jellyfin / emby etc dockers by checking for existence of an anchor file last week. Script is here if it helps. Apologies if I have misunderstood what you want to do.

RomLecat commented 4 years ago

@maximuskowalski Thanks a lot for sharing! That's the kind of thing I'm looking for, indeed.

trapexit commented 4 years ago

Manipulating the Plex database seems a bit more risky than just enabling the Plex feature to ignore missing files till explicitly triggered.

Regardless, out of band orchestration is what I had mentioned originally. It's easy to create watchdogs that control behavior based on whatever it is you want and is completely independent from mergerfs.

Until someone can explicitly define the behavior they'd want from mergerfs, at a low level because that's all there is, I really can't do anything in this space. And I really believe this is dangerous and likely to result in data corruption. Returning some error while in the middle of writing a file would leave broken files around and the severity is entirely dependent on the program in question.