Permissions inconsistent with actual filesystem

winterqt commented 11 months ago

Describe the bug

When accessing a file through mergerfs, I sometimes get EACCES on one read and then it works fine on the next, with no change in permissions in between. I cannot reproduce this when accessing the drive in question directly, even though the strace shows the error coming from the drive (see below).

To Reproduce

Create two users, and two groups: a and b.
Add a to the group b.
Create a large file owned by b:b and chmod it 640 (I cannot reproduce this with smaller files, e.g. echo foo > bar).
Run cat bar | head -c 1 as user a many times. You'll see the content on one execution, then a permission denied error on the next, and so on. It's not consistent.
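
The repeated read in the last step can be scripted so the failure rate is easy to see. This is a minimal sketch in Python, assuming it is run as user a; the function name and the path in the usage note are hypothetical and not part of the original report.

```python
import errno
import os
from collections import Counter


def count_open_results(path, attempts=100):
    """Open `path` read-only `attempts` times and tally the outcomes.

    Keys are "ok" or the symbolic errno name of the failed open (e.g. "EACCES").
    """
    results = Counter()
    for _ in range(attempts):
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError as exc:
            # Map the numeric errno to its name; fall back to the number itself.
            results[errno.errorcode.get(exc.errno, str(exc.errno))] += 1
        else:
            try:
                os.read(fd, 1)  # read a single byte, like `head -c 1`
            finally:
                os.close(fd)
            results["ok"] += 1
    return results
```

Calling count_open_results("/pool/bar") on an affected pool would be expected to show a mix of "ok" and "EACCES" counts, while the same call against the file on the drive itself should show only "ok".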

Expected behavior

For the permissions to work properly, as they do when directly accessing the file in question outside of the pool.

System information:

OS, kernel version: Linux 6.1.43 #1-NixOS SMP PREEMPT_DYNAMIC Thu Aug 3 08:24:19 UTC 2023 x86_64 GNU/Linux
mergerfs version: 2.36.0
mergerfs settings: cache.files=partial,dropcacheonclose=true,category.create=mfs,noatime
List of drives, filesystems, & sizes: will provide on request, they're all ext4
A strace of the application having a problem: openat(AT_FDCWD, "/pool/file", O_RDONLY) = -1 EACCES (Permission denied)
strace of mergerfs while app tried to do its thing: openat(AT_FDCWD, "/drive/file", O_RDONLY|O_LARGEFILE) = -1 EACCES (Permission denied)

Additional context

I'm really not sure what's going on here. The straces are pretty clear that it's the filesystem on the drive that's screwing up, but I cannot reproduce this at all when directly accessing the file on the drive...

If you need the rest of the straces (though I don't see how it'd be relevant here), let me know.

winterqt commented 11 months ago

I can actually reliably reproduce this with another user on this machine (that is, I consistently get EACCES from the pool, but it works fine from the drive directly). Syscalls are the same, though... 🤔

trapexit commented 11 months ago

Are you interacting with the filesystem before doing all that?

mergerfs has a supplemental group cache because looking up supplemental groups is pretty expensive.

https://github.com/trapexit/mergerfs#supplemental-user-groups

And every thread has its own cache. So my bet is you have triggered the cache at different states of setup and so you get different results. Restart mergerfs and I would suspect it works as expected.
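
To illustrate how a per-thread cache can give different answers for the same user, here is a simplified model in Python. It is not mergerfs code: the class name, the uid/gid values, and the in-memory membership table standing in for the real group lookup are all assumptions made for the sketch.

```python
import threading


class PerThreadGroupCache:
    """Each thread keeps the first group list it saw for a uid and never refreshes it."""

    def __init__(self, lookup):
        self._lookup = lookup            # stands in for the expensive group lookup
        self._local = threading.local()  # separate storage per thread

    def groups(self, uid):
        cache = getattr(self._local, "cache", None)
        if cache is None:
            cache = self._local.cache = {}
        if uid not in cache:
            cache[uid] = tuple(self._lookup(uid))
        return cache[uid]


def demo():
    membership = {1000: [1000]}          # hypothetical: uid 1000 is user a, not yet in group b
    cache = PerThreadGroupCache(lambda uid: list(membership[uid]))

    cache.groups(1000)                   # this thread caches the list before the change
    membership[1000].append(2000)        # user a is added to group b (gid 2000)

    result = {}
    other = threading.Thread(
        target=lambda: result.setdefault("fresh", cache.groups(1000))
    )
    other.start()
    other.join()

    stale = cache.groups(1000)           # the first thread still returns its old answer
    return stale, result["fresh"]
```

In this model demo() returns one group list without gid 2000 and one with it, so whether a request is allowed depends on which thread happens to serve it. That would match seeing the read succeed one moment and fail the next.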

It is on my todo list to add an occasional timeout to the caches, but right now it doesn't do that since group changes are pretty rare. The description there needs some updating, but I was holding off until I added a timeout to the cache. At the end of the day, though, it will never be 100% accurate: there is no way to know that the grouping changed in order to invalidate the cache, and it is too expensive to check regularly. The best I can really do is a general timeout and/or manual flushing.
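
A general timeout plus manual flushing could look roughly like the following Python sketch. It only illustrates the idea and is not how mergerfs implements or will implement it; the class name, the default TTL, and the injectable clock are assumptions.

```python
import time


class ExpiringGroupCache:
    """Cache group lists per uid, refreshing an entry once it is older than the TTL.

    No locking is done here; the sketch assumes one cache instance per thread.
    """

    def __init__(self, lookup, ttl_seconds=3600.0, clock=time.monotonic):
        self._lookup = lookup      # stands in for the expensive group lookup
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the expiry can be tested
        self._entries = {}         # uid -> (expires_at, groups)

    def groups(self, uid):
        now = self._clock()
        entry = self._entries.get(uid)
        if entry is None or now >= entry[0]:
            # Missing or expired: pay for the lookup again and restart the timer.
            entry = (now + self._ttl, tuple(self._lookup(uid)))
            self._entries[uid] = entry
        return entry[1]

    def flush(self):
        """Manual invalidation, e.g. after an administrator changes group membership."""
        self._entries.clear()
```

The trade-off is the one described above: a change in group membership stays invisible until the entry expires or someone flushes the cache, so a shorter TTL gives fresher answers at the price of more lookups.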

winterqt commented 11 months ago

Thanks, it looks like this was the culprit.

On an unrelated note (unless you'd rather I ask in a new issue): I copied a large amount of data to this pool using rsync, with the above mergerfs settings and two drives of equal capacity. Somehow, an entire directory's worth of files (let's call the directory /a) ended up on just one drive, yet an empty /a directory was also created on the second drive. Does this sound normal? I don't understand how such a thing would happen -- why would an empty /a directory be created if no file was put there?

I don't know why this would be the case, it's not like rsync is parallel. Furthermore, if I just mkdir /pool/b, /b only gets created on one drive. So... not sure why this would be happening.

(Hopefully that makes sense; it's not that big of an issue, just a curiosity if anything.)

trapexit commented 11 months ago

Without a more explicit description of exactly what the original state is and what actions are taken I can't really speak to it.

Everything works as described in the docs. If you have category.create=mfs then the branch with the most free space is chosen for the function in question.
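
As a rough illustration of what "most free space" means, the selection can be sketched in Python as below. This is a simplification and not mergerfs's implementation: it leaves out any additional filtering the real policy applies, and the function name and branch paths are hypothetical.

```python
import shutil


def pick_mfs_branch(branches, free_space=lambda path: shutil.disk_usage(path).free):
    """Return the branch that currently reports the most free space.

    `free_space` maps a branch path to free bytes; by default it asks the
    filesystem, but it can be replaced with a lookup table for testing.
    """
    if not branches:
        raise ValueError("no branches to choose from")
    return max(branches, key=free_space)
```

For example, with free = {"/drive1": 100, "/drive2": 250}, calling pick_mfs_branch(list(free), free_space=free.get) selects "/drive2". Because the choice is re-evaluated as free space changes, two drives of equal capacity can each be picked at different points during a long copy.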

Permissions inconsistent with actual filesystem #1223