zargony / fuse-rs

Rust library for filesystems in userspace (FUSE)
MIT License

Filesystem is not fully up and running when spawn_mount returns #9

Open MicahChalmer opened 10 years ago

MicahChalmer commented 10 years ago

Consider this snippet from #8:

    let _session = fuse::spawn_mount(HelloFS, &mountpoint, []);

    // TODO: Without this, the test fails--the filesystem is not fully set up by the time the next
    // line tries to use it.  What guarantee should we make (if any) about this?  Should it be fully
    // ready before spawn_mount returns?
    std::io::timer::sleep(100);

    let hello_contents = File::open(&mountpoint.join("hello.txt")).read_to_end();
    assert_eq!(hello_contents, hello_world.as_bytes().into_owned());

Without the sleep call, the test fails.

Even waiting for init to be called on the filesystem does not suffice. I tried using that to sync (the init call would send on a channel, with the test code receiving on the corresponding port in place of the sleep) and it still did not work without the additional sleep call.
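For illustration, the synchronization attempt described above could look roughly like the sketch below. It does not use the fuse-rs API: `SignallingFs`, `spawn_mock_session`, and `wait_for_init` are hypothetical stand-ins, and the background thread only imitates a session that calls `init` once. The sketch shows the channel handshake itself, with a timeout so the waiter cannot hang forever.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a filesystem whose `init` callback signals readiness.
struct SignallingFs {
    ready: Sender<()>,
}

impl SignallingFs {
    /// Imitates the init callback: tell the waiter that init has run.
    fn init(&mut self) {
        // A send error only means the receiver stopped waiting, so ignore it.
        let _ = self.ready.send(());
    }
}

/// Hypothetical stand-in for spawn_mount: runs a mock session on a
/// background thread and hands back the receiving end of the channel.
fn spawn_mock_session() -> Receiver<()> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        let mut fs = SignallingFs { ready: tx };
        fs.init();
    });
    rx
}

/// Block until init has signalled, or give up after `timeout`.
fn wait_for_init(ready: &Receiver<()>, timeout: Duration) -> bool {
    ready.recv_timeout(timeout).is_ok()
}
```

As reported above, on OS X a handshake of this kind was not sufficient on its own: the mount point was still not usable when the signal arrived.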

I'm not even sure what guarantee should be in place--my instinct is to say that the FS should be usable when spawn_mount returns though.

zargony commented 10 years ago

I agree that the FS should be usable right after spawn_mount returns. I assumed that a filesystem is usable once the channel to the kernel driver is open, with requests simply queuing up until we read and answer them. However, maybe the kernel checks whether initialization is done before sending any requests to us. That would require us to wait for FUSE_INIT to be done (like you tried). Maybe the kernel driver also needs some additional time after we reply to FUSE_INIT. I'm not sure how to catch that - maybe we need to wait for the first non-init request to happen (which doesn't sound very nice to me).

MicahChalmer commented 10 years ago

One more observation: this problem doesn't appear on my linux box. If I remove the call to sleep from the hello example (which was there to work around this issue) then on my linux machine it still consistently succeeds. I don't even have to wait for FUSE_INIT. However, even here I don't know if this timing is a guarantee that the FUSE developers intend to support, or if it's a race condition that just so happens to come out the successful way on my linux machine.

OS X (and not linux) seems to always issue some requests as soon as the filesystem is mounted, so an OSX-only workaround in which we wait until the first non-init request would probably be relatively assured of not hanging. But in the end that's not much better than sleep--it papers over the problem without understanding what causes it in the first place.

I don't know how the kernel could "check if initialization was done" when it doesn't even get a reply back from FUSE_INIT. (In fact, we should probably make the init fn on Filesystem return unit instead of Result<(),c_int>.) I think some digging into osxfuse (and the linux version of fuse, for that matter) will be needed to resolve this.

zargony commented 10 years ago

I took a look at the osxfuse kernel driver to find out what happens, but unfortunately that didn't help much.

I can confirm that the kernel driver always does a statvfs on the newly mounted filesystem right after FUSE_INIT.

FUSE_INIT is a purely fuse-related mechanism to exchange supported version numbers between the kernel driver and the userspace library. The kernel driver sends FUSE_INIT during mounting of the fuse filesystem. Responding with an error to FUSE_INIT should result in failing to mount the filesystem.

The kernel driver provides a new filesystem type to the kernel that can be mounted like any other filesystem, but needs some special options when being mounted. Therefore fuse filesystems are not mounted with mount but with fusermount (mount_osxfusefs on OSX). It does the same as mount does, but makes sure to send along some options (like the fd of the userspace control channel).

So mounting a new fuse filesystem is the same as mounting any other filesystem, and the kernel in the end delegates it to the fuse_vfsop_mount function of the fuse kernel driver. This function parses the additional mount options, sets up some data structures and sends FUSE_INIT to the userspace. It then synchronously waits for the FUSE_INIT reply and does the statvfs. It finally returns success or an error to the kernel. If this function returns an error, then mounting fails (i.e. fusermount prints an error - or in our case, fuse_mount_compat25 returns -1 and errno is set to the error returned in fuse_vfsop_mount).

It seems that the Darwin kernel needs additional time after the return of fuse_vfsop_mount before the new mount point can be accessed, while Linux doesn't. This doesn't seem to be a problem in the fuse kernel driver since the fuse driver is ready to take requests after the mount. It seems to be related to some Darwin kernel internals that we unfortunately can't dig into.

zargony commented 10 years ago

I think I might have found the reason for this. I think it's caused by the FUSE userspace library. fuse_mount_compat25 forks to mount the filesystem but it doesn't care to waitpid() for the process to exit.

FUSE on Linux (mount.c) uses the mount() libc function to mount a filesystem and falls back to executing fusermount (and waiting for it to complete) if the kernel doesn't allow unprivileged mounts. OSXFUSE mounting (mount_darwin.c) seems to be derived from BSD (mount_bsd.c), which always executes mount_osxfusefs (mount_fusefs on BSD). Both do fork() and exec() but don't waitpid() for the forked process to end before returning. I.e. on OSX and BSD, fuse_mount_compat25 returns before the actual mount has been completed (and probably also doesn't pick up a possible mount failure caused by the kernel or FUSE_INIT).
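The difference between the two behaviours can be sketched with Rust's standard `std::process` API. This is only an analogy, not the libfuse code: the helper names are hypothetical, and a plain `sh -c` script stands in for the mount helper. `spawn()` returns as soon as the child has been started, as fork() plus exec() without waitpid() does, while `status()` blocks until the child exits and reports its exit code.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};

/// Start a helper and return immediately, without waiting for it.
/// The caller cannot yet know whether the helper succeeded.
fn launch_without_waiting(script: &str) -> io::Result<Child> {
    Command::new("sh").arg("-c").arg(script).spawn()
}

/// Start a helper and block until it has exited, so that its
/// exit status (success or failure) is known on return.
fn launch_and_wait(script: &str) -> io::Result<ExitStatus> {
    Command::new("sh").arg("-c").arg(script).status()
}
```

With the first variant, a caller that proceeds right away is racing the helper, and a non-zero exit status goes unnoticed unless someone later calls `wait()` on the returned `Child`.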

MicahChalmer commented 10 years ago

So...given what you found out about why this problem occurs, the options for fixing this appear to be:

  1. Rewrite the fuse_mount_compat25 in rust, thus allowing us to wait on the child process. This would remove the dependence on the FUSE c client library entirely. It would need two completely different versions between the two OSes, and might bind the rust library to a specific kernel driver version. (I'm not sure how much the FUSE developers allowed for disparity between the FUSE library version and the kernel extension version--looking at the osxfuse code, it looks like it will error out if it does not get an exact version number match. This implies that they didn't intend the mounting interface between the C client lib and the kernel to be a stable API...)
  2. Sleep and periodically check if the FS is mounted before returning from spawn_mount. Not appealing in that this is a bunch of extra time and IO, and there may be plenty of reasons to spawn_mount without caring if the FS is available immediately.

We could also submit a PR to osxfuse to get it to waitpid on the process. That will fix the issue for newer versions of the library after the fix (if it gets accepted) but won't provide a way to work around it for older versions of osxfuse. Oddly enough, osxfuse actually does a double fork before exec'ing the mounting program--seeming to force it not to wait for the mount program to return. The reason why is not clear to me.

Or we could punt, and just document in spawn_mount that the FS may not be ready immediately when it returns. It might be good to provide an is_mounted method on the BackgroundSession so that user code can do a sleep-and-check if it needs to.
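A sleep-and-check loop in user code might look like the sketch below. Neither function exists in fuse-rs; `looks_mounted` and `wait_until` are hypothetical names. The mount check uses a common Unix heuristic, treating a directory as a mount point when it sits on a different device than its parent. It misses cases such as bind mounts of the same filesystem, and calling stat on a FUSE mount point can itself block if the filesystem does not answer, so treat this as a starting point only.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

/// Heuristic mount check: a directory whose device ID differs from
/// that of its parent directory is assumed to be a mount point.
fn looks_mounted(path: &Path) -> io::Result<bool> {
    let here = fs::metadata(path)?;
    let parent = fs::metadata(path.join(".."))?;
    Ok(here.dev() != parent.dev())
}

/// Call `check` repeatedly, sleeping `interval` between attempts, until
/// it returns true or `timeout` has elapsed. Returns whether it succeeded.
fn wait_until<F: FnMut() -> bool>(mut check: F, timeout: Duration, interval: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}
```

A caller could then write something like `wait_until(|| looks_mounted(&mountpoint).unwrap_or(false), Duration::from_secs(5), Duration::from_millis(20))` after spawn_mount and treat a `false` result as a failed mount.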

reem commented 8 years ago

I've hit this too, and have worked around it in my code in github.com/terminalcloud/tfs using a 1-second sleep after starting the fs, but this is unreliable (every once in a while it takes >1s). Seems like @MicahChalmer correctly recorded our options moving forward if we want to close this issue.

jlpell commented 6 years ago

I still have the same problem on gentoo linux amd64. The regular mount command works as intended, but spawn_mount does not work, regardless of how much time I wait.