bazelbuild / sandboxfs

A virtual file system for sandboxing
Apache License 2.0
372 stars 39 forks source link

Directory becomes temporarily inaccessible after it has been moved #92

Closed tkuosmanen closed 4 years ago

tkuosmanen commented 4 years ago

When moving directories inside a sandbox, it takes a few seconds for the directories to become accessible. Directories are listed, but ls cannot stat() the directory entries.

For reproducing the issue, Bash auto completion has to be used when typing the directory names. By copy-pasting the commands as is, the directory that gets moved is immediately accessible.

Terminal 1:

$ mkdir -p ./out ./new_sandbox
$ sandboxfs --version
sandboxfs 0.1.1
$ sandboxfs --mapping=ro:/:/ --mapping=rw:/out:$PWD/out ./new_sandbox
...

Terminal 2:

$ cd new_sandbox/out/
$ mkdir target to_be_moved
$ mv to_be_moved/ target/  # <- Use TAB here for auto completion
$ while sleep 1; do date && ls -l target; done
Wed Feb  5 13:26:07 EET 2020
ls: cannot access 'target/to_be_moved': No such file or directory
total 0
d????????? ? ? ? ?            ? to_be_moved
Wed Feb  5 13:26:08 EET 2020
ls: cannot access 'target/to_be_moved': No such file or directory
total 0
d????????? ? ? ? ?            ? to_be_moved

...

Wed Feb  5 13:26:52 EET 2020
total 0
drwxrwxr-x 2 tkuosmanen tkuosmanen 2 Feb  5 13:25 to_be_moved

In the example above, it took roughly 45 seconds for the directory to become accessible.

This was run on Ubuntu 16.04 workstation with 4.4.0-133-generic kernel and ext4 root file system. Sandboxfs is pre-built release 0.1.1 from sandboxfs-0.1.1-20191024-linux-x86_64.tgz.

The issue has been reproduced also on Ubuntu 18.04 workstation with 4.15.0-72-generic kernel.

jmmv commented 4 years ago

Huh! Funny that you report this issue just today.

I've spent all day troubleshooting another problem I encountered and found a pretty serious bug in readdir (where we recreate nodes for directories every time). I suspect the fix to that, which I'm testing right now, will also resolve this.

jmmv commented 4 years ago

Alright. The problem doesn't reproduce on macOS as you describe it because of, I believe, differences in OSXFUSE. But I could reproduce it on Linux; thanks for the report.

This is, indeed, the same problem I found earlier today, which manifested in a different manner but also involved a rename along with reading the directory contents (what you triggered with tab completion). The problem was that because we don't cache directory nodes, readdir was incorrectly regenerating new nodes for subdirectories every time (but was doing the correct thing for files).

I've just pushed a fix to Travis for testing and will merge when tests pass.

tkuosmanen commented 4 years ago

Thanks for your extremely quick response to the issue!

I built the latest top of tree (855834e) and it fixes the issue described in my initial posting.

However, there seems to be a new issue with files located in directories that get moved. Below is an error case that I can reliably reproduce with my own build of 855834e:

Terminal 1:

mkdir -p ./out ./new_sandbox
sandboxfs --mapping=ro:/:/ --mapping=rw:/out:$PWD/out ./new_sandbox
...

Terminal 2:

$ cd new_sandbox/out/
$ mkdir target to_be_moved
$ echo dummy > to_be_moved/dummy.txt
$ mv to_be_moved/ target/
$ echo "contents of target:" && ls -l target && echo "contents of target/to_be_moved:" && ls -l target/to_be_moved
contents of target:
total 0
drwxrwxr-x 2 tkuosmanen tkuosmanen 2 Feb  7 13:11 to_be_moved
contents of target/to_be_moved:
ls: cannot access 'target/to_be_moved/dummy.txt': No such file or directory
total 0
-????????? ? ? ? ?            ? dummy.txt

There's no need to use TAB auto-completion to trigger the issue. In addition, target/to_be_moved/dummy.txt is stuck in a bad state. Waiting for a minute or two does not help in this case.

jmmv commented 4 years ago

Would you mind filing a separate bug for this problem please? This issue is now closed and referenced in the commit/changelog already so it'd be confusing to talk about its number for a different problem.

I think I know what's happening and it's not the same root cause. (During a directory rename, we update the inode cache for the directory itself... but not for any of its contents.) Wow, I'm surprised these obvious problems haven't shown up until now, as I've been (ab)using sandboxfs quite a bit with Bazel builds or with pkg_comp builds. But I think OSXFUSE and Linux's FUSE behave quite differently in these cases, and I have done very little testing on the latter.

tkuosmanen commented 4 years ago

Sure thing. Filed https://github.com/bazelbuild/sandboxfs/issues/94.