jacereda / fsatrace

Filesystem access tracer
ISC License

Tracing mkdir syscalls #48

Open erickt opened 2 years ago

erickt commented 2 years ago

fsatrace does not appear to trace mkdir calls, which can produce unexpected traces when working with temporary directories. Consider the Rust library tempfile. Here's a simple case where we create a temporary directory, then a subdirectory inside it that contains a single file:

fn main() {
    let tmp = tempfile::TempDir::new().unwrap();
    let dir = tmp.path().join("dir");
    std::fs::create_dir(&dir).unwrap();
    std::fs::write(dir.join("hello"), b"hello").unwrap();
}

This produces the following trace:

r|/the-binary
w|/tmp/.tmp5JZCc9/dir/hello
r|/tmp/.tmp5JZCc9
r|/tmp/.tmp5JZCc9/dir
d|/tmp/.tmp5JZCc9/dir/hello
d|/tmp/.tmp5JZCc9/dir
d|/tmp/.tmp5JZCc9

We're not observing the creation of /tmp/.tmp5JZCc9/dir, so when tempfile starts recursively deleting the temporary directory, the read of /tmp/.tmp5JZCc9/dir appears to be an access to a path the program did not create, even though the program created and cleaned up everything it accessed.

To handle this, users of fsatrace could try to infer that these traces correspond to a directory tree by looking for successful accesses to subdirectories, but that won't work if we just create a directory and don't try to use it. For example, if we modify the previous code to remove the file write:

fn main() {
    let tmp = tempfile::TempDir::new().unwrap();
    let dir = tmp.path().join("dir");
    std::fs::create_dir(&dir).unwrap();
}

We end up with this trace, which appears to access a directory that wasn't created by the program:

r|/the-binary
r|/tmp/.tmpCSPGtB
r|/tmp/.tmpCSPGtB/dir
d|/tmp/.tmpCSPGtB/dir
d|/tmp/.tmpCSPGtB
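
To make the limitation concrete, here is a rough sketch of the inference heuristic described above. It is only an illustration: the function name `unaccounted_paths` is hypothetical, and it assumes the trace is a sequence of `op|path` lines like the ones shown in this issue. It treats every written path, and each of its ancestors, as part of a tree the program created, then reports reads and deletes that fall outside any such tree.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Hypothetical helper: returns paths that are read (`r`) or deleted (`d`)
/// but that the heuristic cannot attribute to the traced program.
fn unaccounted_paths(trace: &str) -> Vec<String> {
    // Parse "op|path" lines, skipping anything without a '|'.
    let events: Vec<(&str, &str)> = trace
        .lines()
        .filter_map(|line| line.trim().split_once('|'))
        .collect();

    // Heuristic: a written path and all of its ancestors are assumed to
    // belong to a directory tree the program created.
    let mut accounted: HashSet<&Path> = HashSet::new();
    for &(op, path) in &events {
        if op == "w" {
            accounted.extend(Path::new(path).ancestors());
        }
    }

    // Report each read or deleted path outside those trees once,
    // in order of first appearance.
    let mut seen: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for &(op, path) in &events {
        if (op == "r" || op == "d")
            && !accounted.contains(Path::new(path))
            && seen.insert(path)
        {
            out.push(path.to_string());
        }
    }
    out
}
```

Fed the first trace, this reports only /the-binary, because the write to dir/hello vouches for both directories. Fed the second trace, it also reports /tmp/.tmpCSPGtB and /tmp/.tmpCSPGtB/dir, since no write event exists to vouch for them.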

I'd imagine that tracing the mkdir syscalls would add a w|/tmp/.tmpCSPGtB/dir event, which would allow us to infer that the program itself created every directory and file it later read or deleted.
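
As a sketch of what a consumer could do if such events existed, the check below replays a trace in order and flags any path that is read or deleted without an earlier `w` event for that exact path. The function name `touched_before_written` is made up for illustration. The `w` events for directories are the hypothetical ones proposed here, not something fsatrace emits today, and where they would appear in the trace is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical helper: replays a trace in order and returns paths that are
/// read (`r`) or deleted (`d`) without an earlier `w` event for the same path.
fn touched_before_written(trace: &str) -> Vec<String> {
    let mut written: HashSet<&str> = HashSet::new();
    let mut flagged: Vec<String> = Vec::new();
    for line in trace.lines() {
        // Each event is assumed to look like "op|path"; skip other lines.
        let Some((op, path)) = line.trim().split_once('|') else {
            continue;
        };
        match op {
            "w" => {
                written.insert(path);
            }
            "r" | "d" => {
                // Flag each offending path once, in order of first appearance.
                if !written.contains(path) && !flagged.iter().any(|p| p.as_str() == path) {
                    flagged.push(path.to_string());
                }
            }
            _ => {} // other event kinds are ignored in this sketch
        }
    }
    flagged
}
```

With the trace as fsatrace produces it today, this flags /the-binary, /tmp/.tmpCSPGtB and /tmp/.tmpCSPGtB/dir. If hypothetical w|/tmp/.tmpCSPGtB and w|/tmp/.tmpCSPGtB/dir events preceded the reads, only /the-binary would remain, which is a genuine input to the program.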