untitaker / rust-atomicwrites

Atomic file-writes.
https://docs.rs/crate/atomicwrites
MIT License

Making Linux directory fsync optional? #45

Open nyanpasu64 opened 3 years ago

nyanpasu64 commented 3 years ago

I think the API should make fsyncing a file's parent directory optional on Linux. Currently it's performed unconditionally there. Flushing the directory carries a speed penalty (though I don't know how much), and there is no analogue on Windows, where no API I've seen allows flushing a directory to disk.

I feel it's useful to support only fsyncing the temporary file but not the directory, and would use such an option in my atomic-writing code if available.

As an example of an API that supports enabling or disabling directory fsync, glib's atomic-write API makes directory fsync optional (link). They explain that it's necessary for durability (the write is guaranteed to be visible after a system crash) but not for consistency (no corrupted data is visible after a system crash).
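To make the request concrete, here is a rough sketch of what an optional directory fsync could look like using only the standard library. It is not how atomicwrites is implemented: `write_atomic`, `sync_directory`, and the `sync_dir` flag are hypothetical names, and the fixed `.tmp` suffix is a simplification that is not safe with concurrent writers.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper (not part of atomicwrites): write `data` to `path`
/// via a temporary file and a rename, with the directory fsync optional.
pub fn write_atomic(path: &Path, data: &[u8], sync_dir: bool) -> io::Result<()> {
    // Parent directory of the target; fall back to "." for bare file names.
    let dir = match path.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };
    let file_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;

    // Assumption: a fixed ".tmp" suffix in the same directory, so the rename
    // stays on one filesystem. A real implementation would use a unique name.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = dir.join(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(data)?;
    // File fsync: always done, so the renamed file never points at
    // half-written data after a crash.
    tmp.sync_all()?;
    drop(tmp);

    // The rename is what makes the new content appear atomically.
    fs::rename(&tmp_path, path)?;

    // Directory fsync: only needed to make the rename itself durable.
    if sync_dir {
        sync_directory(dir)?;
    }
    Ok(())
}

#[cfg(unix)]
fn sync_directory(dir: &Path) -> io::Result<()> {
    // On Unix a directory can be opened read-only and fsynced like a file.
    File::open(dir)?.sync_all()
}

#[cfg(not(unix))]
fn sync_directory(_dir: &Path) -> io::Result<()> {
    // No directory flush is attempted on other platforms in this sketch.
    Ok(())
}
```

Calling `write_atomic(path, data, false)` would keep the consistency guarantee (readers see the old or the new content) while skipping the extra durability step.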

sunshowers commented 2 years ago

@untitaker: I think one use case where disabling fsync makes sense is when you're writing out many files atomically at the same time: in that case you want to ensure other applications don't see torn state, but only issue fsyncs after a certain number of files are written out (or one giant one after all operations are complete). Does that make sense?

untitaker commented 2 years ago

@sunshowers If you don't issue fsync, applications can still see intermediate state.

nyanpasu64 commented 2 years ago

To my understanding, fsync doesn't affect what other apps see, but only file/data persistence after system crashes.

I assume that when writing multiple files, other apps can see them whether you fsync or not. I don't know if atomically writing multiple files is possible. I suppose renaming a directory is possible, but it's likely undesirable since it will fail or cause trouble if another program (e.g. a shell) has the directory open.

untitaker commented 2 years ago

yup! fsync is totally not required for consistency (nor is it useful to enforce consistency of multiple files by skipping fsync), but if you do not care about durability, i think this crate is not particularly useful and you might as well call rename yourself.

if you want to atomically make directories appear/disappear, renaming (or symlinking) them is fine in the same way it is for files (though you can't use this crate for that)
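For illustration, a minimal Unix-only sketch of the symlink variant: build the new symlink under a temporary name, then rename it over the old one so readers see either the old or the new target. `swap_symlink` and the `.tmp` suffix are hypothetical, and this is not something atomicwrites provides.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: atomically repoint the symlink at `link`
/// to `new_target` by creating a temporary symlink and renaming it.
#[cfg(unix)]
pub fn swap_symlink(link: &Path, new_target: &Path) -> io::Result<()> {
    // Assumption: "<link>.tmp" is free to use as a scratch name.
    let mut tmp = link.as_os_str().to_os_string();
    tmp.push(".tmp");
    let tmp_link = PathBuf::from(tmp);

    // Clear a stale temporary link left behind by an earlier failed run.
    match fs::remove_file(&tmp_link) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    std::os::unix::fs::symlink(new_target, &tmp_link)?;
    // rename replaces the existing symlink itself; it does not follow it.
    fs::rename(&tmp_link, link)
}
```

No fsync is issued here, so this only addresses what other processes observe, not durability after a crash.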

sorry for the slow response @sunshowers hope that still helps

sunshowers commented 2 years ago

What does "applications can see intermediate state" mean here? If you're doing, say, a fresh checkout of the Linux kernel, you write ~100k files to disk using atomic renames. Issuing an fsync after every single file gets written out seems quite wasteful (and durability isn't that important, since the state is in the repository anyway), so you'll instead want to batch up fsyncs. Would you say that atomicwrites is not meant to serve that use case? And if so, would it be OK to fork this crate?

untitaker commented 2 years ago

what i'm saying is that on a git checkout, other processes can observe a subset of files that are being checked out instead of all or none of them. i think i may have misunderstood your post and thought you were talking about this sort of torn state, since we're talking about directory fsync

if you take away the durability part of this crate, really all that's left is the tmpfile + rename. why not use tempfile directly for that? https://github.com/untitaker/rust-atomicwrites#alternatives

but yes in any case it's totally fine to fork

untitaker commented 2 years ago

oh nevermind, i just re-read the OP. you still probably want the file fsync. hmm. it probably makes sense to add this option then. patches welcome.

nyanpasu64 commented 2 years ago

FYI I might not submit a patch soon, I'm currently working on non-Rust projects.

sunshowers commented 2 years ago

> what i'm saying is that on a git checkout, other processes can observe a subset of files that are being checked out instead of all or none of them. i think i may have misunderstood your post and thought you were talking about this sort of torn state, since we're talking about directory fsync

Ah yeah, not the torn state of a subset of files being checked out (not possible to handle with unix really), but the torn state of a single file that's half-written out.

untitaker commented 2 days ago

I wonder what y'all would think about a batch API for atomic writes. fsync is necessary eventually, but maybe not after every file written in a folder. if 100 files are written in a folder, the fsync overhead could be amortized.

I'm mostly trying to think of ways to make this library as safe as possible for users. After all this time I feel that if you can trade speed for safety because you know what you're doing, you're better off using the syscalls directly.
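One possible shape for such a batch API, sketched with the standard library only: each file still gets its own fsync before the rename, but the directory is fsynced once for the whole batch. `write_batch` and `sync_directory` are hypothetical names and nothing here exists in atomicwrites; the fixed `.tmp` suffix is again a simplification.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical batch helper: atomically write several files into `dir`,
/// then issue a single directory fsync covering all of the renames.
pub fn write_batch(dir: &Path, files: &[(&str, &[u8])]) -> io::Result<()> {
    for &(name, data) in files {
        // Assumption: "<name>.tmp" is free to use as a scratch name.
        let tmp_path = dir.join(format!("{}.tmp", name));
        let mut tmp = File::create(&tmp_path)?;
        tmp.write_all(data)?;
        // Per-file fsync so no renamed file refers to half-written data.
        tmp.sync_all()?;
        drop(tmp);
        fs::rename(&tmp_path, dir.join(name))?;
    }
    // One directory fsync amortized over the whole batch.
    sync_directory(dir)
}

#[cfg(unix)]
fn sync_directory(dir: &Path) -> io::Result<()> {
    File::open(dir)?.sync_all()
}

#[cfg(not(unix))]
fn sync_directory(_dir: &Path) -> io::Result<()> {
    // No directory flush is attempted on other platforms in this sketch.
    Ok(())
}
```

If the process crashes mid-batch, some renames may be durable and others not, so callers would still need to tolerate a partially applied batch.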