Closed Byron closed 9 months ago
@Byron There's a better way to create tempfile that persists after writing is done on Linux using O_TMPFILE
.
While it only supports ext2, ext3, ext4, UDF, Minix, tmpfs
XFS (Linux 3.15), Btrfs (Linux 3.16), F2FS (Linux 3.16), ubifs (Linux 4.9), and needs either CAP_DAC_READ_SEARCH
or a procfs to be mounted, it's still a pretty good solution.
It prevents any other process from manipulating the file (except through procfs) until the caller process marked it as ready, and it removed the file on process exit even if the rust destructor is not called (e.g. panic during unwinding, calling exit
directly)
O_TMPFILE (since Linux 3.11)
Create an unnamed temporary regular file. The pathname
argument specifies a directory; an unnamed inode will be
created in that directory's filesystem. Anything written
to the resulting file will be lost when the last file
descriptor is closed, unless the file is given a name.
O_TMPFILE must be specified with one of O_RDWR or O_WRONLY
and, optionally, O_EXCL. If O_EXCL is not specified, then
linkat(2) can be used to link the temporary file into the
filesystem, making it permanent, using code like the
following:
char path[PATH_MAX];
fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
S_IRUSR | S_IWUSR);
/* File I/O on 'fd'... */
linkat(fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH);
/* If the caller doesn't have the CAP_DAC_READ_SEARCH
capability (needed to use AT_EMPTY_PATH with linkat(2)),
and there is a proc(5) filesystem mounted, then the
linkat(2) call above can be replaced with:
snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
AT_SYMLINK_FOLLOW);
*/
In this case, the open() mode argument determines the file
permission mode, as with O_CREAT.
Specifying O_EXCL in conjunction with O_TMPFILE prevents a
temporary file from being linked into the filesystem in
the above manner. (Note that the meaning of O_EXCL in
this case is different from the meaning of O_EXCL
otherwise.)
There are two main use cases for O_TMPFILE:
• Improved tmpfile(3) functionality: race-free creation
of temporary files that (1) are automatically deleted
when closed; (2) can never be reached via any pathname;
(3) are not subject to symlink attacks; and (4) do not
require the caller to devise unique names.
• Creating a file that is initially invisible, which is
then populated with data and adjusted to have
appropriate filesystem attributes (fchown(2),
fchmod(2), fsetxattr(2), etc.) before being atomically
linked into the filesystem in a fully formed state
(using linkat(2) as described above).
O_TMPFILE requires support by the underlying filesystem;
only a subset of Linux filesystems provide that support.
In the initial implementation, support was provided in the
ext2, ext3, ext4, UDF, Minix, and tmpfs filesystems.
Support for other filesystems has subsequently been added
as follows: XFS (Linux 3.15); Btrfs (Linux 3.16); F2FS
(Linux 3.16); and ubifs (Linux 4.9)
@NobodyXu I agree if you're only target is Linux, but it's still useful to have a platform-independent way to do this (to, e.g., support MacOS which hasn't seen any filesystem API improvements in, basically ever).
Thanks! And yes, I prefer the final version (more invasive approach).
Thanks for the quick review! If CI passes I have addressed the windows portion of it, but will definitely need some guidance to deal with the directory. If you have ideas and want to tinker, please feel free to push into this branch - sometimes that's easiest.
@NobodyXu Thanks for sharing! It's definitely something to consider for the time when performance needs to be optimised further. Personally I am most interested in doing what Git does at first, which seems to be pretty much what's happening here.
I have added mode-support for directories thanks to your hint with rustix::fs::mkdir
and it appears to work. Further, trying to change permissions on unsupported platforms will result in an error that as well.
When CI is green it's ready for review, with the remark that I kept dir.rs
nimble, with a structure sufficient for the task even though it's not quite the same as in file
.
So... somehow I missed this but: https://doc.rust-lang.org/std/fs/struct.DirBuilder.html#method.mode. We don't need to use rustix here.
So... somehow I missed this but: https://doc.rust-lang.org/std/fs/struct.DirBuilder.html#method.mode. We don't need to use rustix here.
Oh my, I missed that too! I clearly was biased when researching this topic and "didn't believe in mode being available for directories" 🤦♂️. Lesson learned, and the code was adjusted to something much better.
I made a note that recommends to not recursively create the directory by default but leave that to a separate PR along with a new flag in the builder maybe to control this, to keep the previous behaviour exactly the same. Right now, this still is the case.
If CI is green, this PR should be ready for re-review.
And with those changes, this should be good to go. Thanks!
Thanks a lot for the QA and sorry for the sloppiness, at least 2 of 3 I should have caught. But here we are, and I think that's the version that is ready to go.
PS: A new release with these changes would help, as that way everything can trickle downstream to StackedGit. Thanks again.
Thanks a lot for the QA and sorry for the sloppiness, at least 2 of 3 I should have caught.
Don't be. That'll just make me feel worse about my PRs and that's why we have review 😆.
PS: A new release with these changes would help, as that way everything can trickle downstream to StackedGit. Thanks again.
Will do.
This PR allows the user to control the file mode set for newly created
NameTempfile
instances.This is paramount for when tempfiles are used as scratch space that will be persisted and renamed into the final destination once all writes are completed. This is an effective way to prevent half-written files on disk and to assure that readers can only observe the final result of a change.
This paradigm is implemented in the
gix-lock
crate which is used by thegix-ref
crate for editing Git references. Currently these are created with0o600
permissions, which can cause issues when usingsudo
in downstream applications.When this PR is merged,
gix-ref
can be adjusted to allow setting the mode just like Git does.Review Notes
I did my best to adapt the least amount of code while staying true to the current organisation. Initially I named the new method
mode
, but realized thatstd::fs::Permissions
exists which would allow the API to be more general, despite only one available platform for implementation.Let me explain the commits in some detail and why it's one more than needed:
Builder
method without an implementation. CI is red to show it's not working.imp
. CI is green.permissions
can be passed down to the platforms, which is more invasive, yet cleaner. CI is also green.Please let me know which way you prefer as its also a trade-off.