Stebalien / tempfile

Temporary file library for rust
http://stebalien.com/projects/tempfile-rs
Apache License 2.0
1.2k stars 120 forks source link

feat: Add `Builder::permissions()` method. #273

Closed Byron closed 9 months ago

Byron commented 9 months ago

This PR allows the user to control the file mode set for newly created NameTempfile instances.

This is paramount for when tempfiles are used as scratch space that will be persisted and renamed into the final destination once all writes are completed. This is an effective way to prevent half-written files on disk and to assure that readers can only observe the final result of a change.

This paradigm is implemented in the gix-lock crate which is used by the gix-ref crate for editing Git references. Currently these are created with 0o600 permissions, which can cause issues when using sudo in downstream applications.

When this PR is merged, gix-ref can be adjusted to allow setting the mode just like Git does.

Review Notes

I did my best to adapt the least amount of code while staying true to the current organisation. Initially I named the new method mode, but realized that std::fs::Permissions exists which would allow the API to be more general, despite only one available platform for implementation.

Let me explain the commits in some detail and why it's one more than needed:

Please let me know which way you prefer as its also a trade-off.

NobodyXu commented 9 months ago

@Byron There's a better way to create tempfile that persists after writing is done on Linux using O_TMPFILE.

While it only supports ext2, ext3, ext4, UDF, Minix, tmpfs XFS (Linux 3.15), Btrfs (Linux 3.16), F2FS (Linux 3.16), ubifs (Linux 4.9), and needs either CAP_DAC_READ_SEARCH or a procfs to be mounted, it's still a pretty good solution.

It prevents any other process from manipulating the file (except through procfs) until the caller process marked it as ready, and it removed the file on process exit even if the rust destructor is not called (e.g. panic during unwinding, calling exit directly)

O_TMPFILE (since Linux 3.11)
              Create an unnamed temporary regular file.  The pathname
              argument specifies a directory; an unnamed inode will be
              created in that directory's filesystem.  Anything written
              to the resulting file will be lost when the last file
              descriptor is closed, unless the file is given a name.

              O_TMPFILE must be specified with one of O_RDWR or O_WRONLY
              and, optionally, O_EXCL.  If O_EXCL is not specified, then
              linkat(2) can be used to link the temporary file into the
              filesystem, making it permanent, using code like the
              following:

                  char path[PATH_MAX];
                  fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
                                          S_IRUSR | S_IWUSR);

                  /* File I/O on 'fd'... */

                  linkat(fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH);

                  /* If the caller doesn't have the CAP_DAC_READ_SEARCH
                     capability (needed to use AT_EMPTY_PATH with linkat(2)),
                     and there is a proc(5) filesystem mounted, then the
                     linkat(2) call above can be replaced with:

                  snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
                  linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
                                          AT_SYMLINK_FOLLOW);
                  */

              In this case, the open() mode argument determines the file
              permission mode, as with O_CREAT.

              Specifying O_EXCL in conjunction with O_TMPFILE prevents a
              temporary file from being linked into the filesystem in
              the above manner.  (Note that the meaning of O_EXCL in
              this case is different from the meaning of O_EXCL
              otherwise.)

              There are two main use cases for O_TMPFILE:

              •  Improved tmpfile(3) functionality: race-free creation
                 of temporary files that (1) are automatically deleted
                 when closed; (2) can never be reached via any pathname;
                 (3) are not subject to symlink attacks; and (4) do not
                 require the caller to devise unique names.

              •  Creating a file that is initially invisible, which is
                 then populated with data and adjusted to have
                 appropriate filesystem attributes (fchown(2),
                 fchmod(2), fsetxattr(2), etc.)  before being atomically
                 linked into the filesystem in a fully formed state
                 (using linkat(2) as described above).

              O_TMPFILE requires support by the underlying filesystem;
              only a subset of Linux filesystems provide that support.
              In the initial implementation, support was provided in the
              ext2, ext3, ext4, UDF, Minix, and tmpfs filesystems.
              Support for other filesystems has subsequently been added
              as follows: XFS (Linux 3.15); Btrfs (Linux 3.16); F2FS
              (Linux 3.16); and ubifs (Linux 4.9)
Stebalien commented 9 months ago

@NobodyXu I agree if you're only target is Linux, but it's still useful to have a platform-independent way to do this (to, e.g., support MacOS which hasn't seen any filesystem API improvements in, basically ever).

Stebalien commented 9 months ago

Thanks! And yes, I prefer the final version (more invasive approach).

Byron commented 9 months ago

Thanks for the quick review! If CI passes I have addressed the windows portion of it, but will definitely need some guidance to deal with the directory. If you have ideas and want to tinker, please feel free to push into this branch - sometimes that's easiest.

@NobodyXu Thanks for sharing! It's definitely something to consider for the time when performance needs to be optimised further. Personally I am most interested in doing what Git does at first, which seems to be pretty much what's happening here.

Byron commented 9 months ago

I have added mode-support for directories thanks to your hint with rustix::fs::mkdir and it appears to work. Further, trying to change permissions on unsupported platforms will result in an error that as well. When CI is green it's ready for review, with the remark that I kept dir.rs nimble, with a structure sufficient for the task even though it's not quite the same as in file.

Stebalien commented 9 months ago

So... somehow I missed this but: https://doc.rust-lang.org/std/fs/struct.DirBuilder.html#method.mode. We don't need to use rustix here.

Byron commented 9 months ago

So... somehow I missed this but: https://doc.rust-lang.org/std/fs/struct.DirBuilder.html#method.mode. We don't need to use rustix here.

Oh my, I missed that too! I clearly was biased when researching this topic and "didn't believe in mode being available for directories" 🤦‍♂️. Lesson learned, and the code was adjusted to something much better.

I made a note that recommends to not recursively create the directory by default but leave that to a separate PR along with a new flag in the builder maybe to control this, to keep the previous behaviour exactly the same. Right now, this still is the case.

If CI is green, this PR should be ready for re-review.

Stebalien commented 9 months ago

And with those changes, this should be good to go. Thanks!

Byron commented 9 months ago

Thanks a lot for the QA and sorry for the sloppiness, at least 2 of 3 I should have caught. But here we are, and I think that's the version that is ready to go.

PS: A new release with these changes would help, as that way everything can trickle downstream to StackedGit. Thanks again.

Stebalien commented 9 months ago

Thanks a lot for the QA and sorry for the sloppiness, at least 2 of 3 I should have caught.

Don't be. That'll just make me feel worse about my PRs and that's why we have review 😆.

PS: A new release with these changes would help, as that way everything can trickle downstream to StackedGit. Thanks again.

Will do.