rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
95.1k stars 12.26k forks source link

Tracking Issue for RFC 3128: I/O Safety #87074

Closed joshtriplett closed 2 years ago

joshtriplett commented 3 years ago

Feature gate: #![feature(io_safety)]

This is a tracking issue for RFC 3128: I/O Safety.

Raw OS handles such as RawFd and RawHandle have hazards similar to raw pointers; they may be bogus or may dangle, leading to broken encapsulation boundaries and code whose behavior is impossible to bound in general.

Introduce a concept of I/O safety, and introduce a new set of types and traits, led by OwnedFd and BorrowedFd, to support it.

Public API

The public API on UNIX platforms consists of the types OwnedFd and BorrowedFd, the trait AsFd, and implementations of AsFd, Into<OwnedFd>, and From<OwnedFd> for various types (such as files and sockets).

The public API on Windows platforms consists of two sets of parallel types and traits and impls for OwnedHandle, OwnedSocket, BorrowedHandle, BorrowedSocket, etc.

Steps / History

Unresolved Questions

sunfishcode commented 2 years ago

I've now posted an initial PR implementing the proposal in the RFC: https://github.com/rust-lang/rust/pull/87329

sunfishcode commented 2 years ago

The implementation PR is now merged!

sunfishcode commented 2 years ago

Examples of using the I/O safety types and traits in Rust Nightly in real-world codebases, with all tests passing:

jstarks commented 2 years ago

Hi, I'm on the Windows team at Microsoft. Perhaps this deserves a separate issue, but posting here for now...

OwnedHandle claims that INVALID_HANDLE_VALUE is a valid value:

Note that it may have the value INVALID_HANDLE_VALUE (-1), which is sometimes a valid handle value. See [here] for the full story. For APIs like CreateFileW which report errors with INVALID_HANDLE_VALUE instead of null, use [HandleOrInvalid] instead of Option<OwnedHandle>.

But this doesn't make sense for OwnedHandle. It's true that INVALID_HANDLE_VALUE can sometimes act as a valid handle for inputs. Specifically, the same value, -1, is used for the current process pseudo handle, which you can pass to functions such as DuplicateHandle and (less usefully) WaitForSingleObject. But it's never the case that INVALID_HANDLE_VALUE is a valid owned handle--you can never legally call CloseHandle on it.

In fact, in user mode (which is all we care about for this target), the only valid owned handle values are > 0--negative handle values are reserved for kernel handles (which are inaccessible to user mode) and pseudo handles, neither of which can be closed.

So the documentation should be updated. But I also think the use of OwnedHandle will be less error prone if you assert (in from_raw_handle and similar) that the value is > 0 instead of != null, so that users do not accidentally store INVALID_HANDLE_VALUE or a pseudo handle in it.

This also gives you a much bigger niche if you choose to use it. It would be reasonable to stick with NonNull for now and optimize as a second step, even after stabilization. But you can't strengthen the assertions after stabilization.

BorrowedHandle, OTOH, is correct, since it would be reasonable to store a pseudo handle in a BorrowedHandle<'static>.

jstarks commented 2 years ago

Related question: does HandleOrInvalid belong in std? I like the type--it's a clever approach for dealing with CreateFileW and related APIs that return INVALID_HANDLE_VALUE on error. But I'm not sure that it really needs to be in std, since it really only needs to be used locally at FFI edges before converting to OwnedHandle. I would be very surprised to see it in a function signature in any public Rust API.

Just my 2c.

sunfishcode commented 2 years ago

That makes sense, however there's another dimension: it's a somewhat established convention to use ManuallyDrop<File> and similar to construct temporary views to otherwise borrowed handles—with some care, the idiom can be done reasonably safely. In those situations, an OwnedHandle can hold a pseudo handle.

One thing we could easily do though would be to assert!(handle > 0) just before libstd's call to CloseHandle. Does that sound like a worthwhile change?

While you're here :smile: would you be able to authoritatively state that NULL is never returned, as an error or as a valid handle, from functions that use INVALID_HANDLE_VALUE? The Windows documentation isn't clear on this point.

Related question: does HandleOrInvalid belong in std? I like the type--it's a clever approach for dealing with CreateFileW and related APIs that return INVALID_HANDLE_VALUE on error. But I'm not sure that it really needs to be in std, since it really only needs to be used locally at FFI edges before converting to OwnedHandle. I would be very surprised to see it in a function signature in any public Rust API.

Indeed, HandleOrInvalid is limited to that exact use case. It's of course subjective whether HandleOrInvalid should be in std; I favor including it because on the Unix side, OwnedFd and BorrowedFd support this kind of FFI use case, and HandleOrInvalid allows std to provide a roughly similar level of functionality for Windows.

jstarks commented 2 years ago

That makes sense, however there's another dimension: it's a somewhat established convention to use ManuallyDrop and similar to construct temporary views to otherwise borrowed handles—with some care, the idiom can be done reasonably safely. In those situations, an OwnedHandle can hold a pseudo handle.

Ah, interesting. In the case of File, there is (currently?) no pseudo handle that would be usable with any file API. But that could change in a future Windows release, I suppose. And this pattern might get used for other handle types in the future, I guess.

So that rules out the expanded niche. You could still assert, and require transmute instead of from_raw_handle for pseudo handles. But perhaps it's not worth it, since the pattern is confusing enough as it is.

On that note, I wonder if as part of the IO safety work, a new type BorrowedFile<'a> that implements Deref<File> might be interesting. It could wrap the existing ManuallyDrop pattern and be constructable from BorrowedFd<'a>. I guess this would be like SockRef, but for File.

One thing we could easily do though would be to assert!(handle > 0) just before libstd's call to CloseHandle. Does that sound like a worthwhile change?

If possible, a better change would be to assert on CloseHandle's success, I think. In general, failure to close a handle may be the result of a double close, which depending on race conditions could lead to applying the wrong operation to the wrong handle, which could definitely be a memory safety issue. Better to catch such issues as soon as possible.

Of course, it's possible there are Rust programs out there that have benign versions of this bug, in which case this might be a regression for them.

While you're here 😄 would you be able to authoritatively state that NULL is never returned, as an error or as a valid handle, from functions that use INVALID_HANDLE_VALUE? The Windows documentation isn't clear on this point.

NULL is never a valid handle. The mainstream APIs that are documented to return INVALID_HANDLE_VALUE on error (e.g. CreateFileW) will never return NULL. The Win32 API surface is huge, though, so it's certainly possible that there's some obscure Win32 API out that returns INVALID_HANDLE_VALUE in some error conditions and NULL in others (if so, this would likely be due to a ~bug~ feature). I would not suggest optimizing for this case, though.

(Of course, here I'm only referring to handles that are actually backed by kernel handles, i.e. handles that you would pass to CloseHandle).

sunfishcode commented 2 years ago

On that note, I wonder if as part of the IO safety work, a new type BorrowedFile<'a> that implements Deref<File> might be interesting. It could wrap the existing ManuallyDrop pattern and be constructable from BorrowedFd<'a>. I guess this would be like SockRef, but for File.

The io-lifetimes crate has something like this here. The feedback from the RFC was to move just the basics into std for now.

If possible, a better change would be to assert on CloseHandle's success, I think. In general, failure to close a handle may be the result of a double close, which depending on race conditions could lead to applying the wrong operation to the wrong handle, which could definitely be a memory safety issue. Better to catch such issues as soon as possible.

Of course, it's possible there are Rust programs out there that have benign versions of this bug, in which case this might be a regression for them.

That's a good point, and Rust has had this behavior since the beginning, so it may not be simple to add these asserts.

sunfishcode commented 2 years ago

io-lifetimes 0.3.0 has an option to use OwnedFd/BorrowedFd/AsFd (and similar on Windows) from Rust Nightly instead of its own.

In addition to the testing above, I've now tested the io_safety feature in two more real-world codebases, on Windows and Linux, and the testsuites pass:

(For cap-std, I excluded cap-async-std, because Rust's orphan rule prevents io-lifetimes from providing implementations for async-std's types for traits that in this mode are defined in std.)

SoniEx2 commented 2 years ago

How badly does this RFC break with /dev/fd/ being fully safe?

sunfishcode commented 2 years ago

It doesn't break the RFC; see the discussion of /proc/self/fd in the RFC, which also applies to /dev/fd, as well as the discussion on internals.rust-lang.org.

sunfishcode commented 2 years ago

There is a report of real-world code breakage here.

sunfishcode commented 2 years ago

I've now filed https://github.com/rust-lang/rust/issues/89955 to track the breakage issue.

sunfishcode commented 2 years ago

Who's tried out the I/O safety APIs? Either in nightly here, or in the io-lifetimes crate. If anyone has tried them out, it'd be great to hear from you!

yoshuawuyts commented 2 years ago

@sunfishcode if you're potentially looking for more projects to test the io-lifetimes design on, perhaps the miow crate could be interesting. There are two types of pipes, and an iocp instance which expose AsRawHandle that could probably be rewritten to use io lifetimes.

Another interesting one to potentially look at porting is async-io; it makes heavy use of raw io references in it's API calls, and the Async type might make for a good stress test.

sunfishcode commented 2 years ago

Thanks for the suggestions! My current expectation is that I've ported enough code to io-lifetimes to be reasonably confident that it works as I expect, including cap-async-std and a tide demo that uses it. I'm just curious if anyone else has tried it.

mllken commented 2 years ago

I sometimes use std::process::Command's stdin() and stdout() functions to pass a single descriptor to a child process's stdin and stdout, such as how an inetd-like program would. With the old API, I could (unsafely) pass a single RawFd into .stdin() and .stdout() using Stdio::from. Is there a way to safely do this with the new API? I suppose I could create two OwnedFds from a single descriptor via a clone and pass each into .stdin() and .stdout(), but that would cost an extra descriptor.

sunfishcode commented 2 years ago

Unfortunately there isn't a safe way to do that yet, without duplicating the file descriptor, as you say. It's a good idea, and something we should look into once the fundamentals here are in place.

ijackson commented 2 years ago

I sometimes use std::process::Command's stdin() and stdout() functions to pass a single descriptor to a child process's stdin and stdout, such as how an inetd-like program would. With the old API, I could (unsafely) pass a single RawFd into .stdin() and .stdout() using Stdio::from. Is there a way to safely do this with the new API?

I think for this to work Command::stdin() ought to let you pass a BorrowedFd. But Command doesn't have a lifetime. This is a real problem with the API structure: in order to dup the fd after fork, the fd must live until fork.

So Command actually does borrow the fds you feed to stdin() etc. But because Command is 'static, all the fds you feed it must be 'static too. Which is why you can pass stdin/out/err (albeit via an API special-case), but only owned forms of your own descriptors.

I think the only solution is to add a lifetime to Command, or to add some kind of special-case set of bodge calls lets you apply one owned object for several child fds. And even the latter is no good if you wanted to hold onto the fd yourself too for some reason.

Adding a lifetime to Command couldn't be done without replacing it with a new type with a new name. But there are other reasons we might want to do that. For example, Command::output is a footgun (#73126) and ought to be reworked but it has the best name. There are other balkinesses and strangenesses.

On the other hand, doing the pre-fork dup wastes O(1) descriptors, and O(1) calls to dup during child process creation. The costs of that are likely to be lost in the noise.

ijackson commented 2 years ago

In other news: what is the status of an OwnedFd containing 0, 1 or 2:

a. Wrong? In which case we should panic in the constructor when this happens. Unfortunately this means that the whole of std cannot use OwnedFd because people have been playing funky tricks with ManuallyDrop<File> for stdin, and that's baked into programs and de-facto part of the API; see also #30490 where the reporter passed FromRawFd(0) to Command::stdin() and (IMO unfortunately) the MR to make this "work" was accepted.

b. Treated as BorrowedFd,. In this case every user will have to do something special with stdio descriptors (like std::process does in fact do). Really, we want to avoid more of this.

I'm not sure what the right answer is. This is kind of in my way in #88561 but I am going to try to sidestep it.

sunfishcode commented 2 years ago

In other news: what is the status of an OwnedFd containing 0, 1 or 2:

My current interpretation of OwnedFd of 0/1/2 is that it's unsafe and tricky to use, but valid.

It needs FromRawFd::from_raw_fd or equivalent to construct, which is unsafe. As with any raw file descriptor, using 0/1/2 implies knowing that they are open, and, if the OwnedFd is ever dropped, knowing that it's ok to close them. In effect, 0/1/2 aren't special for OwnedFd; the only thing that makes them special are the fact that they're guaranteed to be open when the program starts, and that std::io::stdin/stdout/stderr, Command::stdin/stdout/stderr, and similar, are hard-wired to use these numbers.

In #30490, Command::stderr takes ownership when you pass it FromRawFd::from_raw_fd(1), and it closes 1 in the parent when the Command is dropped. That's arguably surprising and error-prone, but it took an unsafe block to call FromRawFd::from_raw_fd in the first place, so the programmer is assumed to be responsible for the situation. In all, I think the basic concepts work out here without special-casing 0/1/2.

For #88561, one idea would be to add a new dup function to Stdio which takes a AsFd argument and immediately does a dup on it, so that Command holds an independently-owned file descriptor without claiming ownership of the argument. That would be a pretty simple approach that would let you achieve the intended results, though it would be more syscalls than are theoretically needed.

Another idea would be to add new ParentStdin/ParentStdout/ParentStderr fields to the Stdio enum, and add new functions to the public Stdio API for them. That's special-casing, though it'd be scoped to the Command API.

I also agree that there are interesting possibilities if we designed a new Command API.

ChrisDenton commented 2 years ago

in the above linked issue I argue that, on Windows, AsHandle should not be directly implemented for stdio types because getting the handle can fail. I'd welcome any feedback.

sunfishcode commented 2 years ago

Here's a quick status of the I/O safety feature. There are several major changes currently outstanding before we can consider stabilization.

I just filed #93869, which is a major change to the AsFd trait. This will likely take a while, as I've only recently discovered that we can do what this PR does, and there are several subtle implications. Subsumed by #93888, which has now landed.

93263 is my proposal to address #90964, which changes Rust's behavior on Windows in certain situations. It's not uniformly ideal, but I believe it produces a reasonable API. The PR is currently waiting for an FCP.

93354 is my proposal to address #88564. It's unfortunate that the blanket ToOwned for Clone means that every type that implements Clone gets a to_owned(), but I think that's less awkward than having BorrowedFd not implement Copy.

93562 is my proposal to address #72175. I believe this is consistent with the spirit of the I/O safety RFC, so hopefully this should be straightforward.

93663 is a simple rename of a function to be slightly more convenient and have one less place where the API diverges between Windows and Unix.

90809 is a feature request for a new standard library function; it's worth considering, but it's not required for the feature in general.

SoniEx2 commented 2 years ago

So uh can we make Stdin etc Drop, reference-counted, and (delayed) closeable, with a panicking std::io::stdin() etc?

sunfishcode commented 2 years ago

@SoniEx2 Please file a new issue to propose this, and include a description of what use cases would want it and how it would work in practice.

SoniEx2 commented 2 years ago

honestly we just want a way to close stdin/etc because descriptors are permissions (on freebsd at least), and usually you're supposed to drop those.

sunfishcode commented 2 years ago

Please stop cluttering this tracking issue.

yoshuawuyts commented 2 years ago

Something we should probably track here is that we don't yet have a solution for async Windows HANDLE types. A docs change has been proposed in https://github.com/rust-lang/rust/pull/93562 to mention that that raw handles be not opened with the FILE_FLAG_OVERLAPPED flag. But we don't yet have a solution for what to do for handle types which have been opened with FILE_FLAG_OVERLAPPED.

This should likely follow a different stabilization path than the types added in the I/O safety RFC, as adding this would interact with the "async overloading" initiative, and we don't want to block I/O safety on that. But we should probably mention somewhere that this is something to look at in the future.

joshtriplett commented 2 years ago

Thanks to @sunfishcode for a detailed review of the issues! All of them but the last feature request are on track now; when those PRs finish getting merged, I'd love to see this proposed for stabilization with a stabilization report.

sunfishcode commented 2 years ago

Preparing for stabilization, I was porting some of io-lifetimes' tests into the rust tree, and I noticed that io-lifetimes is still using allow(improper_ctypes), which I had added to work around errors in the FFI tests. The proper fix is to make rustc consider Option<OwnedFd> FFI-safe. I've now submitted https://github.com/rust-lang/rust/pull/94586 to implement the rustc half of this. If I understand the process, the libstd half will need to wait until https://github.com/rust-lang/rust/pull/94586 is in beta.

The libstd change will just be to add #[rustc_nonnull_optimization_guaranteed] to OwnedFd etc., which is a backwards-compatible change, so it wouldn't necessarily need to happen before stabilization.

sunfishcode commented 2 years ago

All the PRs mentioned above are now merged. There are more features that would be useful to add, but the core types and traits here are ready for stabilization.

Request for Stabilization

I'd like to stabilize I/O safety. A stabilization PR can be found at #95118.

Summary

The following features will be stabilized:

Documentation

Tests

Within the Rust tree:

Outside the Rust tree, the feature has also been tested in:

Unresolved questions

ChrisDenton commented 2 years ago

@kennykerr would stabilizing of these handle types be useful to the windows-rs crate?

kennykerr commented 2 years ago

At a glance, probably not directly. The windows and windows-sys crates support all of the different handle types whereas this proposal appears to only focus on two.

sunfishcode commented 2 years ago

It is possible to use the new types in FFI declarations. But my understanding is that windows and windows-sys are auto-generated from an API description, so we'd need three additional pieces of information to be added to that description: (a) which handles are in the CloseHandle family, (b) which handles are owned or borrowed, and (c), for return values, whether errors are indicated via INVALID_HANDLE_VALUE or NULL.

ChrisDenton commented 2 years ago

Not to stray too offtopic but the win32metadata is quite rich.

win32metadata E.g. `CreateFileW` has the following metadata: ```csharp [DllImport("KERNEL32", ExactSpelling = true, PreserveSig = false, SetLastError = true)] [SupportedOSPlatform("windows5.1.2600")] public unsafe static extern HANDLE CreateFileW([In][Const] PWSTR lpFileName, [In] FILE_ACCESS_FLAGS dwDesiredAccess, [In] FILE_SHARE_MODE dwShareMode, [Optional][In] SECURITY_ATTRIBUTES* lpSecurityAttributes, [In] FILE_CREATION_DISPOSITION dwCreationDisposition, [In] FILE_FLAGS_AND_ATTRIBUTES dwFlagsAndAttributes, [Optional][In] HANDLE hTemplateFile); ``` ```csharp [RAIIFree("CloseHandle")] [NativeTypedef] public struct HANDLE { public IntPtr Value; } ``` The `GetCurrentProcess` metadata has the `return: DoNotRelease` attribute: ```csharp [DllImport("KERNEL32", ExactSpelling = true, PreserveSig = false)] [SupportedOSPlatform("windows5.1.2600")] [return: DoNotRelease] public static extern HANDLE GetCurrentProcess(); ``` The only thing it currently appears to lack is whether or not `-1` is `INVALID_HANDLE_VALUE` or a valid pseudo handle, although in this case I guess it can be inferred that `GetCurrentProcess` is special because it does not have the `SetLastError = true` attribute.
m-ou-se commented 2 years ago

@rfcbot merge

See https://github.com/rust-lang/rust/issues/87074#issuecomment-1073069181

rfcbot commented 2 years ago

Team member @m-ou-se has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

dtolnay commented 2 years ago

@rfcbot concern OwnedHandle's TryFrom impls

These are gonna need a real error type i.e. one that has a std::error::Error impl.

https://github.com/rust-lang/rust/blob/ab0c2e18dceb7140626a158affb983ae81039bd0/library/std/src/os/windows/io/handle.rs#L144-L145

https://github.com/rust-lang/rust/blob/ab0c2e18dceb7140626a158affb983ae81039bd0/library/std/src/os/windows/io/handle.rs#L202-L203

jstarks commented 2 years ago

Since you point out the lack of proper error types, let me raise the broader question: should HandleOrNull or HandleOrInvalid really be stabilized? Since those types are really just useful at FFI boundaries (you'll want to immediately convert to Result<OwnedHandle, _> before doing anything with them), they seem like better candidates for something like the winapi or windows crates.

sunfishcode commented 2 years ago

@dtolnay I've now submitted https://github.com/rust-lang/rust/pull/96195 to define a real error type.

@jstarks The reasons for defining them in std are:

sunfishcode commented 2 years ago

@dtolnay https://github.com/rust-lang/rust/pull/96195 has now landed, giving the OwnedHandle TryFrom impls dedicated error types with std::error::Error impls.

sunfishcode commented 2 years ago

Current status: This PR is waiting for reviews so that it can enter an FCP. I've responded to all outstanding concerns raised here.

riverar commented 2 years ago

Apologies if this is the wrong forum for commentary, please do redirect me if needed.

In fact, in user mode (which is all we care about for this target), the only valid owned handle values are > 0--negative handle values are reserved for kernel handles (which are inaccessible to user mode) and pseudo handles, neither of which can be closed.

I don't think this is guaranteed; I'd be careful assuming such. Handles are just opaque integers, not pointers, and can be any value at any time. What these values mean is API impl. specific, and what constitutes a sentinel value is API/family specific.

Just throwing out there that win32metadata is working out what is and isn't valid on a per-handle-type basis. This is used by the windows crate, for example, and guides codegen decisions (e.g. is_invalid semantics).

ChrisDenton commented 2 years ago

Currently the standard library's concept of an OwnedHandle is one that can (assuming it's valid) be closed using CloseHandle. And a BorrowedHandle is the borrowed form of that, but also includes some "static" handles (aka pseudohandles) which should not be closed at all.

joshtriplett commented 2 years ago

@dtolnay https://github.com/rust-lang/rust/pull/96195 has landed; does that resolve your concerns?

sunfishcode commented 2 years ago

@riverar There is some more discussion of this in the comments replying the the comment you quoted, but the short answer is: the current OwnedHandle and BorrowedHandle do not assume anything about the handle value.

riverar commented 2 years ago

Well it seems to assume it can be passed into CloseHandle which is also not accurate. But I will admit, I don't have a full grasp of what subset of handles we're talking about here.

sunfishcode commented 2 years ago

@riverar There are limitations on what you can construct an OwnedHandle with. But OwnedHandle's itself has no special knowledge of "negative" handles.

ChrisDenton commented 2 years ago

OwnedHandle really shouldn't be constructed with anything that isn't documented as being closeable. But in practice it should be ok. Some functions such as GetCurrentProcess explicitly document that CloseHandle will ignore that particular psuedohandle but I don't know if that's guaranteed in the general case.

ChrisDenton commented 2 years ago

Just to summarise the story of Rust's Windows handles in win32 terms for people joining us,

A "handle" in Rust std (whether raw, owned or borrowed) is roughly equivalent to Win32::Foundation::HANDLE. The metadata for which is here.

However, these new "I/O safe" types aim to fulfil specific functions:

Conversations so far have, among other things, discussed: