rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0
652 stars 57 forks source link

What guarantees are provided about whether the reference/stdlib docs are valid on target architectures? #461

Open joshlf opened 12 months ago

joshlf commented 12 months ago

The following comment appears in the implementation of mem::swap:

// NOTE(eddyb) SPIR-V's Logical addressing model doesn't allow for arbitrary
// reinterpretation of values as (chunkable) byte arrays, and the loop in the
// block optimization in `swap_slice` is hard to rewrite back
// into the (unoptimized) direct swapping implementation, so we disable it.
// FIXME(eddyb) the block optimization also prevents MIR optimizations from
// understanding `mem::replace`, `Option::take`, etc. - a better overall
// solution might be to make `ptr::swap_nonoverlapping` into an intrinsic, which
// a backend can choose to implement using the block optimization, or not.

This suggests that some of the guarantees provided by the reference/stdlib docs are not valid on the spirv target architecture.

This raises a question: What guarantees are provided about which architectures the reference/stdlib docs are valid for? Is there a list, or is support guaranteed for targets in a particular tier, etc?

The context is that, for zerocopy, we'd like to be able to reason about this list of architectures so that we can guarantee that our code is sound as long as it compiles (https://github.com/google/zerocopy/issues/383).

RalfJung commented 12 months ago

Yeah that sounds like Spir-V is a non-conforming target, if one cannot turn arbitrary values into MaybeUninit<u8> arrays. The only other target of that sort I am aware of is CHERI.

Both of these are out-of-tree targets, I think. At least I cannot find spirv on https://doc.rust-lang.org/rustc/platform-support.html. Obviously we make no guarantees for out-of-tree targets. If they ever become tier 3 (or higher) targets, the target support page should list such caveats.

bjorn3 commented 12 months ago

@eddyb how much of an issue is this still with your SPIR-T framework?

joshlf commented 12 months ago

Yeah that sounds like Spir-V is a non-conforming target, if one cannot turn arbitrary values into MaybeUninit<u8> arrays. The only other target of that sort I am aware of is CHERI.

Both of these are out-of-tree targets, I think. At least I cannot find spirv on https://doc.rust-lang.org/rustc/platform-support.html. Obviously we make no guarantees for out-of-tree targets. If they ever become tier 3 (or higher) targets, the target support page should list such caveats.

Okay, that makes sense. This may seem like a silly or pedantic question, but is it documented anywhere what guarantees are made about the reference/stdlib docs specifically for officially supported platforms in different tiers?

This question might be better motivated with more context: In zerocopy, we're working towards a guarantee something to the effect of if your (safe) code builds with zerocopy, it will be sound today and under any future version of the compiler. This might help explain why you've seen me trying to get a lot of little nit-picky things documented in the reference/stdlib docs recently - one part of upholding this guarantee is making sure that all of our soundness proofs are based only on facts about the language which aren't subject to change.

When we came across that comment about SPIR-V, it made us realize that we might need to reason about target architecture in order to genuinely uphold that guarantee. Our initial thought was that we should explicitly fail compilation (e.g. via compile_error!) on any target architecture whose conformance with the reference/stdlib docs we hadn't explicitly verified. Then we realized that it was probably some kind of bug if a target was supported but, for that target, the reference/stdlib docs were just wrong.

So what we're trying to do now is answer a semi-technical, semi-social question: What would be a reasonable policy to adopt with regards to when we guarantee soundness as a function of target architecture? Some potential answers we could imagine:

Sorry for the wall of text; I hope that's useful context for this question.

Lokathor commented 12 months ago

This may seem like a silly or pedantic question, but is it documented anywhere what guarantees are made about the reference/stdlib docs specifically for officially supported platforms in different tiers?

Stuff that's out of our hands is out of our hands. We can't go around disclaiming everything that might need a disclaimer. Users should stick to only what we've affirmed if they want to be as pedantic as possible.

The only reasonable path of the ones you suggested seems to be bullet point 3. Even that doesn't actually prevent, using this scenario as an example, someone from patching SPIR V into their compiler and then things go wrong. Your crate would likely build and then crash only at runtime. All the standard problems with runtime UB, etc etc

saethlin commented 12 months ago
  • We treat Tier 1 targets as having an implicit guarantee that the reference/stdlib docs are valid for such targets. If this isn't upheld, we consider it a bug on the part of Rust.

i686-pc-windows-msvc is an unsound Tier 1 target because MSVC disagrees with itself on the alignment of u64 (alignas(uint64_t) aligns to 8, but MSVC allocates uint64_t values with alignment of 4) and LLVM assumes uint64_t has alignment 8 on that target.

I do not think the relevant teams are willing to take the same hard stance you are on soundness, and/or the Tier 1 status is not an indication of soundness. So in addition to everything else you've said, I think it would be unwise to key anything on the target tier.

But even if you go for option 3:

We guarantee that we are sound on any target for which the reference/stdlib docs are valid.

you still have a problem with i686-pc-windows-msvc. You either need to say that the stdlib docs are not valid (std::mem::align_of::<u64>() returns 8 on that target which is not the alignment according to MSVC) or you need to say that u64 and uint64_t are not ABI-compatible on i686-pc-windows-msvc.

I agree that from a social perspective option 3 is probably the best, my only concern with it is that I don't think we have somewhere that we document which targets are unsound or that the docs don't apply to. I agree that you should be able to punt this whole question to the Rust project which should document its own target errata, but the reality is that there is no organized effort to document them right now. There should be though.

bjorn3 commented 12 months ago

In case of SPIRV it will either happen to work or the driver or compiler backend will reject it due to not being valid SPIRV. No runtime misbehavior can happen unless the backend produced invalid SPIRV, you don't validate the SPIRV and the driver doesn't validate either. These three things happening at once can only happen in case of compiler bugs.

joshlf commented 12 months ago

Wow, glad I asked! I love learning about all of the ways that reality is always more complex than you imagined. Based on this, I'm now leaning towards something more like "Zerocopy bases its soundness on the assumption that the guarantees documented in the Rust reference and standard library documentation hold. Note that there may be cases in which they don't hold; in such cases, zerocopy may be unsound." Maybe along with a link to this thread to explain why it's hard 😛

i686-pc-windows-msvc is an unsound Tier 1 target because MSVC disagrees with itself on the alignment of u64 (alignas(uint64_t) aligns to 8, but MSVC allocates uint64_t values with alignment of 4) and LLVM assumes uint64_t has alignment 8 on that target.

Out of curiosity, does this cause problems for real programs in the wild? A target like this seems important enough that I assume folks have encountered this issue and had to figure out workarounds; do you know of examples of such workarounds?

saethlin commented 12 months ago

It is unclear if this causes problems in the wild. It was independently discovered a few times, so who knows what's behind the other reports. I suspect alignment in languages other than Rust is only used to deduce which instructions should be used to load from memory, and on x86 whether your alignment is 4 or 8 doesn't matter.

The sad reality for that target is that since people need to use old dylibs compiled with MSVC, if any issue surfaces it will probably be dealt with by nerfing optimizations until it goes away. At some point Rust will start exploiting the alignment niche in references, and we'll probably have a target-specific hack for i686-pc-windows-msvc to not fully exploit the niche.

RalfJung commented 12 months ago

my only concern with it is that I don't think we have somewhere that we document which targets are unsound or that the docs don't apply to

Yeah, we should have "target errata" or something like that. https://github.com/rust-lang/rust/pull/113053 is a first step in that direction; once https://github.com/rust-lang/rust/pull/116004 lands we might have a page per target which then makes this easier than having to attach increasing numbers of footnotes to the rows in that table.

Generally I would say any target deviating from reference/std guarantees is a critical bug for that target. The bug might just be near impossible to fix for some targets...

And yeah i686-pc-windows-msvc is particularly bad. We should at least stop emitting align attributes for that target. I hope one day someone will care enough about that target to implement mitigations for https://github.com/rust-lang/rust/issues/112480.

eddyb commented 11 months ago

Sorry for not replying to this sooner (feel free to poke me directly, as well, if you have questions about Rust-GPU, or in general anything suspicious with my name randomly stuck to it - tho in this case this issue was probably warranted to some extent, I was just slow to get to it).

First off, in the process of explaining this, I ran into some weird historical interactions:


The summary is that both mem::replace and mem::swap have worked for a while now, on Rust-GPU:


There are two parts to my comment, and I want to address the second half first:

// FIXME(eddyb) the block optimization also prevents MIR optimizations from
// understanding `mem::replace`, `Option::take`, etc. - a better overall
// solution might be to make `ptr::swap_nonoverlapping` into an intrinsic, which
// a backend can choose to implement using the block optimization, or not.

That can be removed now: mem::replace doesn't use mem::swap anymore, and can be optimized on its own (only remaining concern might be backends potentially using target information they have to further optimize the loop, but it's probably a stretch and the core impl is likely fine for LLVM and GCC).


Only the first half is pertinent to the Rust-GPU SPIR-V backend:

// NOTE(eddyb) SPIR-V's Logical addressing model doesn't allow for arbitrary
// reinterpretation of values as (chunkable) byte arrays, and the loop in the
// block optimization in `swap_slice` is hard to rewrite back
// into the (unoptimized) direct swapping implementation, so we disable it.

This is indeed a fundamental limitation of SPIR-V, as it requires typed memory (about as much as compiling to shader language source code would require, or e.g. a version of C without unions or pointers). In a sense, the Rust-GPU rustc SPIR-V backend primarily exists to reconcile Rust's semantics (untyped memory, address-space-unaware pointers/references, unstructured control-flow etc.) with the target's limitations.

This suggests that some of the guarantees provided by the reference/stdlib docs are not valid on the spirv target architecture. ... The context is that, for zerocopy, we'd like to be able to reason about this list of architectures so that we can guarantee that our code is sound as long as it compiles

Just to make sure I understand, is the concern that "guarantees not provided" can lead to "code compiles and misbehaves"? I do not fully understand how that could work, what behaviors could even be assigned?

As it stands in Rust-GPU today, failure to legalize (e.g. Rust untyped memory usage, into equivalent typed memory, or to infer an address space, etc.) leads to a build error: the input Rust code is unsupported.

In theory, though, it is possible to emulate untyped memory on top of typed memory, even fully dynamic fn pointers (AFAIK there are non-Rust projects doing something like this already), so in the future, most Rust code might compile in Rust-GPU (just with a lot of e.g. warnings about very inefficient emulation).


The only other target of that sort I am aware of is CHERI.

Just wanted to quickly mention that CHERI is still mostly free-form wrt mixing pointers and data dynamically, the main limitation is pointers having to be aligned to e.g. a multiple of 16 bytes (to minimize the "129th bit" validity bitset storage space). But SPIR-V largely doesn't even let you "store pointers" in memory types (and in fact, its "logical pointers" are more like "placenames", they only have them because they copied how LLVM uses pointers to refer to memory, instead of having some separate "lvalue"/"place" concept).


In case of SPIRV it will either happen to work or the driver or compiler backend will reject it due to not being valid SPIRV. No runtime misbehavior can happen unless the backend produced invalid SPIRV, you don't validate the SPIRV and the driver doesn't validate either. These three things happening at once can only happen in case of compiler bugs.

Ah yes, I see, @bjorn3 already made my point for me (I'm going through the thread chronologically).

It does seem the discussion also shifted to i686-pc-windows-msvc (and that's a much scarier example).

If someone still is curious about what my plans for addressing the "untyped memory (Rust) vs typed memory (SPIR-V)" semantic mismatch to a broader extent, you can read more about it under the qptr heading of Rust-GPU 0.7 release and this PR:

But it's mostly low-level IR ramblings (and a lot of them GPU-specific), so unlikely to be that informative.

RalfJung commented 11 months ago

Thanks! I've filed https://github.com/rust-lang/rust/pull/116329 for the comment removal you suggested, also adding a note to swap_simple that this benefits SPIR-V.