rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines

Must `let` variables (without `mut` or `UnsafeCell`) be read-only? #400

Open RalfJung opened 1 year ago

RalfJung commented 1 year ago

This came up in https://github.com/rust-lang/rust/issues/111502 and overlaps with https://github.com/rust-lang/unsafe-code-guidelines/issues/257: should it be allowed to mutate let-bound variables, via code like this?

let x = 0i32;
let ptr = addr_of!(x) as *mut i32;
*ptr = 42;

Tree Borrows currently accepts this code since addr_of never generates a new tag; it just creates an alias of the original pointer. Stacked Borrows rejects it due to #257. There are other ways to reject this code while making *const and *mut equivalent, hence I opened this as a separate issue.
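For anyone who wants to poke at it, here is a self-contained version that can be run under Miri (if I have the flags right, MIRIFLAGS=-Zmiri-tree-borrows cargo miri run selects Tree Borrows, while a plain cargo miri run uses Stacked Borrows and reports an error):

use std::ptr::addr_of;

fn main() {
    let x = 0i32;
    let ptr = addr_of!(x) as *mut i32;
    unsafe { *ptr = 42 };
    println!("{x}"); // prints 42 if the write is permitted
}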

Personally I favor the Tree Borrows behavior here and don't think it is worth the extra effort to make these variables read-only.

CAD97 commented 4 months ago

This is a bit of an essay, sorry. I marked the most important takeaway, with the TL;DR being that I'm in support of the OP example being defined behavior.

There is a weird semantic space where mutating a non-mut let binding would live — behavior considered defined by the compiler, but still heavily discouraged, borderline still disallowed in any reasonable code.

To @workingjubilee's point, as I understand it:

let place: i32 = 0;
macro_from_upstream_library!(place);
assert_eq!(place, 0);

Answering the issue title's question with "no" and declaring the OP's example defined behavior would mean that this assertion could panic in valid executions. However, if it did, the macro would be unsound. Macros are absolutely allowed to expand to internally encapsulated unsafe, but the unsoundness of the macro which makes this assertion panic isn't even a difficult-to-spot technicality; the counterexample is as simple as writing macro_from_upstream_library!(*&place).
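To make that concrete, a hypothetical body for macro_from_upstream_library (entirely made up for illustration) could look like this:

// Hypothetical expansion: sound only if the caller opted in with `mut`,
// which the macro has no way to check, so it is unsound as a public interface.
macro_rules! macro_from_upstream_library {
    ($place:expr) => {
        unsafe {
            let p = std::ptr::addr_of!($place) as *mut i32;
            *p = 1; // mutates the caller's binding
        }
    };
}

fn main() {
    let place: i32 = 0;
    macro_from_upstream_library!(place);    // the OP question: defined or UB?
    // macro_from_upstream_library!(*&place); // unambiguously UB: the write goes
    //                                        // through a pointer derived from `&place`
    assert_eq!(place, 0); // may fail in valid executions if the OP example is DB
}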

When reasoning about what other code (outside the safety boundary, whatever you've locally defined that to be) can do, the question isn't about language validity and what is or isn't UB, it's about safety invariants, which are much stricter than simple validity.

(Aside: I think I want to write (another) blog post on the idea of and separation between the concepts of validity and safety. It comes up and I've re-explained it frequently enough that I'd like to have a stable reference that I can point to that covers just that difference specifically.)

I am treating the author of the macro as potentially adversarial to the author of the code.

There is some level of cooperation which you have to assume, and that cooperation is that upstream is sound. If you can't assume that, upstream could just unreachable_unchecked and eat your laundry for free. Or just write the mutation through addr_of! anyway despite it being invalid instead of just unsound.

The correct model that Rust helps with is explicitly not an adversarial upstream. An ignorant and/or negligent upstream, sure, but upstream is always presumed to be competent, cognizant, and cooperative enough to maintain the soundness of the interface.

In fact, allowing mutation helps with the case of negligent upstream, because now if they use addr_of! instead of addr_of_mut! incorrectly when the value could potentially be written to, the result is merely surprising behavior instead of deleting your code (or worse, seeming like it works). In the specific case where it gets run under Miri, the perfect result might be to point at the errant write. But since pointers in public API are involved, it's much more likely that we can't use Miri because there's FFI involved, and in that case I'd much prefer that my debugger works and shows me that the value changed, instead of presenting some arbitrarily wrong picture of the world that's been corrupted by UB.
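As a sketch of that "negligent upstream" case (the function below is invented for illustration and stands in for what would usually be an extern "C" interface):

// Negligent upstream: the interface says *const, but some code path writes
// through the pointer anyway.
fn lib_init(cfg: *const i32) {
    unsafe { *(cfg as *mut i32) = 7 };
}

fn main() {
    let cfg = 0i32;
    lib_init(std::ptr::addr_of!(cfg));
    // If the OP example is DB, this prints 7 and the surprise is observable in a
    // debugger; if it is UB, the optimizer may already have assumed `cfg == 0`
    // everywhere, and the symptom can surface arbitrarily far away.
    println!("{cfg}");
}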

That is the primary issue with UB — you can't reliably debug it when it happens. The main problem with your position, as we understand it, is that it apparently claims UB is preferable to a defined but semantically wrong result. That is a very confusing position to hold, because while you can debug a wrong result, you fundamentally can't debug UB.

Perhaps the convenience of Miri as a sanitizer works against us here. UB is not equivalent to "diagnosed by Miri". UB should always be thought of as perhaps the exact opposite — daemonic nondeterminism that chooses between time traveling and changing previous control flow choices, exposing the most sensitive data accessible by the machine, producing the "correct" result, or a crash, whichever is worst for you at the moment, non-exhaustively.

"Forbid it, diagnose it as forbidden whenever possible, but as a mitigation permit it when it can't be prevented" is a defensible position. "Eat my laundry if I dare make a mistake" less so.


To the second form of the point mentioned both by Jubilee and Calvin, between

lib_call(&place); // or
lib_call(addr_of!(place)); // where
// extern fn lib_call(_: *const Type);

I would still recommend using &place and reference-to-pointer decay in most cases, and in all cases where place is an initialized let without mut. And if you want to be explicit that a conversion to pointer is happening, ptr::from_ref(&place). The entire point of choosing to "always prefer addr_of!" when creating a pointer is that addr_of! has weaker semantics and permits more operations as valid; if the semantics (and optimization) you want are that of a reference, use a reference.

If a developer chooses to use addr_of!(place) and not ptr::from_ref(&place), they're explicitly requesting weaker semantics that don't include freezing place from potential mutation. Entirely ignoring that signal because the place was declared with let — potentially because the compiler told them the mut was unnecessary — seems needlessly adversarial of the compiler.
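Just restating the contrast as code (nothing new here, only the three spellings side by side):

fn main() {
    let place = 0i32;

    let a: *const i32 = &place;                     // coercion: reference semantics
    let b: *const i32 = std::ptr::from_ref(&place); // explicit, same reference semantics
    let c: *const i32 = std::ptr::addr_of!(place);  // deliberately weaker semantics:
                                                    // no reference to `place` is created
    let _ = (a, b, c);
}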

In fairness, I absolutely could see Miri in a conservative mode (e.g. under cargo-careful) choosing to warn when let-bound places are mutated. It's the same reasoning I gave behind considering shallow memory initialization requirements for reference retagging — it's "nicer" to "blame" the cause rather than the symptom.

If we had a formal classification for "forbidden but still with predictably defined behavior of the 'obvious' semantics or a crash," it could make sense to place this there. Exposed provenance is mostly in that bucket, I believe. But it isn't necessary to formally have that bucket in order to diagnose it. Miri isn't UBSAN which (in theory) accepts all valid programs and rejects most invalid; Miri (in theory) rejects all invalid programs and accepts a majority of valid ones.

This might be "erroneous behavior." It might be "safe misbehavior." Maybe we should try to build such a categorization as we stabilize the strict provenance APIs and start moving towards supporting CHERI as an experimental target. But making it UB will never make Jubilee's million-line maze easier to manage, and can only make it more difficult to debug.

I will even agree that any code written that deliberately relies on this being DB should be immediately fixed with priority equal to that of latent UB. Code which relies on the "wide" (or "infectious") model for shared mutability instead of wrapping things in UnsafeCell, which relies on the !Unpin hack instead of using UnsafePinned, or on moving containers without MaybeDangling not invalidating interior references is all in the same boat — should be fixed to not rely on these hidden details of the Rust AM as soon as possible.

None of these things are DB because we endorse doing them. They're DB because all else being equal, we prefer more things being DB. Just because you can doesn't mean you should, and we consistently make the nicer and more straightforward way to accomplish these things available at the same time as or before we canonicalize that the circuitous way is actually guaranteed to work instead of just working by happenstance.


To @Calvin304's primary point about initialization tracking:

I don't know at what point uninitialized bindings are currently diagnosed. However, with the shifts to use THIR unsafeck and MIR borrowck, the trend is to push these analyses later in the compilation pipeline.

MIR does actually know a difference between let and let mut; you can see this when dumping MIR built from surface Rust. But on the other hand, parameters in dumped MIR aren't ever marked mut, and the mir! macro just uses let and doesn't even parse let mut.
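A quick way to see that distinction, if you're curious (the exact dump format varies by compiler version, so take the shape below as approximate):

// $ rustc --emit=mir --crate-type=lib demo.rs
pub fn demo() -> i32 {
    let a = 1i32;
    let mut b = 2i32;
    b += a;
    b
}
// In the emitted demo.mir, the local backing `b` is printed as `let mut _N: i32;`
// while the local backing `a` is printed without `mut`.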

I intuitively expect that if/when the language is extended to support piecewise initialization of let, this analysis will happen on THIR; it must happen after types are known and name resolution has occurred in order to ensure no deref coercion happens on a partially uninitialized place, independently of whether we allow piecewise initialization to completely initialize a place.

Thus, once that analysis is done, it basically just becomes an implementation choice whether that information is validated in THIR then discarded or if it's passed along into MIR. We already have StorageLive and StorageDead, so a "StorageFreeze" for non-mut initialization of let is far from unreasonable.

The benefits of immutable-after-init let do exist (and aren't novel, since C++ compilers exploit them for const locals), but it's worth explicitly noting that these benefits would still apply to most let even if we say let can still validly be mutated. A non-mut place can only be mutated if it is used by a raw reference-of expression (addr_of! / &raw const), so it can still be optimized in the absence of that.
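A sketch of what that could mean in practice (hypothetical helper functions; not a claim about what rustc optimizes today):

fn opaque_ref(_: &i32) {}
fn opaque_ptr(_: *const i32) {}

fn no_raw_borrow() -> i32 {
    let a = 5i32;
    opaque_ref(&a); // only shared references to `a` ever exist, so even with the
                    // "mutation is DB" answer, `a` is still known to be 5 here
    a
}

fn with_raw_borrow() -> i32 {
    let b = 5i32;
    opaque_ptr(std::ptr::addr_of!(b)); // raw-borrowed: under the TB-style answer,
    b                                  // the callee may legitimately have changed `b`
}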

For such enforcement to exist in the TB model, the "root" tag for the let allocation would still be Active, but any place mention (excluding direct place = assignment) would use not that tag directly but a child tag, which would be Frozen for non-mut bindings.


So, with that in mind, reducing the question of the OP to absurdity, it really does boil down to:

If, given some let place, a developer chooses to write addr_of!(place) instead of the stricter ptr::from_ref(&place) (maybe using coercion), does it still make sense to punish a mistaken write with arbitrarily disastrous UB?

My personal position and answer to the question is the following three-phase conditional:

[^6]: The strongest argument for not using &place would be that the resulting pointer's pseudo-lifetime outlives the enforced borrow lifetime. (However, if the execution is valid, note that no lifetime extension occurs; the borrow lifetime and validity always die together for a shared borrow of a let binding… in the absence of any mutation as allowed by this, anyway.) Any project pedantic enough to care about this should probably be banning reference to pointer coercion and appreciate the syntactic noise (and semantic signal) of using ptr::from_ref.

The points/arguments I make elsewhere throughout this comment are why I agree with Ralf here that the example in the OP should be defined behavior. The most relevant point is that addr_of! is a non-mut place mention that nonetheless doesn't freeze the mentioned place (like a non-raw reference-of would), which I expand on a bit more below.

Separately, I also think an initial intuition can reasonably expect a binding's place mention to behave like *&mut binding (if a mut mention) or *&binding (if a non-mut mention). This doesn't work because of place projection (partial borrows), but up until that point integrates well with how auto(de)ref and deref/index place mutability inference work[^5]. If a developer understands that the purpose of addr_of! is to avoid creating a reference, that same understanding covers why addr_of! shouldn't freeze the mentioned place.

[^5]: Example: the place v[i] is explained as sugar for *Index::index(&v, i) (or the mut version), which after mental inlining turns into essentially *&*<_>::as_ptr(&v).add(i). Sugared place mentions have *& semantics, so it's understandable to expect something similar when mentioning a let binding as a place.


While I absolutely agree with the sentiment of #257[^0], I do wonder whether the perception of addr_of!(place) and &raw const place might differ, since the latter has const in the syntax. Additionally, addr_of! is enough syntactic salt that it's IMHO a strong signal for "I want the weaker semantics," but &raw const place is so convenient that it loses some of that signal[^4] and is easier than writing ptr::from_ref(&place) even when the stronger semantics are wanted.

[^4]: Similar to using raw pointers even when they're known non-null. If the hypothetical "One True Pointer Type" doesn't distinguish "being" mut, then the OTPT version of &raw mut would exclusively determine whether the place mention is mut, making it easier to argue for read-only (i.e. if you want write permissions, use the mut syntax, which despite applying to raw pointers because of coercion, is a bit of a weaker argument when the types differ).

[^0]: It remains extremely unfortunate that (potentially coerced) &mut place as *const _ produces a pointer that's illegal to write through imo, especially since there's no "unused mut" style lint emitted to say it could just be &place instead. The coercion "desugars" into &raw const (*&mut place), so IIUC SB errors and TB allows mutation, as SB uses a "raw" tag kind for &raw const but TB just uses the source's tag.
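Spelled out, the case that footnote describes looks like this (my reading of the current divergence; the write is the contested operation):

fn main() {
    let mut place = 0i32;
    let p: *const i32 = &mut place as *const i32;
    // Per the footnote: SB rejects this write (the `&raw const` "desugar" gets a
    // read-only tag), while TB accepts it (the pointer keeps the &mut's tag).
    unsafe { *(p as *mut i32) = 1 };
    println!("{place}");
}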

Drawing from a Rust background, I[^1] don't expect to be able to modify through an addr_of!(binding) pointer[^2], personally. Drawing from a C++ background, on the other hand, my understanding of when it's valid to use const_cast can suggest that addr_of!(binding).cast_mut().write(…) could be valid. But on the other side of that hand, const in C++ applied to the type of the storage place does actually make the place UB to mutate[^3].

[^1]: This is about me personally, not a theoretical I. And probably I'm far from a typical user w.r.t. what I intuitively expect, given my involvement with T-opsem and that Rust's ownership model immediately made intuitive sense to me, allowing me to mostly skip the "fighting the borrow checker" phase when I was first learning Rust.

[^2]: Although, for full clarity, my "first guess" semantic also doesn't have addr_of! freeze the location, either; it's still weaker than ptr::from_ref(&place) and the place could be mutated through a separate pointer to the place. Additionally, for addr_of!(*mut_ptr), my "first guess" heuristic is entirely unsure whether to treat this differently from mut_ptr.cast_const().

[^3]: In the absence of mutable fields, essentially the C++ version of Rust's UnsafeCell.

So in (not particularly useful) C++ terms, the question is whether let place = init; in Rust has C++ semantics more like auto place = init; auto const&& place = (auto&&)place; (actually mutable) or more like auto const place = init; (actually immutable).

Perhaps more usefully (despite being less strictly correspondent), it depends on whether the C++ developer learning Rust sees let place: T and let mut place: T as the moral equivalents of const T place and T place (actually immutable) or of T place and mutable T place (actually mutable). That mut isn't permitted in struct fields should hopefully indicate that Rust mut is closer to the absence of C++ const than the presence of C++ mutable.

As one last wrinkle, though, I've only really considered addr_of!(binding) above, as that's what's relevant to the OP topic. Indirectly relevant, however, is computed places, e.g. addr_of!((*mut_ptr).field) or even addr_of!(*impl_deref). That gets unambiguously into #257 territory, so I won't touch on it further except to say that addr_of! regularly producing valid-for-writes pointers despite being treated (e.g. for borrowck and deref/index mutability inference) as a non-mut place mention makes the concept of addr_of!(let_binding) being valid-for-writes a lot easier to conceptualize. Under this model, mut is truly "just a lint" and only controls whether a mut place mention is allowed, and since addr_of! is a non-mut place mention that doesn't freeze (make immutable) the mentioned place, the writing through the pointer is still permitted unless a different operation has frozen the place.
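For concreteness, the kind of computed-place case I mean (hedged: this reflects my understanding of TB; the write is exactly the #257-style question for SB):

use std::ptr::{addr_of, addr_of_mut};

struct S { field: i32 }

fn main() {
    let mut s = S { field: 0 };
    let mut_ptr: *mut S = addr_of_mut!(s);
    // Computed place: the resulting pointer is derived from `mut_ptr`, not from
    // any shared reference, so under TB it keeps write-capable provenance even
    // though addr_of! is a non-mut place mention. (SB, per #257 and the footnote
    // above, rejects the write.)
    let p: *const i32 = unsafe { addr_of!((*mut_ptr).field) };
    unsafe { *(p as *mut i32) = 42 };
    assert_eq!(s.field, 42);
}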

IMPORTANT TAKEAWAY BEGIN

So the question finally becomes, after defining addr_of!: does assignment to a let binding without mut freeze the assigned-to place (until the place is dropped)? I fail to see any reason for it to (the locality of reasoning for the developer is maintained by safety, not by validity) and a very clear reason for it not to (an errant write still being debuggable behavior) so my vote's that the OP example should be DB.

If you want errors to be detectable, you want them to not be UB. Being DB does not mean you as a developer have to tolerate it; it means the compiler has to tolerate it. Being UB doesn't mean a rushed developer is any less prone to making mistakes; it means that determining that and how they did is miles more difficult.

IMPORTANT TAKEAWAY END


Post-script response to @chorman0773's point: (it took >6h to write this 🙃)

It's already fundamentally the case throughout Rust's semantics that simple in-order translation isn't sufficient. I don't think making one small part of the translation easier should inform any surface-language-visible decisions. E.g. type inference is a full-function process, so I don't see how capture elision could happen without having done full-function analysis already. Also, it seems like your eager SSA form lowering

Modifying addr_of! to get "immutable after initialization" requires one of two things — either addr_of! always generates a pointer without write permission (what SB does), or addr_of! needs to behave differently based on whether it's referencing a non-mut binding (or pure place projection thereof) or some other place (i.e. a computed place/value or a place that names a (projection from) mut binding). IMHO that seems much worse than the options of either:

The "proper" solution is for lccc to have an xlang optimization pass that replaces alloca with SSA form where possible and desirable. Given that needs to exist anyway if you do any optimization on the xlang form, it seems like the benefit of deferring alloca when lowering MIR to xlang is quite minimal. Especially with how eagerly surface Rust creates references.

The only real potential benefit I see is in capture elision, turning by-reference capture of let bindings into by-value (and by-constant-value) capture … except since the capture is by-ref, the by-ref capture freezes the place, so mutation of the place is excluded by the capture semantics, and doesn't need to be excluded by the place semantics.


Final afterthought: the above prompted me to think about how closure capture depends on the mut annotation. This is just another case of addr_of! being a non-mut place mention, but the capture rules mean that even if addr_of!(x) gets write-capable provenance (even if x is bound with mut), (|| addr_of!(x))() does not, because that's semantically equivalent to ({ let ref_x = &x; move || addr_of!(*ref_x) })().

I don't think there's a fair alternative which still captures a borrow lifetime. That we do precise field capture for closures limits the impact (i.e. capturing for addr_of!(place.field) won't ever impact pointers to the rest of place), but it does feel kind of footgun-y. (FWIW, a pure place mention (e.g. _ = x) doesn't capture x at all.)
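A small illustration of the equivalence above (the comments state the model being argued, not settled semantics):

use std::ptr::addr_of;

fn main() {
    let mut x = 0i32;

    // Direct place mention: per the model above, may carry write-capable
    // provenance because `x` is bound with `mut`.
    let p_direct = addr_of!(x);

    // Through a closure: `x` is captured by shared reference, i.e. roughly
    // ({ let ref_x = &x; move || addr_of!(*ref_x) })(), so this pointer is
    // derived from `&x` and is read-only regardless of the `mut`.
    let p_closure = (|| addr_of!(x))();

    let _ = (p_direct, p_closure);
}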

chorman0773 commented 4 months ago

The "proper" solution is for lccc to have an xlang optimization pass that replaces alloca with SSA form where possible and desirable. Given that needs to exist anyway if you do any optimization on the xlang form, it seems like the benefit of deferring alloca when lowering MIR to xlang is quite minimal. Especially with how eagerly surface Rust creates references.

Note that the deferral happens at the MIR level (lccc's Rust frontend has its own MIR). It actually makes some reasoning (like init-tracking) easier as well, since the lowering will just inherently error on an uninit memory location. This is then lowered to use the xlang stack (note, not the call stack at the hardware level) through pivot and dup operations (this translation step is cursed, but useful). This translation is necessary to make sure parameters in the THIR are consistent with other bindings (xlang receives all function parameters on the incoming stack of the first basic block, not as locals). The translation itself is not impacted by either resolution, but without them being "immutable after init", one of two things must be true:

  1. I can't lower alloca const T <val> to a define const at the xlang level, or
  2. I cannot blindly generate the appropriate alloca solely from the THIR Declaration Mutability, and need to take into account the usage of the place (which violates linearity, a property I rely on a lot in all of the lowering steps).

I'd personally choose the former, though I'd rather have neither constraint.

As to the capture rules, the main issue is that I can't have local optimization-dependent type layout - and I need to define type layout very exactly, in terms of (post-mono, fully-inferred) surface Rust (I cannot refer to dynamic-only semantics). Closures are no exception, and in fact their layout rules form the backbone for most of the anonymous types, given that generators and async blocks/async functions use the same capture-for-layout-purposes rules.

CAD97 commented 4 months ago

Although I'm certain lccc isn't particularly enthused about it, since you're trying so hard to have a more stable ABI than Rustc provides, the semantics of Rust should not be beholden to implementations that want to provide stronger guarantees than the language does. If you'll forgive the hyperbole, this is how a language gets into the pit that C and C++ find themselves in — the language provides no guarantees for ABI (it has no concept of compiled objects, only linking based on surface language semantics), but can't make otherwise desirable changes because they might require a deliberately fragile compiler implementation to break ABI. Because Rust layout and ABI are unspecified, it's entirely okay if the language relies on that to get reasonable performance characteristics.

And anyway, if you want to specify the captures of a witness table in terms of surface language semantics, the obviously correct answer is to just include each reference that gets captured. Attempting to do stably observable, guaranteed promotion is a poor idea. Rustc already does a similar promotion, where &promotable can be extended to 'static, and the fact that we promote some #[rustc_promotable]-tagged functions is generally seen as a mistake in retrospect. If a developer wants constant promotion, they can use const to request it instead of stack-local semantics. Furthermore, I don't see promotion of captures of let bindings happening often. It's distinctly not sufficient to have "captured by-ref, has a constant, promotable value" and do promotion, because if the capture is referenced from within the closure and the address is inspected, that address had better be equal to the address of the captured binding, because those are the semantics of by-ref binding capture. You can promote any by-move captures, but in that case whether the binding is mutable is irrelevant.
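For contrast, a small illustration of the existing promotion behavior versus why a by-ref capture can't be promoted out from under the closure:

// Existing promotion: a reference to a promotable constant expression can be
// extended to 'static; no stack local is involved.
fn promoted() -> &'static i32 {
    &42
}

// A by-ref capture, on the other hand, must keep the address of the captured
// binding, so the addresses observed inside and outside the closure must agree.
fn captured() {
    let x = 42i32;
    let c = || std::ptr::addr_of!(x);
    assert_eq!(c(), std::ptr::addr_of!(x));
}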

In fact, I don't think rustc ever optimizes closure captures beyond the naive capture semantics, even for capturing every field of a binding independently. And while the size of async does get complained about (but more because of a lack of stack binding space optimization, not because of captures), I've never seen any complaints about the size of normal closures.

TL;DR: While alternative implementations can and should inform decisions, those alternative implementations choosing to tie their own feet more than necessary shouldn't restrict the main language from benefitting from the flexibility it deliberately reserves.

(A plea to "lccc Rust" — at least for anonymous closure/async types, don't stably specify the layout by default, and require annotation of e.g. #[repr(lcccv1)]. Even in the face of dynamic linking, only one compilation unit needs to fully know the layout — the one that defines it. Other compilation units only need the external shape to be predictable, not the monomorphic one.)

workingjubilee commented 4 months ago

@CAD97 I too drafted a much longer explanation, but part of the problem is that I understand everything you're saying, and the actual communication gap is that scale is not an isolated quantity in this discussion, to be torn away so someone can refute it in isolation.

The problem is that in my experience, the code doesn't necessarily become more sound if the mutation is permitted, because later code may be relying on that immutability. For, yes, soundness reasons. "Everything is a big mesh of horrible, hard-to-understand, entangled state, with unclear invariants" is exactly why I have trouble explaining it. That is my point about scale, and about this behavior being so counterintuitive it can undermine the code around it. Allowing mutation lets the error become more nonlocal, thus harder to find, as the mutation means it can flow downstream until it hits something that has become unsound because values changed. For every source line in a toy example I provide, there may be hundreds separating them... if they're even in the same crate or repo... in the real thing.

People by and large won't set a breakpoint on the variable changing if they never imagine it can be changed.

I would genuinely rather the composited binaries try to do something that makes valgrind scream at the impossibility, immediately. Because Miri isn't the only game in town. As annoying as they may be, it's very easy to pop gdb and examine a core dump of a segfault, too (...the immediate backtrace and register dump, anyway, the rest can be somewhat "???").

workingjubilee commented 4 months ago

The only real "optimization" available that I am aware of in the separate compilation case...

...we did all agree that this is most likely to occur in the case of FFI, right?...

...where an address must be revealed to the world, and that must have a legible value at that location, or the program simply cannot work, neither as the abstract machine understands it nor as the programmer understands it... is to move the data into read-only memory. Attempts to write to that location will generally violate virtual memory protection and immediately trigger SIGSEGV or the equivalent... Windows still calls it EXCEPTION_ACCESS_VIOLATION, right?... which then reveals the error immediately, usually including the relevant address.

Unconditionally permitting the mutation does not allow this transformation, allowing errors to be latent in the actual binary.
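For comparison, this is essentially what already happens today for statics without interior mutability (illustration only; the write is UB, and the fault is the typical outcome on mainstream platforms, not a guarantee):

static X: i32 = 0; // placed in read-only memory on typical targets

fn main() {
    let p = std::ptr::addr_of!(X) as *mut i32;
    // UB; on typical platforms this faults immediately (SIGSEGV /
    // EXCEPTION_ACCESS_VIOLATION), pointing straight at the errant write.
    unsafe { *p = 42 };
}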

chorman0773 commented 4 months ago

(A plea to "lccc Rust" — at least for anonymous closure/async types, don't stably specify the layout by default, and require annotation of e.g. #[repr(lcccv1)]. Even in the face of dynamic linking, only one compilation unit needs to fully know the layout — the one that defines it. Other compilation units only need the external shape to be predictable, not the monomorphic one.)

As a note, I can think of at least one place where two different translation units may need to come up with the layout of an anonymous type de novo, that being inlining/generics. This is because two (or more) different compilation units can then contain the definition and need to agree in order to be interconvertible. Especially in the generics case, because the capture list can be built partially, but not entirely (by-ref to by-copy lifting requires knowing the monomorphic type; this optimization is present in v0 for types at most the size of a pointer), so the compilers on both sides must at least agree on the same final steps (i.e. on the small-copy capture optimization) - this is true even if the compiler is the exact same build of lccc. The case in question is something like

// (Requires the unstable generic_const_exprs feature for the size_of-dependent array length.)
use core::any::Any;
use core::fmt::Display;

fn mk_sized_array<T>(_: &T) -> [u8; core::mem::size_of::<T>()] {
    [0xFF; core::mem::size_of::<T>()]
}
pub fn foo<T: Copy + Display>(x: T) -> impl Any {
    let f = || println!("{}", {x});
    mk_sized_array(&f)
}

Two different calls to foo::<u32>(5) must be able to come up with the fact that typeof(f) has size 4, otherwise when the comdat groups are collapsed, one call site (presuming no inlining takes place) will have the incorrect size for the return value (the caller could transmute the array to u32, and the compiler could assume that it is fine to promote eax to rax, because the top bits of rax are all zeroes).

Monadic-Cat commented 4 months ago

I think that "erroneous behavior" classification is worth exploring more. It seems nice to have a way to forbid things in a way that the compiler can insert checks for, instead of the "I assume it never happens and generate code that way" nasal demons of UB. Like, "this is forbidden and I'm going to check it when I can" sounds way better for things where the reason it's forbidden is to make code more intelligible, instead of more optimizable.

What would the performance impact be of placing fully initialized let places in read-only memory, anyway? Maybe this can be done by default in debug mode.

chorman0773 commented 4 months ago

What would the performance impact be of placing fully initialized let places in read-only memory, anyway? Maybe this can be done by default in debug mode.

Note that it would specifically have to be put into a static - there's no (easy or cheap) way to just make part of the stack read only. I think the AAAA solution tells us that it's fine for objects to share an address, as long as they don't interfere with each other, so there wouldn't be an issue with promoting an immutable let binding with a constant initializer to a static in theory.
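Conceptually, the promotion being described would look something like this (a sketch only; the function names are made up and no compiler is claimed to do this today):

fn before(observe: impl Fn(*const i32)) {
    let x = 42i32;                  // immutable binding, constant initializer
    observe(std::ptr::addr_of!(x)); // today: `x` lives in writable stack memory
}

fn after(observe: impl Fn(*const i32)) {
    static PROMOTED_X: i32 = 42;    // the discussed promotion: the binding's storage
                                    // becomes a read-only static, so an errant write
                                    // faults instead of silently landing
    observe(std::ptr::addr_of!(PROMOTED_X));
}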

zachs18 commented 3 months ago

Since I don't see it mentioned yet, the Rust Reference currently documents that "Mutating immutable bytes" is UB, and that

The bytes owned by an immutable binding are immutable, unless those bytes are part of an UnsafeCell<U>.

Note however that the Reference is explicitly non-normative and subject to change.