Open ecstatic-morse opened 2 years ago
cc @rust-lang/wg-unsafe-code-guidelines
However, we should avoid depending on the illegality of such code inside the compiler
Why that?
I agree that the specification should be ready to say what programs like this behave like, at least in post-drop-elaboration MIR (and the spec in my head indeed allows this code, just like the equivalent code using MaybeUninit
). OTOH it seems like a waste of time to describe how drop elaboration behaves on code like this.
I think it is perfectly fine to disallow code that tries to handle uninit memory without MaybeUninit
. I don't see a good motivation for allowing such code.
a more complex example arose while discussing region inference
This seems to be the motivating factor here. Is this an example that the compiler accepts that raises similar questions? If yes, could you describe that example (or a simplified version of it)?
Is this an example that the compiler accepts that raises similar questions?
Not that I can think of. This type of code was UB prior to &raw
AFAIK. For now, you cannot access a variable that has been declared but not defined, even to take its address. My question is: can we rely on this going forward? You seem conflicted:
the spec in my head indeed allows this code, just like the equivalent code using
MaybeUninit
I think it is perfectly fine to disallow code that tries to handle uninit memory without
MaybeUninit
.
Both the borrow checker and Polonius use variable liveness in some cases to determine whether an outlives relationship between two regions will result in an error. However, in the following example, a borrow of a local variable is associated with a placeholder lifetime via another local, x
, that is never live.
fn foo<'a>(_: &'a i32) -> &'a i32 {
let y = 42;
let x: &'a i32;
let px = std::ptr::addr_of_mut!(x);
unsafe {
std::ptr::write(px, &y);
std::ptr::read(px)
}
}
This code is not UB, and unsafe
does not turn off the borrow checker, so I think this code should not compile. However, for the borrow checker to emit an error here, we need to ignore variable liveness when considering outlives relationships arising from type ascriptions on those variables. rustc
does this today using Locations::All
, but we'd like to remove it (see #50938 and rust-lang/polonius#24), since it's inefficient (especially in Poloinus) and prone to misuse (it's easier to pass Locations::All
than to remember the locations where an OutlivesConstraint
need apply). I don't think we can unless we handle the above case explicitly, though.
Alternatively, we could decide that it's illegal to take the address of an uninitialized local (or to write to that address). That way, we don't have to handle cases like this in the borrow checker. It seems we have already done this, but I doubt it was a conscious decision, hence this issue.
We could also decide that this particular use of unsafe
bypasses the borrow checker (there are much easier ways to materialize a lifetime from scratch), but it would be nice to get some general guidance here.
Let's say you do
fn main() {
let mut x: Vec<u32>;
unsafe {
std::ptr::write(std::ptr::addr_of_mut!(x), vec![0]);
}
x = vec[1];
}
Should the drop impl of x
run when assigning it at x = vec![1]
or should it leak? In case of the former, that would make std::ptr::addr_of_mut!()
without writing to it UB, if the later, that would be pretty prone to leaking.
@bjorn3, that's true, but it's a problem we already have today. Consider the following:
fn main() {
let mut x = vec![0u32];
let px = &mut x as *mut _;
let y = x;
// unsafe { *px = vec![1]; } // Would drop old value in `x` that has been moved into `y`, see below.
unsafe { std::ptr::write(px, vec![1]) }
// `vec![1]` will leak, since `x` is not initialized according to `rustc`.
}
I would have expected that px
would be invalidated when x
got moved, but instead miri reports:
error: Undefined Behavior: pointer to alloc1131 was dereferenced after this allocation got freed
--> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:105:14
|
105 | unsafe { __rust_dealloc(ptr, layout.size(), layout.align()) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pointer to alloc1131 was dereferenced after this allocation got freed
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `std::alloc::dealloc` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:105:14
= note: inside `<std::alloc::Global as std::alloc::Allocator>::deallocate` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:242:22
= note: inside `<alloc::raw_vec::RawVec<u32> as std::ops::Drop>::drop` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/raw_vec.rs:523:22
= note: inside `std::ptr::drop_in_place::<alloc::raw_vec::RawVec<u32>> - shim(Some(alloc::raw_vec::RawVec<u32>))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:188:1
= note: inside `std::ptr::drop_in_place::<std::vec::Vec<u32>> - shim(Some(std::vec::Vec<u32>))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:188:1
note: inside `main` at src/main.rs:9:1
--> src/main.rs:9:1
|
9 | }
| ^
Oh, *px = ...
requires that *px
is initialized. I wanted ptr::write
.
Yeah, does stacked borrows consider moves?
Not sure.
This code is not UB, and unsafe does not turn off the borrow checker, so I think this code should not compile.
Are both of these 'not' intentional? I don't quite understand the causal chain here.
But without a let mut x
this certainly seems dubious.
However, for the borrow checker to emit an error here, we need to ignore variable liveness when considering outlives relationships arising from type ascriptions on those variables. rustc does this today using Locations::All, but we'd like to remove it (see #50938 and rust-lang/polonius#24), since it's inefficient (especially in Poloinus) and prone to misuse (it's easier to pass Locations::All than to remember the locations where an OutlivesConstraint need apply). I don't think we can unless we handle the above case explicitly, though.
I think it's fine for the borrow checker to reject such code even if executing that code might or might not be UB.
Whether the code is UB or not interacts not only with drop elaboration but also with the StorageLive
annotations -- a variable that is never initialized might not even ever be actually allocated on the stack, so it doesn't have a well-defined address. I think I'd prefer to dodge that question by making it impossible to even write such programs.
Yeah, does stacked borrows consider moves?
No. But we might want to consider a move as 'writing uninit to the source of the move', which should then be considered a write for Stacked Borrows as well, which would clearly invalidate px
here. Cc https://github.com/rust-lang/unsafe-code-guidelines/issues/188
The following code does not compile on current nightly or stable (1.56.1).
I would expect this to work, since part of the rationale for the
addr_of
macro and raw reference operator (#64490) was to allow creation of pointers to uninitialized memory. On the other hand, I don't think the behavior of locals that are only assigned/accessed indirectly is fully specified (though I don't think my example is UB). Such locals will never be dropped, which might be surprising, and they cannot be used directly, even after being written via the pointer, without encountering the same error.MaybeUninit
is basically always a better choice for these kinds of things, so it shouldn't be of much importance to users. However, we should avoid depending on the illegality of such code inside the compiler–a more complex example arose while discussing region inference–unless we have made a conscious decision to forbid it. Alternatively, if interacting with uninitialized locals is well-defined, we should probably makeaddr_of
work on them.