rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.23k stars 12.7k forks source link

Can I create and use a `*mut` to an uninitialized local? #90718

Open ecstatic-morse opened 2 years ago

ecstatic-morse commented 2 years ago

The following code does not compile on current nightly or stable (1.56.1).

fn main() {
    let mut x: i32;
    println!("{:p}", std::ptr::addr_of_mut!(x));
}
error[E0381]: borrow of possibly-uninitialized variable: `x`
 --> src/main.rs:3:22
  |
3 |     println!("{:p}", std::ptr::addr_of_mut!(x));
  |                      ^^^^^^^^^^^^^^^^^^^^^^^^^ use of possibly-uninitialized `x`
  |
  = note: this error originates in the macro `std::ptr::addr_of_mut` (in Nightly builds, run with -Z macro-backtrace for more info)

I would expect this to work, since part of the rationale for the addr_of macro and raw reference operator (#64490) was to allow creation of pointers to uninitialized memory. On the other hand, I don't think the behavior of locals that are only assigned/accessed indirectly is fully specified (though I don't think my example is UB). Such locals will never be dropped, which might be surprising, and they cannot be used directly, even after being written via the pointer, without encountering the same error.

MaybeUninit is basically always a better choice for these kinds of things, so it shouldn't be of much importance to users. However, we should avoid depending on the illegality of such code inside the compiler–a more complex example arose while discussing region inference–unless we have made a conscious decision to forbid it. Alternatively, if interacting with uninitialized locals is well-defined, we should probably make addr_of work on them.

ecstatic-morse commented 2 years ago

cc @rust-lang/wg-unsafe-code-guidelines

RalfJung commented 2 years ago

However, we should avoid depending on the illegality of such code inside the compiler

Why that? I agree that the specification should be ready to say what programs like this behave like, at least in post-drop-elaboration MIR (and the spec in my head indeed allows this code, just like the equivalent code using MaybeUninit). OTOH it seems like a waste of time to describe how drop elaboration behaves on code like this.

I think it is perfectly fine to disallow code that tries to handle uninit memory without MaybeUninit. I don't see a good motivation for allowing such code.

a more complex example arose while discussing region inference

This seems to be the motivating factor here. Is this an example that the compiler accepts that raises similar questions? If yes, could you describe that example (or a simplified version of it)?

ecstatic-morse commented 2 years ago

Is this an example that the compiler accepts that raises similar questions?

Not that I can think of. This type of code was UB prior to &raw AFAIK. For now, you cannot access a variable that has been declared but not defined, even to take its address. My question is: can we rely on this going forward? You seem conflicted:

the spec in my head indeed allows this code, just like the equivalent code using MaybeUninit

I think it is perfectly fine to disallow code that tries to handle uninit memory without MaybeUninit.

Both the borrow checker and Polonius use variable liveness in some cases to determine whether an outlives relationship between two regions will result in an error. However, in the following example, a borrow of a local variable is associated with a placeholder lifetime via another local, x, that is never live.

fn foo<'a>(_: &'a i32) -> &'a i32 {
    let y = 42;
    let x: &'a i32;

    let px = std::ptr::addr_of_mut!(x);
    unsafe {
        std::ptr::write(px, &y);
        std::ptr::read(px)
    }
}

This code is not UB, and unsafe does not turn off the borrow checker, so I think this code should not compile. However, for the borrow checker to emit an error here, we need to ignore variable liveness when considering outlives relationships arising from type ascriptions on those variables. rustc does this today using Locations::All, but we'd like to remove it (see #50938 and rust-lang/polonius#24), since it's inefficient (especially in Poloinus) and prone to misuse (it's easier to pass Locations::All than to remember the locations where an OutlivesConstraint need apply). I don't think we can unless we handle the above case explicitly, though.

Alternatively, we could decide that it's illegal to take the address of an uninitialized local (or to write to that address). That way, we don't have to handle cases like this in the borrow checker. It seems we have already done this, but I doubt it was a conscious decision, hence this issue.

We could also decide that this particular use of unsafe bypasses the borrow checker (there are much easier ways to materialize a lifetime from scratch), but it would be nice to get some general guidance here.

bjorn3 commented 2 years ago

Let's say you do

fn main() {
    let mut x: Vec<u32>;
    unsafe {
        std::ptr::write(std::ptr::addr_of_mut!(x), vec![0]);
    }
    x = vec[1];
}

Should the drop impl of x run when assigning it at x = vec![1] or should it leak? In case of the former, that would make std::ptr::addr_of_mut!() without writing to it UB, if the later, that would be pretty prone to leaking.

ecstatic-morse commented 2 years ago

@bjorn3, that's true, but it's a problem we already have today. Consider the following:

fn main() {
    let mut x = vec![0u32];
    let px = &mut x as *mut _;

    let y = x;

    // unsafe { *px = vec![1]; } // Would drop old value in `x` that has been moved into `y`, see below.
    unsafe { std::ptr::write(px, vec![1]) }

   // `vec![1]` will leak, since `x` is not initialized according to `rustc`.
}
bjorn3 commented 2 years ago

I would have expected that px would be invalidated when x got moved, but instead miri reports:

error: Undefined Behavior: pointer to alloc1131 was dereferenced after this allocation got freed
   --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:105:14
    |
105 |     unsafe { __rust_dealloc(ptr, layout.size(), layout.align()) }
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pointer to alloc1131 was dereferenced after this allocation got freed
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information

    = note: inside `std::alloc::dealloc` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:105:14
    = note: inside `<std::alloc::Global as std::alloc::Allocator>::deallocate` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/alloc.rs:242:22
    = note: inside `<alloc::raw_vec::RawVec<u32> as std::ops::Drop>::drop` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/raw_vec.rs:523:22
    = note: inside `std::ptr::drop_in_place::<alloc::raw_vec::RawVec<u32>> - shim(Some(alloc::raw_vec::RawVec<u32>))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:188:1
    = note: inside `std::ptr::drop_in_place::<std::vec::Vec<u32>> - shim(Some(std::vec::Vec<u32>))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:188:1
note: inside `main` at src/main.rs:9:1
   --> src/main.rs:9:1
    |
9   | }
    | ^
ecstatic-morse commented 2 years ago

Oh, *px = ... requires that *px is initialized. I wanted ptr::write.

Yeah, does stacked borrows consider moves?

bjorn3 commented 2 years ago

Not sure.

RalfJung commented 2 years ago

This code is not UB, and unsafe does not turn off the borrow checker, so I think this code should not compile.

Are both of these 'not' intentional? I don't quite understand the causal chain here.

But without a let mut x this certainly seems dubious.

However, for the borrow checker to emit an error here, we need to ignore variable liveness when considering outlives relationships arising from type ascriptions on those variables. rustc does this today using Locations::All, but we'd like to remove it (see #50938 and rust-lang/polonius#24), since it's inefficient (especially in Poloinus) and prone to misuse (it's easier to pass Locations::All than to remember the locations where an OutlivesConstraint need apply). I don't think we can unless we handle the above case explicitly, though.

I think it's fine for the borrow checker to reject such code even if executing that code might or might not be UB.

Whether the code is UB or not interacts not only with drop elaboration but also with the StorageLive annotations -- a variable that is never initialized might not even ever be actually allocated on the stack, so it doesn't have a well-defined address. I think I'd prefer to dodge that question by making it impossible to even write such programs.

Yeah, does stacked borrows consider moves?

No. But we might want to consider a move as 'writing uninit to the source of the move', which should then be considered a write for Stacked Borrows as well, which would clearly invalidate px here. Cc https://github.com/rust-lang/unsafe-code-guidelines/issues/188