Closed saethlin closed 1 year ago
Concretely: continue to use as-casts for now instead of trying to use the provenance of NULL.
This code pattern is UB:
Background: Provenance is shadow state of pointers (in CHERI and Miri it actually exists at runtime), which is normally the identity of the allocation which a pointer/reference points to when it is created, but probably more. Dereferencing a pointer when its address points to memory which is outside of the region specified by its provenance is UB.
The pointer returned by
core::ptr::null_mut()
has no provenance. This code then useswith_addr
, which is a specially-crafted API which manipulates the address of a pointer without changing its provenance. So the pointer which comes out of here now has a non-zero address, but it is not valid to do any reads or writes through this pointer, or to turn it into a reference by any means. The basic validity requirements of a reference include that their address fall within their provenance (as well as being non-null and aligned).So the pattern of doing
NULL.with_addr(self.addr())
is technically fine by itself, but a bit odd. But then this code dereferences the pointer to reborrow it into a mutable reference. And there we have UB.This is particularly concerning, because while LLVM's understanding of provenance is a bit... unclear, it is pretty trigger-happy with null pointers, probably on account of how much UB surrounds them in C: rust-lang/rust#96538 I don't want to spread FUD, but this is just one of those areas where I really am not comfortable with the hand-wave I've seen from some people that "it's UB according to Stacked Borrows but LLVM won't exploit that".
So, some suggestions: If you want a pointer with some particular address value which you are never going to dereference, use
ptr::invalid{_mut}
. The hope of this API is that the name reminds you that this pointer is not okay to dereference. If you have special knowledge of some address which Rust did not allocate but which is perfectly valid to access on the machine, useptr::from_exposed_addr{_mut}
1. If you do not havestrict_provenance
available (because you're on stable) use anas
cast from integer to pointer type (orsptr
) 2.#![feature(strict_provenance)] fn main() { const NULL: *const usize = core::ptr::null(); let a = 0usize; let ptr = &a as *const usize; unsafe { let _x = &*NULL.with_addr(ptr.addr()); } }
error: Undefined Behavior: dereferencing pointer failed: 0x276f0[noalloc] is a dangling pointer (it has no provenance) --> demo.rs:7:18 | 7 | let _x = &*NULL.with_addr(ptr.addr()); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dereferencing pointer failed: 0x276f0[noalloc] is a dangling pointer (it has no provenance) | = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information = note: BACKTRACE: = note: inside `main` at demo.rs:7:18
Footnotes
Thank you very much for the detailed issue report; this is extremely helpful.
phbl does make use of strict provenance, and I run the unit tests under miri in CI, but there probably isn't a test that covers this case (unit testing MMIO is hard).
We've used this pattern a few places here, simply for wont of something better, but I've removed the other uses: for newly-mapped entries in the loader, I can simply ask the page table for its provenance (and it can even validate that it maps the requested address! I haven't quite done that yet, but it's easy code to write), but the UART code is harder since we do device initialization using the minimal virtual memory mappings created by the assembly bootstrap before we've reinitialized the address space. That is, I don't even have an address space to ask yet. :-/
I'm curious to see how the docs for from_exposed_addr
evolve; the requirement to have previously emitted some provenance via expose_addr
makes this difficult, as it's not clear what we could expose or how before we get access to a pointer to expose. In the UART code, even more so. However, it occurs to me that I can provide a symbol from the linker that I can use to get a pointer to provide provenance for the MMIO region, which I've now done.
Thanks again! This was super useful/instructive.
I believe that https://github.com/oxidecomputer/phbl/pull/23 may address the issues raised here. @saethlin I'd be grateful if you would have a look!
This code pattern is UB: https://github.com/oxidecomputer/phbl/blob/b568432e2e826711ef577c01066d43667ae56b3f/src/uart.rs#L444-L445
Background: Provenance is shadow state of pointers (in CHERI and Miri it actually exists at runtime), which is normally the identity of the allocation which a pointer/reference points to when it is created, but probably more. Dereferencing a pointer when its address points to memory which is outside of the region specified by its provenance is UB.
The pointer returned by
core::ptr::null_mut()
has no provenance. This code then useswith_addr
, which is a specially-crafted API which manipulates the address of a pointer without changing its provenance. So the pointer which comes out of here now has a non-zero address, but it is not valid to do any reads or writes through this pointer, or to turn it into a reference by any means. The basic validity requirements of a reference include that their address fall within their provenance (as well as being non-null and aligned).So the pattern of doing
NULL.with_addr(self.addr())
is technically fine by itself, but a bit odd. But then this code dereferences the pointer to reborrow it into a mutable reference. And there we have UB.This is particularly concerning, because while LLVM's understanding of provenance is a bit... unclear, it is pretty trigger-happy with null pointers, probably on account of how much UB surrounds them in C: https://github.com/rust-lang/rust/pull/96538 I don't want to spread FUD, but this is just one of those areas where I really am not comfortable with the hand-wave I've seen from some people that "it's UB according to Stacked Borrows but LLVM won't exploit that".
So, some suggestions: If you want a pointer with some particular address value which you are never going to dereference, use
ptr::invalid{_mut}
. The hope of this API is that the name reminds you that this pointer is not okay to dereference. If you have special knowledge of some address which Rust did not allocate but which is perfectly valid to access on the machine, useptr::from_exposed_addr{_mut}
[^1]. If you do not havestrict_provenance
available (because you're on stable) use anas
cast from integer to pointer type (orsptr
) [^2].[^1]: I'm aware the docs don't exactly back this up, and realizing that we should probably improve them now... [^2]: Miri will be confused by MMIO addresses no matter what you do, but I have a plan to at least give you a way to tell Miri about MMIO memory.