oxidecomputer / phbl

Pico Host Boot Loader
Mozilla Public License 2.0
98 stars 7 forks source link

ptr::null{_mut} + with_addr + dereference -> UB #22

Closed saethlin closed 1 year ago

saethlin commented 1 year ago

This code pattern is UB: https://github.com/oxidecomputer/phbl/blob/b568432e2e826711ef577c01066d43667ae56b3f/src/uart.rs#L444-L445

Background: Provenance is shadow state of pointers (in CHERI and Miri it actually exists at runtime), which is normally the identity of the allocation which a pointer/reference points to when it is created, but probably more. Dereferencing a pointer when its address points to memory which is outside of the region specified by its provenance is UB.

The pointer returned by core::ptr::null_mut() has no provenance. This code then uses with_addr, which is a specially-crafted API which manipulates the address of a pointer without changing its provenance. So the pointer which comes out of here now has a non-zero address, but it is not valid to do any reads or writes through this pointer, or to turn it into a reference by any means. The basic validity requirements of a reference include that their address fall within their provenance (as well as being non-null and aligned).

So the pattern of doing NULL.with_addr(self.addr()) is technically fine by itself, but a bit odd. But then this code dereferences the pointer to reborrow it into a mutable reference. And there we have UB.

This is particularly concerning, because while LLVM's understanding of provenance is a bit... unclear, it is pretty trigger-happy with null pointers, probably on account of how much UB surrounds them in C: https://github.com/rust-lang/rust/pull/96538 I don't want to spread FUD, but this is just one of those areas where I really am not comfortable with the hand-wave I've seen from some people that "it's UB according to Stacked Borrows but LLVM won't exploit that".

So, some suggestions: If you want a pointer with some particular address value which you are never going to dereference, use ptr::invalid{_mut}. The hope of this API is that the name reminds you that this pointer is not okay to dereference. If you have special knowledge of some address which Rust did not allocate but which is perfectly valid to access on the machine, use ptr::from_exposed_addr{_mut}[^1]. If you do not have strict_provenance available (because you're on stable) use an as cast from integer to pointer type (or sptr) [^2].

[^1]: I'm aware the docs don't exactly back this up, and realizing that we should probably improve them now... [^2]: Miri will be confused by MMIO addresses no matter what you do, but I have a plan to at least give you a way to tell Miri about MMIO memory.


#![feature(strict_provenance)]
fn main() {
    const NULL: *const usize = core::ptr::null();
    let a = 0usize;
    let ptr = &a as *const usize;
    unsafe {
        let _x = &*NULL.with_addr(ptr.addr());
    }
}
error: Undefined Behavior: dereferencing pointer failed: 0x276f0[noalloc] is a dangling pointer (it has no provenance)
 --> demo.rs:7:18
  |
7 |         let _x = &*NULL.with_addr(ptr.addr());
  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dereferencing pointer failed: 0x276f0[noalloc] is a dangling pointer (it has no provenance)
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
  = note: BACKTRACE:
  = note: inside `main` at demo.rs:7:18
riking commented 1 year ago

Concretely: continue to use as-casts for now instead of trying to use the provenance of NULL.

dancrossnyc commented 1 year ago

This code pattern is UB:

https://github.com/oxidecomputer/phbl/blob/b568432e2e826711ef577c01066d43667ae56b3f/src/uart.rs#L444-L445

Background: Provenance is shadow state of pointers (in CHERI and Miri it actually exists at runtime), which is normally the identity of the allocation which a pointer/reference points to when it is created, but probably more. Dereferencing a pointer when its address points to memory which is outside of the region specified by its provenance is UB.

The pointer returned by core::ptr::null_mut() has no provenance. This code then uses with_addr, which is a specially-crafted API which manipulates the address of a pointer without changing its provenance. So the pointer which comes out of here now has a non-zero address, but it is not valid to do any reads or writes through this pointer, or to turn it into a reference by any means. The basic validity requirements of a reference include that their address fall within their provenance (as well as being non-null and aligned).

So the pattern of doing NULL.with_addr(self.addr()) is technically fine by itself, but a bit odd. But then this code dereferences the pointer to reborrow it into a mutable reference. And there we have UB.

This is particularly concerning, because while LLVM's understanding of provenance is a bit... unclear, it is pretty trigger-happy with null pointers, probably on account of how much UB surrounds them in C: rust-lang/rust#96538 I don't want to spread FUD, but this is just one of those areas where I really am not comfortable with the hand-wave I've seen from some people that "it's UB according to Stacked Borrows but LLVM won't exploit that".

So, some suggestions: If you want a pointer with some particular address value which you are never going to dereference, use ptr::invalid{_mut}. The hope of this API is that the name reminds you that this pointer is not okay to dereference. If you have special knowledge of some address which Rust did not allocate but which is perfectly valid to access on the machine, use ptr::from_exposed_addr{_mut}1. If you do not have strict_provenance available (because you're on stable) use an as cast from integer to pointer type (or sptr) 2.

#![feature(strict_provenance)]
fn main() {
    const NULL: *const usize = core::ptr::null();
    let a = 0usize;
    let ptr = &a as *const usize;
    unsafe {
        let _x = &*NULL.with_addr(ptr.addr());
    }
}
error: Undefined Behavior: dereferencing pointer failed: 0x276f0[noalloc] is a dangling pointer (it has no provenance)
 --> demo.rs:7:18
  |
7 |         let _x = &*NULL.with_addr(ptr.addr());
  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dereferencing pointer failed: 0x276f0[noalloc] is a dangling pointer (it has no provenance)
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
  = note: BACKTRACE:
  = note: inside `main` at demo.rs:7:18

Footnotes

  1. I'm aware the docs don't exactly back this up, and realizing that we should probably improve them now...
  2. Miri will be confused by MMIO addresses no matter what you do, but I have a plan to at least give you a way to tell Miri about MMIO memory.

Thank you very much for the detailed issue report; this is extremely helpful.

phbl does make use of strict provenance, and I run the unit tests under miri in CI, but there probably isn't a test that covers this case (unit testing MMIO is hard).

We've used this pattern a few places here, simply for wont of something better, but I've removed the other uses: for newly-mapped entries in the loader, I can simply ask the page table for its provenance (and it can even validate that it maps the requested address! I haven't quite done that yet, but it's easy code to write), but the UART code is harder since we do device initialization using the minimal virtual memory mappings created by the assembly bootstrap before we've reinitialized the address space. That is, I don't even have an address space to ask yet. :-/

I'm curious to see how the docs for from_exposed_addr evolve; the requirement to have previously emitted some provenance via expose_addr makes this difficult, as it's not clear what we could expose or how before we get access to a pointer to expose. In the UART code, even more so. However, it occurs to me that I can provide a symbol from the linker that I can use to get a pointer to provide provenance for the MMIO region, which I've now done.

Thanks again! This was super useful/instructive.

dancrossnyc commented 1 year ago

I believe that https://github.com/oxidecomputer/phbl/pull/23 may address the issues raised here. @saethlin I'd be grateful if you would have a look!