rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0
658 stars 57 forks source link

Stacked Borrows does not support &UnsafeCell pointing to read-only data #303

Open oconnor663 opened 3 years ago

oconnor663 commented 3 years ago

I'm playing with the idea of a ReadOnlyCell<T> type, like a Cell<T> that only supports reading. The (kind of academic) goal of this type is to allow conversions from either &T or &Cell<T> into &ReadOnlyCell<T>, similar to how Cell::from_mut converts from &mut T to &Cell<T>. Here's the example code (playground link):

use std::cell::{Cell, UnsafeCell};

#[repr(transparent)]
pub struct ReadOnlyCell<T>(UnsafeCell<T>);

impl<T> ReadOnlyCell<T> {
    pub fn new(val: T) -> Self {
        Self(UnsafeCell::new(val))
    }

    pub fn from_ref(t: &T) -> &Self {
        // SAFETY: &ReadOnlyCell<T> is strictly less capable than &T
        unsafe { &*(t as *const T as *const Self) }
    }

    pub fn from_cell_ref(t: &Cell<T>) -> &Self {
        // SAFETY: &ReadOnlyCell<T> is strictly less capable than &Cell<T>
        unsafe { &*(t as *const Cell<T> as *const Self) }
    }
}

impl<T: Copy> ReadOnlyCell<T> {
    pub fn get(&self) -> T {
        unsafe { *self.0.get() }
    }
}

fn main() {
    let my_int = &42;
    let read_only_cell = ReadOnlyCell::from_ref(my_int);
    assert_eq!(read_only_cell.get(), 42);
}

This fails Miri:

error: Undefined Behavior: trying to reborrow for SharedReadWrite at alloc45, but parent tag <untagged> does not have an appropriate item in the borrow stack
  --> src/main.rs:13:18
   |
13 |         unsafe { &*(t as *const T as *const Self) }
   |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadWrite at alloc45, but parent tag <untagged> does not have an appropriate item in the borrow stack
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information

   = note: inside `ReadOnlyCell::<i32>::from_ref` at src/main.rs:13:18
note: inside `main` at src/main.rs:30:26
  --> src/main.rs:30:26
   |
30 |     let read_only_cell = ReadOnlyCell::from_ref(my_int);
   |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It seems like Miri is telling me that converting &T to &UnsafeCell<T> is inherently UB. That's surprising to me. Obviously writing to that UnsafeCell would be UB. But here all I think I'm doing is telling the compiler not to treat the pointer as noalias, which is sort of like converting &T to *const T, which is legal. Am I thinking about this the right way?

RalfJung commented 3 years ago

Hm, yeah that seems like a fair point. Current Stacked Borrows cannot really handle this case -- it considers just the type of the pointer (roughly &UnsafeCell in this case) to compute the required permission, but cannot take into account the previous permission of the pointer.

This is a fundamental problem with Stacked Borrows, not its implementation in Miri, so I moved the issue. Thanks for bringing this up!

RalfJung commented 3 years ago

This is related to https://github.com/rust-lang/unsafe-code-guidelines/issues/227 and possibly https://github.com/rust-lang/unsafe-code-guidelines/issues/133.

hudson-ayers commented 2 years ago

As mentioned in #314 , we have also run into this issue in Tock. Our original solution was basically casting &[T] to &[UnsafeCell<T>], but this causes miri to complain.

One suggested alternative is to transmute the initialized memory we recieve to MaybeUninit<T> instead, since MaybeUninit<T> does not allow the compiler to assume multiple reads of the underlying type will be fixed, and there does not seem to be a requirement (like for UnsafeCell) that the memory being placed inside a MaybeUninit point to writeable data. Miri no longer complains with this change made.

However, this does seem like at-best an abuse of MaybeUninit outside of its intended purpose. Is using MaybeUninit as a sort of ReadOnlyUnsafeCell just hiding our unsoundness from Miri, or does this make sense as a potential solution to the problem described in this issue?

comex commented 2 years ago

One suggested alternative is to transmute the initialized memory we recieve to MaybeUninit<T> instead, since MaybeUninit<T> does not allow the compiler to assume multiple reads of the underlying type will be fixed,

This characterization in terms of not being "fixed" is not accurate. Your link mentioned this comes from the documentation for MaybeUninit, but that documentation is also wrong.

Read this: "What The Hardware Does" is not What Your Program Does: Uninitialized Memory

Strictly speaking, it's not that the bytes can change values when you read them multiple times. On the contrary, from the abstract machine's perspective, reading a byte of uninitialized memory always gives you the same value. It's just that that byte value is not a number from 0 to 255, but instead a special value called "uninitialized". (From LLVM's perspective, this is represented by either undef or poison.) The blog post explains this in more detail.

The only effect of MaybeUninit is that it's not instant UB for it to contain uninitialized bytes. For most types, you get UB as soon as you create a value of that type containing uninitialized bytes, where by "create" I mean, e.g., performing a load of uninitialized memory with that type, or calling assume_init() on a MaybeUninit that isn't actually initialized, etc. In this case, though, your data doesn't(?) actually contain uninitialized bytes, so using MaybeUninit or not shouldn't make any difference.

However, I don't know why Miri treats it differently.

jettr commented 2 years ago

Thank you for the guidance; it is really helpful!

Our miri issue that we are trying to fix is related to transmuting &[u8] into &[Cell<u8>] and miri tells us that we have introduce the possibility of interior mutability where there should be none. The fix is to instead transmute &[u8] into &[MaybeUninit<u8>], which does seem to correctly fix the issue miri is reporting (not just making it difficult for miri to detect).

The question now though is if we are breaking aliasing rules by using MaybeUninit instead of Cell. We also can create a &[MaybeUninit<u8>] or &[Cell<u8>] from a raw pointer of initialized memory that is somewhat outside of our control. The complication comes from the fact that the data in the underlying buffer can be changed externally while we have a immutable reference to it. Currently we are relying on Cell to provide a new read every time we access an element in the buffer, and can we instead rely on MaybeUninit::as_ptr().read() to provide the same level of guarantees?

The data that MaybeUninit holds is never actually uninitialized in our case.

RalfJung commented 2 years ago

The question now though is if we are breaking aliasing rules by using MaybeUninit instead of Cell

Yes, you are. &MaybeUninit is read-only like all shared references without an UnsafeCell.

I don't know why Miri no longer complains after that change; without a small example there are just too many variables here. But adding a MaybeUninit does not help at all when it comes to shared mutation.

Lokathor commented 2 years ago

Cell does not ensure that each read is a fresh read. For that you'd need some sort of volatile based abstraction.

RalfJung commented 2 years ago

When fixing this we have to be careful not to accept this program -- it segfaults at runtime, so we better make sure it does have UB.

RalfJung commented 1 year ago

Tree Borrows fixes this issue.

oconnor663 commented 1 year ago

Very exciting!