tock / tock

A secure embedded operating system for microcontrollers
https://www.tockos.org
Other
5.39k stars 691 forks source link

Unsoundness transmuting &[u8] to &[Cell<u8>] when handling read-only memory #2882

Closed kupiakos closed 1 year ago

kupiakos commented 2 years ago

kernel/src/processbuffer.rs defines impl<'a> From<&'a [u8]> for &'a ReadableProcesSlice which contains only a transmute. However, this is unsound because ReadableProcessByte, of which ReadableProcessSlice is a slice of, is Cell<u8>, enabling interior mutability. Transmuting from a shared read-only reference to a shared read-write reference is UB regardless of whether you write or not, because you can violate the constness invariant of other shared references that may exist at the time with safe code (other than the transmute).

If you add the following to the bottom and run cargo miri test, it will fail:

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_ro_slice_into_readable_process_slice() {
        let data = [10u8; 10];
        let out: &ReadableProcessSlice = (&data[..]).into();
    }
}

The failure:

test processbuffer::test::test_ro_slice_into_readable_process_slice ... error: Undefined Behavior: trying to reborrow for SharedRea
dWrite at alloc96722, but parent tag <241883> does not have an appropriate item in the borrow stack                                
   --> kernel/src/processbuffer.rs:620:18                                                                                          
    |                                                                                                                              
620 |         unsafe { core::mem::transmute(val) }                                                                                 
    |                  ^^^^^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadWrite at alloc96722, but parent tag <241883> does
 not have an appropriate item in the borrow stack                                                                                  
    |                                                                                                                              
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still e
xperimental                                                                                                                        
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information     

Maybe instead of using interior mutability, we can somehow force the index operations to cast to a raw pointer before read, disabling optimizations? Another fix would be to change all use of &[u8] to &[Cell<u8>] directly.

I'd also like to suggest that all uses of unsafe require unit tests running under Miri, regardless of a reviewer scan for soundness.

jettr commented 2 years ago

This miri issue appears to be more related to casting dynamically sized types than it does with casting u8 to Cell<u8>. The way that the process buffer code is written, it does not expose a way for the underlying Cell<u8> to be written when contained within in ReadableProcessSlice so this interface does not expose a way for safe code to invalidate the invariant that the original &[u8] cannot change.

I say this because I changed the unit test to cast a &u8 to &Cell<u8> and even &[u8; 2] to &[Cell<u8>; 2] and miri does not complain anymore. It is only when casting dynamically sized types, do we run into issue reported by miri. Probably because

Currently the only properly supported way to create a custom DST is by making your type generic and performing an unsizing coercion

from above link

jettr commented 2 years ago

This following unit test does not complain about any miri unsoundness:

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_convert() {
        let array: [u8; 10] = [1; 10];
        let cell_array: [Cell<u8>; 10] = unsafe { core::mem::transmute(array) };
        let process_slice: &ReadableProcessSlice = unsafe { core::mem::transmute(&cell_array[..]) };

        assert_eq!(process_slice[0].get(), 1);
    }
}

This transmutes non dynamically sized types (DST) first (i.e. arrays), then converts the DST, namely &[Cell<u8>] to &ReadableProcessSlice

kupiakos commented 2 years ago

@jettr Your test is incorrect; you're creating a copy of the array when you do the transmute. Change it to this and it will complain about unsoundness:

let array: [u8; 10] = [1; 10];
let _cell_array: &[Cell<u8>; 10] = unsafe { core::mem::transmute(&array) };

Transmuting &T to &Cell<T> is UB, for any T: ?Sized.

kupiakos commented 2 years ago

Actually, there is a small exception to that: when T is a ZST, Miri doesn't complain about aliasing for obvious reasons.

jettr commented 2 years ago

Ah, thank you. I missed that. This means that we just really need to make ReadableProcessByte's cell field a real u8 instead of a Cell<u8>. We are actually opening the door for more UB by using Cell<u8> instead of u8. This is making more sense now.

If we update only the following lines, then your original miri example no longer complains:

#[repr(transparent)]
pub struct ReadableProcessByte {
    cell: u8,
}

impl ReadableProcessByte {
    #[inline]
    pub fn get(&self) -> u8 {
        self.cell
    }
}

And if it weren't a major API change, I would suggest just replacing ReadableProcessByte with u8 everywhere in the processbuffer.rs file, but that is probably too large or an API change right now.

kupiakos commented 2 years ago

We are actually opening the door for more UB by using Cell<u8> instead of u8

Essentially. You can likely get similar optimizer behavior if your indexing function uses raw pointer functions, though I'd want to go over that with a fine-tooth comb.

lschuermann commented 2 years ago

Thanks @kupiakos and @jettr. Sorry for responding this late. I wanted to look into the issue in depth first, because this is a complex interface and we already did some iterations on it during development.

This means that we just really need to make ReadableProcessByte's cell field a real u8 instead of a Cell<u8>. We are actually opening the door for more UB by using Cell<u8> instead of u8. This is making more sense now.

I don't think that the proposed solution is going to work. Making ReadableProcessByte a wrapper around Cell<u8> over u8 was a deliberate decision, because a ReadableProcessByte can be created either from a ReadableProcessBuffer or a WriteableProcessBuffer, and with the changes of #2797 even from regular [u8] slices. Now, the reason we use Cell<u8> is that we can have interior mutability over potentially aliased memory. Userspace processes are allowed to give the kernel access to overlapping memory regions as part of different allows. If we were to change ReadableProcessByte to be a u8, we could run into the following issue:

  1. Process shares a memory region mutably to allow slot A (represented as a ReadWriteProcessBuffer).
  2. Process shares the same memory region immutably to allow slot B (represented as a ReadOnlyProcessBuffer).
  3. Capsule accesses memory in the allow A through WriteableProcessBuffer::mut_enter and subslices into a &Cell<u8> reference (a).
  4. Capsule accesses memory in the allow B through ReadableProcessBuffer::enter and subslices into a &u8 reference (b).
  5. Capsule uses the &Cell<u8> reference a to change the memory contents behind the reference b of type &u8.
  6. ~Profit.~ ...and we've got ourselves unsound behavior. :)

The proposed solution would only work for the case where the memory cannot be an alias of some other ProcessByte or ProcessSlice, namely when stemming from a borrow of a &[u8] slice of kernel memory.

Now, I'm still looking into potential solutions and whether this is an issue at all. We're currently limiting the API surface of Cell through ReadableProcessByte such that the inner values can't be mutated. I'm not entirely sold on the fact that what we're doing here is wrong, but I do admit that it's not particularly elegant to begin with. Fundamentally we're not interested in the API which Cell offers us, but in the properties which wrapping a value inside of the UnsafeCell type gives us. To cite the Rust documentation on UnsafeCell:

If you have a reference &T, then normally in Rust the compiler performs optimizations based on the knowledge that &T points to immutable data. Mutating that data, for example through an alias or by transmuting an &T into an &mut T, is considered undefined behavior. UnsafeCell opts-out of the immutability guarantee for &T: a shared reference &UnsafeCell may point to data that is being mutated. This is called “interior mutability”.

Which is exactly what we want. Although I'm not entirely sure whether using UnsafeCell will solve the issue here. It would force us to use unsafe code to retrieve the values, which might in this context actually make sense. It'd mean that there is no "safe" way to accidentally mutate the values behind this reference. There appears to be no UnsafeCell equivalent for when you want to have interior mutability but only perform read accesses on this particular reference.

You can likely get similar optimizer behavior if your indexing function uses raw pointer functions

This is also slightly problematic given that our type representing this memory, either pointing to user-space or kernel-space, needs to be #[repr(transparent)] over a u8 or have the exact same size and alignment constraints. This is because we use it transparently through (slice-)references over the particular memory section. We can't simply introduce our own types containing information about the address and length of the memory region we're accessing, because that's fundamentally incompatible with the Index traits returning references.

jettr commented 2 years ago

Ah, thank you Leon for that in depth response/discussion! You bring up a good point why we can't use u8 directly since we would be breaking the aliasing rules. I tried using UnsafeCell instead of Cell but we get the same miri error. However, when I used MaybeUninit<u8>, I don't get the miri error and we aren't breaking any aliasing rules. I think using MaybeUninit<u8> gives us the best of both worlds. I have already updated #2884 to use MaybeUninit<u8>.

lschuermann commented 2 years ago

However, when I used MaybeUninit<u8>, I don't get the miri error and we aren't breaking any aliasing rules.

Do you have any reference stating that MaybeUninit allows aliasing of the underlying memory? Or how did you end up with MaybeUninit? I can't make the connection here. :smile:

jettr commented 2 years ago

Well MaybeUninit tells rust that the memory might not be initialized, so it can't perform many optimizations. Also we access it though unsafe raw pointer reads which won't get optimized away.

Moreover, uninitialized memory is special in that it does not have a fixed value (“fixed” meaning “it won’t change without being written to”). Reading the same uninitialized byte multiple times can give different results. This makes it undefined behavior to have uninitialized data in a variable even if that variable has an integer type, which otherwise can hold any fixed bit pattern

from MaybeUninit reference leads me to beleive that the rust compiler does not treat &[MaybeUninit<u8>] any where close to the same as &[u8] in terms optimization for alias rules. MaybeUninit memory is kind of off limits until it has been marked as initialized, which is pretty much the behavior we want.

lschuermann commented 2 years ago

@jettr Thanks, this reasoning makes sense. For me it's important to validate two things:

I'll try and ask some Rust folks for the first so that we have confirmation on this issue and aren't just exploiting some current optimization behavior, whereas I can offer to compare the ASM for the latter. Won't be today though. :smile:

kupiakos commented 2 years ago

I'm not convinced that MaybeUninit has the behavior you describe: the message there discusses that uninitialized memory invokes undefined behavior when read, as a warning on the behavior of assume_init, not that the compiler can't assume a &MaybeUninit<T> is read-only. Memory layout is orthogonal to aliasing concerns.

I've also got a more exotic method for solving: ZST slices. If you pass around references of slices to ZSTs, that drops all alias analysis since ZSTs can't overlap. Then you can expose your read methods as raw pointer operations. It should give extremely similar performance behavior as using raw pointers (what you actually want here), and won't make Miri complain. I've got a short example of what that might look like here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d28992a6a86e9d2f0c85e333b5fc0361.

lschuermann commented 2 years ago

@kupiakos The core mechanism you describe sounds like you have a slice of some zero sized types, use that as a base pointer, cast it to a u8 pointer, go to some offset, just to cast that into a reference to a ZST again. While I agree this should work in theory, it's really weird because it goes against the usual behavior for an array of ZSTs, which will cause the whole array to be zero-length regardless of the number of elements and have every element be at the base address. Thus for any index other than zero, by taking a pointer offset w.r.t. the u8 base type, you're transmuting a reference to a ZST out of thin air, where there logically wouldn't be one at that address. I'm aware that with ZSTs this doesn't always matter, but it's far from what I'd see as elegant. Checking with the rules on pointer safety it appears that this may be sound: "However, casting any non-zero integer literal to a pointer is valid for zero-sized accesses, even if some memory happens to exist at that address and gets deallocated". Other rules in this document might suggest otherwise.

I would see this as a last-resort type of option. It may work, but even if it is sound, it (ab?)uses references in a very unintuitive way which might be hard to grasp for anyone dealing with this code. It's also significantly more complex to implement (at least compared to the current, simple design) and not as trivial to verify as correct.

kupiakos commented 2 years ago

@lschuermann there is a reason I said it was exotic 😉 Fortunately, with your current API, all of the complexity should be limited to the one file. I'll send a PR so you can see what it looks like in practice at least.

I honestly don't see any good option here. It is always UB to go from &u8 to &Cell<u8>, and that's what you need to do right now, and MaybeUninit won't give you the property you're looking for. There are only three ways I know how to prevent Rust from assuming shared memory contents are stable across reads:

You can't use interior mutability, because you need to accept a &[u8]. So, you need to use raw pointers somehow. You could pass around a *const ReadableProcessByte, but raw pointers are awful to work with. Since you're always using shared mutability, you could pass around a ReadableProcessSliceRef that contains the raw pointer instead of a &ReadableProcessSlice, but that would break Index.

There's also another option: don't allow constructing a &ReadableProcessSlice from a &[u8]. That would probably not be worth it for the trouble it causes.

lschuermann commented 2 years ago

There's also another option: don't allow constructing a &ReadableProcessSlice from a &[u8]. That would probably not be worth it for the trouble it causes.

If this would properly solve the issue at hand, I wouldn't be too hesitant of this option. AFAIK #2797 has not seen any use in upstream and was introduced because there is a need downstream. As discussed on the PR itself, creating a &ReadableProcessSlice from a &[u8] also has a somewhat limited use in Tock's asynchronous kernel. It might be easier to instead use a composite type such as the following (naming TBD)

enum ReadableTockSlice<'a> {
    Kernel(&'a [u8]),
    Process(&'a ReadableProcessSlice),
}

and provide an appropriate interface on top of that.

I'm not sure though whether this fundamentally solves the underlying issue though? While we're -- as long as we're talking about process memory -- in part outside of the bounds of what the borrow checker can reason about, we're still creating a Cell (UnsafeCell) over potentially hardware-enforced read-only flash memory. As far as I know, these machine specifics can't be directly mapped to high-level concepts in Rust and are covered by the general rule of "everything which may invoke undefined behavior when used must be marked as unsafe". Given that there is no code path to actually try and mutate memory behind a ReadableProcessSlice the current implementation thus might be fine when concerned about process memory, but it's better to be certain.

jettr commented 2 years ago

I think this UB issue exists whether or not we allow the kernel to convert &[u8] into a ReadableProcessSlice or not. A rust application can and does pass &[u8] buffers to the kernel, and they are wrapped in ReadableProcessSlices. We are just able to put everything together here in one place by having the conversion function locally in the kernel.

Looking at MaybeUninit some more, the as_ptr method documentation does state "writing to memory that this pointer (non-transitively) points to is undefined behavior (except inside an UnsafeCell)." So I think MaybeUninit does not solve our problem here. It makes miri happy, but u8 also makes miri happy. I think MaybeUninit and u8 both suffer from the issue of having an immutable reference where the memory can change without the rust code's knowledge.

If we want to get rid of this miri issue and maintain soundness of immutable references, I think we need to drop down to raw pointers, which @kupiakos ZST solution does. I am sure there are a few pointer solutions we could come up with too.

@lschuermann I agree that the interface we are exposing does not permit write access to the read only memory, so it would be nice if miri could see that and know we aren't invoking incorrect behavior. On the other hand it feels wrong to not fix the miri issue. Is there a way to at a tag to make miri ignore that line somehow?

lschuermann commented 2 years ago

To summarize some of the things which we've discussed on today's call:

We agree that this is an issue we'd like to resolve somehow. @kupiakos's solution definitely seems like something which could resolve the issues, while still allowing kernel memory to be treated as a ProcessSlice. It is essentially equivalent to using raw pointer reads/writes, because it uses them underneath. However, because this solution is using some clever ways of manipulating pointers to create a reference to a zero sized type in a slice where there normally wouldn't be one, and which one probably cannot possibly synthesize using safe Rust alone, it's not trivial to verify it's correctness. We'd want to think about it for some time and hear more people's concerns.

It appears as if this indeed is only an issue when we're treating kernel memory, along with userspace-provided memory, as a ProcessSlice. The approach of using slices of Cells to avoid aliasing issues when accessing externally provided memory regions seems valid in general.

A valid approach to solve this issue might thus also be to represent read-only memory, which may either be process or kernel memory, as a composite type as described above. We would then revert the changes of #2797. This could look like:

enum ReadableTockSlice<'a> {
    Kernel(&'a [u8]),
    Process(&'a ReadableProcessSlice),
}

It would furthermore allow us to do something similar for mutable memory:

enum WriteableTockSlice<'a> {
    Kernel(&'a mut [u8]),
    Process(&'a WriteableProcessSlice),
}

This should cover @jettr's use cases. ~These constructs could expose the same API which ReadableProcessSlice and WriteableProcessSlice do today. If we are in favor of an interface-breaking change, we could even change ReadableProcessBuffer and WriteableProcessBuffer's enter and mut_enter to pass a reference to this composite type, to foster widespread use in the kernel.~ (they can't implement the same interface, subslicing with the Index trait wouldn't work)


Regarding usage of slices of Cells for process memory (a distinct issue):

It is, at best, slightly inelegant to represent memory which is potentially read-only by hardware constraints (flash) through a slice of Cells in the kernel. However, the type infrastructure here makes sure that code can never, without resorting to unsafe, attempt to mutate this memory. Miri might complain if it could reason the entire system's behavior, although as far as I know, such interactions with hardware constraints cannot always be properly modeled in Rust. Thus I would assume that transmuting the application-provided flash memory into a slice of Cells wrapped by types making methods to mutate values inaccessible is fine, and we're assuring the compiler of that through the unsafe transmute. Ultimately, due to the nature of Tock's allow family of system calls, we may end up with read-only allows overlapping with read-write allows and thus must protect against aliasing and expect the underlying memory to change.


I hope this is a good summary. @kupiakos, @jettr and anyone else, please correct me if I'm wrong! Thanks.

hudson-ayers commented 2 years ago

Leon, I think that is a good summary, I want to see if @kupiakos agrees.

However, the type infrastructure here makes sure that code can never, without resorting to unsafe, attempt to mutate this memory.

I think this is not entirely true -- instead, the type infrastructure here makes sure that no code outside of kernel/src/process_buffer.rs can attempt to mutate this memory without using unsafe. It is possible to mutate the memory by adding safe code to process_buffer.rs, (e.g. I could add a set() method to ReadableProcessSlice without adding any additional unsafe, but I am not really worried about that.)

If your summary is accurate, it does seem like our implementation was sound until #2797 ?

lschuermann commented 2 years ago

@hudson-ayers I've taken the liberty to edit your comment such that your response isn't displayed as a quote. :sweat_smile:

I think this is not entirely true -- instead, the type infrastructure here makes sure that no code outside of kernel/src/process_buffer.rs can attempt to mutate this memory without using unsafe. It is possible to mutate the memory by adding safe code to process_buffer.rs, (e.g. I could add a set() method to ReadableProcessSlice without adding any additional unsafe, but I am not really worried about that.)

I agree, this is what I wanted to say. "the type infrastructure" as in stuff that's in processbuffer.rs. If we wanted to mitigate this further, using UnsafeCell instead of Cell might emphasize that arbitrary accesses to the contents of the types in processbuffer.rs can have dangerous consequences.

If your summary is accurate, it does seem like our implementation was sound until #2797 ?

I think so, yes.

kupiakos commented 2 years ago
enum ReadableTockSlice<'a> {
    Kernel(&'a [u8]),
    Process(&'a ReadableProcessSlice),
}

Because of the required change in API, I don't see a usability difference between this and ReadableTockSliceRef<'a>(*const [u8], PhantomData<&'a [u8]>), which is 1 usize smaller due to not needing a discriminant. This may end up being the best option altogether.

Miri might complain if it could reason the entire system's behavior, although as far as I know, such interactions with hardware constraints cannot always be properly modeled in Rust. Thus I would assume that transmuting the application-provided flash memory into a slice of Cells wrapped by types making methods to mutate values inaccessible is fine, and we're assuring the compiler of that through the unsafe transmute

The hardware constraint could be represented as it being a read-only type. If Miri were running only on the kernel and could FFI with userspace, it would see a read-only memory region being declared as interior-mutable since the kernel is what hands out that memory to userspace in the first place. References have invariants, raw pointers do not. However, as long as you make it impossible to do &[u8] -> &[Cell<u8>] with kernel memory, I don't see a way that rustc could tell with externally provided memory whether it's mutable or not. Just don't turn it into a &[u8] between receiving the allow data and creating the ReadableTockSlice.

using UnsafeCell instead of Cell might emphasize that arbitrary accesses to the contents of the types in processbuffer.rs can have dangerous consequences

I generally agree,UnsafeCell would better imply the precariousness of writing to read-only memory and require any future editors to use unsafe to add mutating code, getting more eyes on it.

If your summary is accurate, it does seem like our implementation was sound until #2797 ?

Agreed.

jettr commented 2 years ago

I am still having trouble understanding why this is only a problem for the kernel. If we have an application that shares a reference to a const DATA: [u8;10] = [0u8; 10] buffer with the kernel, and the kernel wraps that with a &[Cell<u8>], how is that not the same issue?

hudson-ayers commented 2 years ago

I think the fundamental difference is that in the kernel case, the compiler used to compile the kernel knows that the chunk of memory must be read only, and sees you going from &[u8] to &Cell<u8>, which is UB. In the case where the application provides a pointer to memory that happens to be in flash or stored as a const in the app, the kernel compiler already can't make assumptions about that memory location (because applications are compiled separately from the kernel)

jettr commented 2 years ago

Can we not cast the kernel's &[u8] reference to a pointer to "wash away knowledge" then turn it into a ProcessBufferSlice similar to the application flow? If that doesn't work how would that be different than app case.

hudson-ayers commented 2 years ago

I think the reason that it is different is something along the lines of "pointers have provenance", and even if you cast a reference to a pointer, the compiler is allowed to make assumptions based off the fact that it knows where that pointer came from, so a pointer cast alone is insufficient to "wash away knowledge" in the way you describe. There are probably other people better equipped to fully answer this than me, but the blog post that gave me at least some intuition along these lines is ralf jung's "provenance" post: https://www.ralfj.de/blog/2020/12/14/provenance.html , which I think is really excellent

kupiakos commented 2 years ago

why this is only a problem for the kernel

It's only detectable in the kernel 😉

If we have an application that shares a reference to a const DATA: [u8;10] = [0u8; 10] buffer with the kernel, and the kernel wraps that with a &[Cell<u8>], how is that not the same issue?

It is the same issue, and I would call it UB from the perspective of Rust's memory model. But rustc generally doesn't insert arbitrary writes to memory, and so it would probably be an invisible form of UB. @hudson-ayers is correct about the kernel compiler not having the necessary information to do bad things with userspace RO memory being cast to &[Cell<u8>] as long as you don't do any writes.

The key thing here is that creating a reference inherently makes assertions about the referenced memory. & asserts that memory is shareable. &mut asserts that memory is unique (and therefore writable), Cell asserts that memory is writable even when shared.

The kernel needs to treat memory as possibly mutating, but without asserting that the memory is read-write. The only way I know how to do that is with raw pointers, which make no assertions about referenced memory until you read/write.

Can we not cast the kernel's &[u8] reference to a pointer to "wash away knowledge" then turn it into a ProcessBufferSlice similar to the application flow? If that doesn't work how would that be different than app case.

The knowledge is not "washed away". Miri does non-local aliasing analysis to be able to work with raw pointers at all. Creating the &[Cell<u8>] still declares that some memory is writable. The compiler might also have info to determine that the raw pointer is derived from a &[u8], and so inlining would trivially cause UB:

use core::cell::Cell;

// unbounded lifetimes yay
unsafe fn do_cast<'a>(x: *const [u8]) -> &'a [Cell<u8>] {
    &*(x as *const _)
}

fn main() {
    let data: [u8; 10] = [10; 10];
    let d1 = &data[..];
    let d2 = unsafe { do_cast(d1) };
    d2[0].set(20);

    println!("{}", d1[0]);
    println!("{}", d2[0].get());
}
jettr commented 2 years ago

It's only detectable in the kernel ... [The App version] is the same issue, and I would call it UB from the perspective of Rust's memory model

It sound like the solution shouldn't be to just remove the &[u8] cast into ReadOnlyProcessSilce since the other application path of sharing a &[u8] buffer from an app into the kernel still exists. Having the cast just exposed an existing issue that was previously undetectable. I don't see them as two different issues even though miri can only easily detect the issue in one of the scenarios.

hudson-ayers commented 2 years ago

It sound like the solution shouldn't be to just remove the &[u8] cast into ReadOnlyProcessSilce since the other application path of sharing a &[u8] buffer from an app into the kernel still exists. Having the cast just exposed an existing issue that was previously undetectable. I don't see them as two different issues even though miri can only easily detect the issue in one of the scenarios.

I don't believe that the cast just exposed an existing issue that was previously undetectable. I think the in-kernel cast is UB and the from-application cast is not. For the same reason that miri can only detect the issue in one of the scenarios, the Rust compiler can only detect UB in that scenario.

Basically: in the scenario where the buffer is shared from an app, the compiler for the rust kernel does not know anything about where it came from, and cannot detect the possibility of UB. Thus, the rust compiler must treat the memory as though it is in fact valid memory to mutate, which is not a problem since we never actually mutate it and the rust compiler will not create arbitrary writes where none exist. In the scenario where the buffer is from in the kernel, the compiler can see memory that it knows to be read-only being cast into a mutable type, which the rust compiler knows to be undefined behavior. Rust is allowed to assume undefined behavior is impossible, and thus allowed to do things like optimize away all code which would lead to undefined behavior, so the Rust compiler in the second case could choose to insert a panic before any access to that buffer, or return all 0s any time that buffer is read, or literally anything else (including behaving as we expect, which seems to be what happens now).

I am happy to be corrected on the above, just trying to voice how my understanding currently differs from Jett's.

Regardless, I agree that it is important that there be some ergonomic way to use in-kernel u8 slices and ReadOnlyProcessSlice interchangeably, and any change we make should continue to support that use case.

jettr commented 2 years ago

I don't know how much it is worth going back and forth on this because hopefully we can address the real kernel issue and the potential phantom app issue with the same change.

For the same reason that miri can only detect the issue in one of the scenarios, the Rust compiler can only detect UB in that scenario.

If the process buffer code actually did call Cell.set, that would definitely cause UB in the rust application, yet miri would not be able to detect it. On the other hand, miri would be able to detect that UB if that were all kernel code.

It seems likes we are going to just have to agree to disagree, which is okay. I wish there were some way to prove it or point to something that was definitive here. I think we are all interpolating from what we know or think we know :) I definitely can't say my view is 100% correct here. I think there is potential for UB in application case and we might as well fix it, but there is definitely a chance that there is not technically UB in the application case.

lschuermann commented 2 years ago

If the process buffer code actually did call Cell.set, that would definitely cause UB in the rust application

Without trying to go back and forth on this too much, one remark: we're not really concerned about UB in userspace applications through memory writes here. While it is true that the kernel wants to make guarantees to userspace and we want to uphold these, this isn't transitively coupled to soundness of the kernel: whereas a sound kernel is a precondition for making guarantees to userspace applications, the kernel can very well cause issues in userspace while being sound (but incorrect).

@jettr and I discussed this out of band yesterday, and to summarize there seem to be three primary concerns:

That being said, I'm going to implement both @kupiakos approach and the proposed enum solution, to see what the API would look like. Once we have those two options, I'd vouch for comparing them and asking people in the Rust Zulip on whether they can give us guidance on whether the ZST handling is sound.

kupiakos commented 2 years ago

@lschuermann I'm working on a safe abstraction over the ZST concept (I'm calling them "existential references") and getting it reviewed for initial soundness and release on crates.io.

I think raw pointers should be strongly preferred over the enum solution, which adds unnecessary runtime checks.

jettr commented 2 years ago

I want to come back to https://github.com/tock/tock/pull/2884 as a potential solution.

The documentation for MaybeUninit states

Moreover, uninitialized memory is special in that it does not have a fixed value (“fixed” meaning “it won’t change without being written to”). Reading the same uninitialized byte multiple times can give different results.

Which really does mean to me that the compiler cannot rely on multiple reads being consistent or "fixed". This is what we want to replace Cell concept of having user space update the memory without the kernel knowing about it, but it does not allow writing to the memory location without unsafe keyword (or having a mutable MaybeUninit reference -- which we would not have). It gives us the non-cachable read access we want, while still reserving the right amount of space (u8) for the representation.

Specifically, we want to use MaybeUninit for read-only process buffer, and continue to use Cell for read-write process buffers. It would still be invalid/UB to modify the contents of MaybeUninit from the as_ptr() method for the read-write case.

hudson-ayers commented 2 years ago

I can't think of a specific reason that #2884 is unsound, but it does feel like an abuse of what MaybeUninit is meant to be used for, which makes me unesasy. I cannot find any examples anywhere of transmuting an initialized type to a MaybeUninit.

Also, MaybeUninit provides methods to write the underlying value without invoking unsafe -- though we are not calling those methods or making them accessible, it still suggests to me that MaybeUninit is not intended to be used as a wrapper around non-writable memory.

I am curious what others think though.

jettr commented 2 years ago

The methods to mutate underlying data via MaybeUninit require a &mut, which we will not have.

It even calls out in the immutable as_ptr method that

Writing to memory that this pointer (non-transitively) points to is undefined behavior (except inside an UnsafeCell).

which makes sense and would be UB. If wanted to mutate data, we would have to call as_mut_ptr or assume_init_mut which require &mut. So I don't actually think we can mutate data without calling unsafe (or maybe I am missing a function).

hudson-ayers commented 2 years ago

The methods to mutate underlying data via MaybeUninit require a &mut, which we will not have.

Ah, right, that makes sense. Still, my uneasiness remains, so I went ahead and posted a question here https://github.com/rust-lang/unsafe-code-guidelines/issues/303 in the hopes of getting some validation

hudson-ayers commented 2 years ago

https://github.com/rust-lang/unsafe-code-guidelines/issues/303#issuecomment-1016099522 suggests that using MaybeUninit should not aid any in terms of correctness over using just plain u8 (and that the docs for MaybeUninit are also incorrect 👎 )

jettr commented 2 years ago

Thank you for posting and knowing where to post to get a good answer! I posted a follow up, but it sounds like the MaybeUninit solution is dead. Depending on the answer I get back, I will just abandon the PR

jettr commented 2 years ago

Looks like maybe using Cell doesn't work like we thought as well; I also could have over simplified the problem when I described in on that discussion though.

RalfJung commented 1 year ago

(and that the docs for MaybeUninit are also incorrect 👎 )

I am not aware of the docs being incorrect -- as far as I can tell, they are completely correct. If you think they are incorrect, please file an issue in the Rust repo and Cc me.

lschuermann commented 1 year ago

I am not aware of the docs being incorrect -- as far as I can tell, they are completely correct.

@RalfJung I just re-read the docs for MaybeUninit, and it seems correct to me as well. I can't piece together what https://github.com/rust-lang/unsafe-code-guidelines/issues/303#issuecomment-1016099522 was referring to when it mentioned incorrect documentation, but this was over a year ago -- maybe things have changed since then.

In either case, we're not using MaybeUninit in practice to access "foreign" process memory, and our current solution (transmuting &[u8] to &[Cell<u8>] while not exposing any operations which would be able to mutate this memory) appears to be sound under the new tree borrows model (https://github.com/rust-lang/unsafe-code-guidelines/issues/303#issuecomment-1696984148), so I suppose we can close this issue. Thanks for clarifying this!

I suspect that the statement about MaybeUninit's documentation was regarding the initialization invariant, in particular:

This makes it undefined behavior to have uninitialized data in a variable even if that variable has an integer type, which otherwise can hold any fixed bit pattern.

It is unclear to me how this would interact with foreign process memory, not contained in a Rust allocation. This memory is "initialized" in the sense that a process writes to it, but this is not necessarily visible to Rust / LLVM. I suspect that this invariant just means that MaybeUninit is unfit for such memory, and we'll have to use other solutions for interacting with such memory (such as raw pointer operations or Cell references for memory which once was initialized by Rust, e.g. before executing a process).

I feel like we can close this particular issue thanks to https://github.com/rust-lang/unsafe-code-guidelines/issues/303#issuecomment-1696984148.

RalfJung commented 1 year ago

Ah, it was probably a misinterpretation of some way of saying that the uninitialized values are "not stable" and looking at the same one multiple times can produce different results. It is possible we hard wording of that sort in the docs and it got removed because it was found too confusing.

It is unclear to me how this would interact with foreign process memory, not contained in a Rust allocation. T

A priori that memory works just like Rust memory, so using MaybeUninit is totally fine. But if the memory is altered by other processes you need to use atomics (and those have an UnsafeCell inside them), that's the same as having multiple Rust threads work on shared mutable memory locations.

lschuermann commented 1 year ago

But if the memory is altered by other processes you need to use atomics (and those have an UnsafeCell inside them), that's the same as having multiple Rust threads work on shared mutable memory locations.

This makes sense. Most of the platforms Tock runs on don't have support for atomics, and the kernel and userspace work in an interleaved fashion. Across every context switch we have an (implicit) compiler fence through an inline-assembly block that does not have the nomem clobber. Hopefully that should be enough for those platforms. It makes sense that when the kernel and userspace can run concurrently, we'd need to use atomics.

RalfJung commented 1 year ago

Does the platform have preemptive concurrency? The moment there is preemption, you need atomics even if there is only a single thread ever running at a given time, since compiler optimizations can otherwise break program behavior. These atomics can later compile to regular memory accesses, but the compiler needs to treat them specially during optimizations.

If concurrency is cooperative and all yield points are sufficiently opaque to the compiler to be a full fence (like a compiler_fence(SeqCst)), that should work, but know that you are doing a delicate dance on the edge of soundness here.

ppannuto commented 1 year ago

Does the platform have preemptive concurrency?

Only in top-half interrupt handlers, which are mostly if not completely written in assembly, e.g.: https://github.com/tock/tock/blob/master/arch/cortex-m/src/lib.rs

Userspace apps are preemptable, but that's orthogonal to the kernel implementation. The single-threaded execution model is an intentional design decision, largely to ameliorate such concerns.

lschuermann commented 1 year ago

@RalfJung This is a good point. The kernel (what we're concerned about in this issue) does not have "uncontrolled" preemptive concurrency: every context switch is well-defined through an asm! macro which will, hopefully, generate such a fence. Switching back from userspace to the kernel continues execution in this assembly. To Rust, execution of userspace is represented as a single invocation of an asm! block, no Rust code will be invoked until this block has finished executing.

We do have limited preemption in the form of interrupt handlers, which execute Rust code to disable the respective interrupt sources, while the system is operating in kernel mode. In theory, such code should only use volatile reads and writes to the underlying MMIO-exposed interrupt controller. However, we should really perform a more in-depth review of that code path, to ensure that we're not violating soundness.

lschuermann commented 1 year ago

These atomics can later compile to regular memory accesses, but the compiler needs to treat them specially during optimizations.

Would you happen to know how we can mark certain accesses as atomic, without relying on the fundamentally unstable core::intrinsics methods, for platforms that have no atomic load/store support? We used to employ atomics implemented through intrinsics in one (now revised) part of Tock:

https://github.com/tock/tock/blob/b9f17882ea848c729e0b19e8acb5b0edf1fc15f7/kernel/src/deferred_call.rs#L15-L43

RalfJung commented 1 year ago

Would you happen to know how we can mark certain accesses as atomic, without relying on the fundamentally unstable core::intrinsics methods, for platforms that fundamentally have no atomic load/store support?

The target has assembly operations that perform loads/stores in a single atomic (non-preemptible) step, doesn't it? So in that sense the platform does have atomic loads/stores?

Ideally you could just use the regular intrinsics, and LLVM would (a) recognize them as atomics for the purpose of optimizations and (b) translate them to suitable assembly operations. Alternatively you can use inline assembly to "emulate" your own atomics.

lschuermann commented 1 year ago

The target has assembly operations that perform loads/stores in a single atomic (non-preemptible) step, doesn't it? So in that sense the platform does have atomic loads/stores?

Right. I suppose in this sense, and in particular in our context, the language in the documentation of the core::sync::atomic::Atomic* types is slightly confusing:

Note: This type is only available on platforms that support atomic loads and stores of u8.

All of our targets (Cortex-M and RISC-V) have loads/stores which are logically executed in a single non-preemptible step from the perspective of a single execution unit, but which are not necessarily atomic to other concurrent cores or hardware-threads. Specifically, the following compile-time assertion fails:

const _: () = assert!(cfg!(target_has_atomic) == true);

And as such, types like AtomicU8 are not available. This is not an issue for us, as all of our currently supported platforms either only have, or only make use of a single core / hardware-thread. The idea to use inline assembly is a clever workaround, thanks for that tip! We're trying to avoid using the core_intrinsics feature in our efforts to eventually compile on a stable toolchain.

RalfJung commented 1 year ago

Yeah I guess target_has_atomic is false because of the problem with other cores. But if you know there are no other cores then using inline assembly is a totally legit way to say "I want the behavior of this hardware instruction" and as far as the Rust compiler optimizations are concerned, this might be an actual atomic operation. This is exactly what inline assembly is for! And yes it's definitely cleaner than using core_intrinsics, that feature is super-unstable. The only potential downside is that real atomics will get better optimized than inline assembly.