theseus-os / Theseus

Theseus is a modern OS written from scratch in Rust that explores 𝐢𝐧𝐭𝐫𝐚𝐥𝐢𝐧𝐠𝐮𝐚𝐥 𝐝𝐞𝐬𝐢𝐠𝐧: closing the semantic gap between compiler and hardware by maximally leveraging the power of language safety and affine types. Theseus aims to shift OS responsibilities like resource management into the compiler.
https://www.theseus-os.com/
MIT License
2.87k stars 172 forks source link

`MappedPages` is unsound when the backing memory is MMIO or DMA regions #1107

Open tatetian opened 1 week ago

tatetian commented 1 week ago

I think that the safe API provided by MappedPages is unsound when it is backed MMIO or DMA memory.

It is the programmer’s responsibility when writing unsafe code to ensure that any safe code interacting with the unsafe code cannot trigger these behaviors. unsafe code that satisfies this property for any safe client is called sound; if unsafe code can be misused by safe code to exhibit undefined behavior, it is unsound. --The Rust Reference

Background

Throughout the entire codebase, the driver code interacting with MMIO or DMA regions follows a programming pattern of two steps:

The first step is to create a MappedPages object to represent the corresponding MMIO or DMA memory region.

let mapped_pages = map_frame_range(phys_addr, size, MMIO_FLAGS)?;

The second step is to create a "proxy" object from the MappedPages that allows borrowing a reference of type T, where T represents the data format that can be used by a driver to exchange information with its target device.

One common type of such a proxy object is BorrowedMappedPages<T, M, _>, which grants you access to &T (or &mut T depending on M).

https://github.com/theseus-os/Theseus/blob/ee7688fa5c327e7e61652da5e3faa0eb6e2c4460/kernel/gic/src/gic/mod.rs#L218-L220

Another very useful type is BorrowedSliceMappedPages, which grants you access to &[T] (or &mut [T]).

https://github.com/theseus-os/Theseus/blob/ee7688fa5c327e7e61652da5e3faa0eb6e2c4460/kernel/nic_initialization/src/lib.rs#L59-L60

Sometimes, all you want is raw bytes; in such cases, borrowing &[u8] (or &mut [u8]) is sufficient.

https://github.com/theseus-os/Theseus/blob/ee7688fa5c327e7e61652da5e3faa0eb6e2c4460/kernel/nic_initialization/src/lib.rs#L31-L32

where ReceiveBuffer implements Deref<Target = [u8]> and DerefMut<Target = [u8]>.

Problem

The programming pattern revolving around MappedPages looks fantastic because the resulting code is entirely in safe Rust. Fundamentally, this programming pattern is enabled by MappedPages's ability to reinterpret the backing memory region as one or multiple values of T: FromBytes.

impl MappedPages {
    pub fn as_type<T: FromBytes>(&self, byte_offset: usize) -> Result<&T, &'static str> { ... }

    pub fn as_type_mut<T: FromBytes>(&mut self, byte_offset: usize) -> Result<&mut T, &'static str> { ... }

    pub fn as_slice<T: FromBytes>(&self, byte_offset: usize, length: usize) -> Result<&[T], &'static str> { ... }

    pub fn as_slice_mut<T: FromBytes>(&mut self, byte_offset: usize, length: usize) -> Result<&mut [T], &'static str> { ... }
}

But unfortunately, the safe API of MappedPages is unsound. That is, safe client of MappedPages may trigger undefined behaviors (UBs). This is because it fails to take into considerations the situations when the backing memory is externally modifiable.

let byte_ref: &u8 = mapped_pages.as_type(0)?;

Even for the simplest type u8, the above usage of the API may trigger undefined behaviors at run rime. This is because the semantics of a reference to u8 means the value of u8 is guaranteed to be unchanged so long as the reference is being held by Rust code. But when a mapped_pages refers to a MMIO or DMA region, the device may modify any values within the mapped_pages without the CPU side being aware of, thus breaking Rust's fundamental assumption.

Most Theseus kernel drivers suffer from this soundness problem as they use MappedPages to access MMIO or DMA regions, as shown by the three pieces of driver-related code. Although the API of MappedPages is unsound, the driver code that relies on MappedPages is probably ok because when it needs to access integral values such as u8, it gets a reference to Volatile<u8> instead of u8. The case of ReceiveBuffer is more disturbing as it indeed can create a reference to [u8] out of MMIO or DMA regions.

In conclusion, I believe that the API of MappedPages is unsound, although the memory safety of driver code is mostly ok.