imxrt-rs / imxrt-hal

Rust for NXP i.MX RT
Apache License 2.0
122 stars 29 forks source link

DMA soundness issue #137

Closed Finomnis closed 8 months ago

Finomnis commented 1 year ago

Lpspi::dma_write takes a &mut Channel and returns a Write object whose lifetime is attached to &mut Channel and the &[u32] input buffer. This causes the channel to be blocked until the write is finished or cancelled, and the source buffers valid. In theory.

There is one detail, however: core::mem::forget is not unsafe because Rust’s safety guarantees do not include a guarantee that destructors will always run. That means, by using forget to bypass the drop() function of Write, we can drop the input buffer while the DMA is still reading. Thus, through purely safe means we can produce a situation where the DMA reads from uninitialized memory. In the case of dma_read (same problem), it is even possible to write into uninitialized memory.

This behavior is called unsound.

Is this a deliberate decision of the hal maintainers or is this an oversight?

Finomnis commented 1 year ago

One way I've seen this being fixed is by taking ownership, like in rp2040-hal. Here, the Transfer object (the pendant to our Write object) only returns the Channel object back to the user through its wait function; which in our case would be the Future::Output.

mciantyre commented 1 year ago

Thanks for bringing this up. I made this decision; I'm hoping to learn more about your concerns so we can adapt this API. But first, here's my thinking.

The Pin drop guarantee lets us understand that the safe call to core::mem::forget() does not happen on any pinned object. Specifically,

[...] for pinned data you have to maintain the invariant that its memory will not get invalidated or repurposed from the moment it gets pinned until when drop is called.

By calling core::mem::forget() on the pinned Write (or Read) object, a user violates the drop guarantee.

In isolation, creating the Write object through Lpspi::dma_write does not initiate any DMA transfer. Instead, the DMA transfer starts the first time the object is poll()ed. In order to call poll(), a user must pin the object.

Notice how the Write object is !Unpin. This marker trait forces the user to call Pin::new_unchecked() to pin the object.

Pin::new_unchecked() is unsafe. Its safety documentation reiterates the importance of the drop guarantee. The included example also shows how a safe call to mem::swap() has the potential to cause undefined behavior. A call to core::mem::forget() is no different.

In the end, we ensure that someone, somewhere, has called an unsafe API before they've started a DMA transfer. I agree that forgetting the pinned, polled future without the call to drop is a problem. But the contract of that unsafe API is clear that this could happen.

Could you help me understand why the drop guarantee isn't sufficient? I'd love more information on why the rp2040-hal took a different approach. I'm eager to make the change in this HAL if I'm misunderstanding the drop guarantee.

Finomnis commented 1 year ago

Let me think about this! This is a lot of information, and there is a good chance that I'm wrong and learn something in the process.

mciantyre commented 1 year ago

Based on the discussions I've seen around DMA and embedded Rust, I think my position is unconventional. So there's a greater chance that I need to change my ways. Thanks for considering all this!

Finomnis commented 1 year ago

Some generic thoughts about a potential rework:

Finomnis commented 1 year ago

@mciantyre The following code compiles and is undefined behavior:

        let data: [u32; 5] = [1, 2, 3, 4, 5];
        {
            let dma_transfer = spi.dma_write(&mut spi_dma, &data).unwrap();
            cassette::pin_mut!(dma_transfer);
            let mut cm = cassette::Cassette::new(dma_transfer);
            cm.poll_on();
            core::mem::forget(cm);
        }
        drop(data);

Which means that either in imxrt-hal or cassette there is a soundness problem.

pin_mut!() is taken from https://docs.rs/pin-utils/0.1.0/src/pin_utils/stack_pin.rs.html#14. And does seem sound to me.

mciantyre commented 1 year ago
let dma_transfer = spi.dma_write(&mut spi_dma, &data).unwrap(); // 1
cassette::pin_mut!(dma_transfer); // 2

At 1, we construct the Write object. There's no DMA transfer in flight, so we could forget() the object here. At 2, we turn the Write object called dma_transfer into a Pin<&mut Write> object called dma_transfer; my repetition is deliberate.

There's nuance in pin_mut! that we need to consider. Variations of pin_mut!() are designed to shadow the name they're called with. The importance of shadowing is covered in the Pinning chapter of the Async programming guide.

By shadowing the Write object at line 2, we've ensured that we will call drop() on the Write object returned by dma_write. I'll demonstrate this later.

let mut cm = cassette::Cassette::new(dma_transfer); // 3
cm.poll_on(); // 4
core::mem::forget(cm); // 5

The call at 3 requires an unpin-able future. We have that in the form of Pin<&mut Write>. This type checks.

It seems the concern is that 4 starts the DMA transfer, and then 5 prevents drop from running. Is that correct? We've already guaranteed that we will call drop() on the Write object. So forgetting the Cassette object is fine, and 5 is sound.

To show that drop is called on the Write object, I took the example and compiled it for a Teensy 4; see here. To show how pin_mut! shadows its argument, I added additional code to type check the example. I then compiled and inspected the generated assembly. The important assembly snippet is below.

ARM assembly demonstrating call to drop for the Write object ```asm f2: f000 f830 bl 0x156 ::new::h522c056dbaa80667> @ imm = #96 f6: 9808 ldr r0, [sp, #32] f8: f000 f856 bl 0x1a8 ::poll_on::h59f73d76efb2134b> @ imm = #172 fc: f857 0c3c ldr r0, [r7, #-60] 100: f857 1c38 ldr r1, [r7, #-56] 104: f857 2c34 ldr r2, [r7, #-52] 108: f857 3c30 ldr r3, [r7, #-48] 10c: f847 3c20 str r3, [r7, #-32] 110: f847 2c24 str r2, [r7, #-36] 114: f847 1c28 str r1, [r7, #-40] 118: f847 0c2c str r0, [r7, #-44] 11c: f1a7 002c sub.w r0, r7, #44 120: f000 fa68 bl 0x5f4 @ imm = #1232 124: 9809 ldr r0, [sp, #36] 126: f000 fd0e bl 0xb46 ,imxrt_iomuxc::Pad<10 75806528_u32,1075807024_u32>,imxrt_iomuxc::Pad<1075806536_u32,1075807032_u32>,imxrt_iomuxc::Pad<1075806524_u32,1075807020_u32>>,4_u8>,u32>>::h5a594ae43a19c9e8> @ imm = #2588 12a: 9a0a ldr r2, [sp, #40] 12c: f1a7 001c sub.w r0, r7, #28 130: 4601 mov r1, r0 132: e892 5038 ldm.w r2, {r3, r4, r5, r12, lr} 136: e881 5038 stm.w r1, {r3, r4, r5, r12, lr} 13a: f000 fa5a bl 0x5f2 @ imm = #1204 13e: f248 403c movw r0, #33852 142: f2c2 0000 movt r0, #8192 146: f248 4264 movw r2, #33892 14a: f2c2 0200 movt r2, #8192 14e: 2128 movs r1, #40 150: f007 f831 bl 0x71b6 @ imm = #28770 154: defe trap ```

The program shows a drop_in_place call[^1] for a Write object, after the call to forget. This is consistent with a compiler-inserted drop() at the end of scope. Since drop() runs for the Write object, the DMA transfer is canceled.

This example meets the drop guarantee as of the pin_mut! usage. I encourage you to remove the pin_mut! to see if the example still compiles. If that removal were to compile, it would be concerning.

I don't see undefined behavior. Is there another opportunity for undefined behavior that I'm missing?

[^1]: I agree that true undefined behavior means the drop_in_place call could be a fluke. Understanding the rules around shadowing and scoping might help clarify why drop_in_place is present.

Finomnis commented 1 year ago

The more I think about it, the more I feel like you might have developed a novel way of implementing DMA drivers on embedded. I'd love an opinion from the author of https://docs.rust-embedded.org/embedonomicon/dma.html.

mciantyre commented 1 year ago

My only novel contribution is ~apathy towards~ a position against the conventional argument

[...] by using forget to bypass the drop() function of Write, we can drop the input buffer while the DMA is still reading.

We can thank the designers of async Rust for the drop guarantee.

Remember that my position is unconventional. I'd like to understand what's changing your thinking, and I'm still seeking answers to my specific questions.

Sorry, but I don't know why the drop guarantee is not covered in the Embedonomicon's DMA chapter. From my position, the drop guarantee, and intentional design towards the drop guarantee, can solve the soundness concerns reiterated throughout the chapter.

Finomnis commented 1 year ago

Im still not entirely sure that that's actually what the drop guarantee is taking about, but so far I have yet to find an example that states otherwise.

Finomnis commented 1 year ago

I'm afraid though that Pin still isn't enough. From the Drop Guarantee:

Notice that this guarantee does not mean that memory does not leak! It is still completely okay to not ever call drop on a pinned element (e.g., you can still call mem::forget on a Pin<Box>). In the example of the doubly-linked list, that element would just stay in the list. However you must not free or reuse the storage without calling drop.

Note that in order to regain control of a &mut lifetime that is bound in an object, it is enough to forget it. No drop is required.

I'm not sure what exactly prevents this from happening in all the examples I've looked at so far; it feels like there's always something enforcing a drop when dealing with futures. I still don't know where this comes from, though.

mciantyre commented 1 year ago

Sorry, but I'm missing the connection between the quote from the drop guarantee docs and this observation:

Note that in order to regain control of a &mut lifetime that is bound in an object, it is enough to forget it. No drop is required.

Could you elaborate?


I'm not sure what exactly prevents this from happening in all the examples I've looked at so far; it feels like there's always something enforcing a drop when dealing with futures. I still don't know where this comes from, though.

Could you share the examples you're studying? I looking for prompts so that I can dispute my own position. Even if the example looks good, there might be small tweaks that we can come up with to make the example troublesome.

Finomnis commented 1 year ago

@mciantyre This is what I'm talking about:

use std::{marker::PhantomPinned, pin::Pin};

struct AutoClear<'a> {
    val: &'a mut [u8],
    _pin: PhantomPinned,
}

impl<'a> AutoClear<'a> {
    pub fn new(val: &'a mut [u8]) -> Pin<Box<Self>> {
        Box::pin(Self {
            val,
            _pin: PhantomPinned,
        })
    }
}

impl Drop for AutoClear<'_> {
    fn drop(&mut self) {
        println!("Drop!");
        self.val.fill(0);
    }
}

fn main() {
    {
        {
            let mut data = [1, 2, 3];
            {
                let _autoclear = AutoClear::new(&mut data);

                // Not usable, as expected. The following is a compile time error:
                // data[0] = 42;
            }
            // Prints [0, 0, 0], as expected
            println!("{:?}", data);
        }

        {
            let mut data = [1, 2, 3];
            {
                let autoclear = AutoClear::new(&mut data);

                std::mem::forget(autoclear);

                // Drop never got called, and yet the lifetime of `&mut data` is free again now.
                // It is still bound somewhere, but the fact that `autoclear` is not reachable
                // any more is enough for the borrow checker to release `data` again.

                // Note that we didn't use `unsafe`, and that autoclear is `!Unpin` and was pinned.

                data[0] = 42;
            }
            // Prints [42, 2, 3], `AutoClear::drop()` never got called. Yet `data` is reachable and modifiable again.
            println!("{:?}", data);
        }
    }
}
Drop!
[0, 0, 0]
[42, 2, 3]
Finomnis commented 1 year ago

Could you share the examples you're studying?

There aren't any specific ones; I was just trying to recreate the effect shown in the previous message with a Future whos poll function got called at least once, but I don't seem to be able to. There's always something getting in the way. I don't think that there is any drop-less way out of the pin_mut!() macro shown earlier (as it moves the value and then shadows it). However, I'm not sure if this macro is overly restrict, because I can't find any formal written rules that would prevent this effect from happening in the general case.

I think that pin_mut!() is that restrictive because it wants to prevent that anyone core::mem::swaps the underlying data or similar. But I don't think that it actually has anything to do with the drop guarantee. But maybe I'm wrong; the problem is that I didn't find any more documentation that specifies how an async executor has to be implemented properly or what exactly the rules of pinning are in that regard. Box::pin() obviously allowed the forget(), so I'm not sure why pin_mut!() doesn't.

So the most obvious way to transfer my code to a Future is to Box::pin() the future and then call its poll function. But poll requires a Waker, and I didn't find enough information on how to properly create one manually. Documentation is kind of spare, too. And I wasn't able to get a Box<Pin<Future>> to work with Cassette. I didn't try with other executors, maybe you have more ideas, or reasoning why my previous example isn't transferrable to futures?

mciantyre commented 1 year ago

I see the issue. I agree that a forgotten, pinned, heap-allocated DMA transfer object satisfies the drop guarantee while also performing I/O into likely-invalid stack memory / mutating a &mut buffer. Thanks for going deep and explaining this for me.

You've revealed my implicit assumption: I only considered stack-allocated DMA transfer objects. When we forget an object on the stack, its memory should be considered invalid as late as the end of scope. I figured that's why Pin::new_unchecked is unsafe, and why pin_mut! -- a tool for stack pinning -- does the shadow. I never thought about heap-allocated DMA transfers objects, and never noticed that Box::pin() / Box::into_pin() are safe; makes sense if forgets shouldn't matter.

Although I haven't tried, I think I can produce the troublesome example we're looking for (without an executor). I'll give it a shot this weekend. If you want to keep going with your example, here is a way to create a simple Waker. cassette may also have a similar Waker for study.

Finomnis commented 1 year ago

@mciantyre Fyi, pin!() got stabilized recently and does not require local shadowing.

mciantyre commented 1 year ago

I don't think a lack of shadowing in pin!() is a problem. Shadowing is the approach used by pin_mut!() to prevent access to the original stack-allocated object. There's other ways to prevent access to the object, like ownership transfers. That's how pin!() seems to work. (Its implementation comments describe how it performs the move. I guess its status in the standard library gives it power that external pin_mut!() macros can't achieve.)

Here's a contrived example of stack pinning with pin!(). After pinning, and despite using unique names, I still can't forget the original stack-allocated DMA object.

Does pin!() present a failure mode that I'm missing?


Here's a small example showing the heap-allocated DMA future unsoundness. I only built it successfully; didn't run it. There's no executor. It represents a soundness issue for a user who's polling futures, including an executor.

Discussions here and in the example focus on corruption of the stack-allocated buffer. Just noting that the DMA channel object can also be invalidated once the heap-allocated DMA object is forgotten. Any solution that doubles-down on the drop guarantee will also need to consider the channel lifetime.

Finomnis commented 1 year ago

Does pin!() present a failure mode that I'm missing?

No, I think you are right :)

Finomnis commented 8 months ago

Let's keep it as-is. I really like the async based solution that we have now, it beautifully integrates into RTIC 2's execution model.