rust-embedded / bare-metal

Abstractions common to microcontrollers
Apache License 2.0
116 stars 17 forks source link

Add fences to mutex methods #33

Closed thalesfragoso closed 4 years ago

thalesfragoso commented 4 years ago

Without the fences the Mutex doesn't behave as expected.

Given this example:

static FOO: Mutex<RefCell<bool>> = Mutex::new(RefCell::new(false));

pub fn main() {
    let mut count = 0;
    loop {
        let cs = unsafe { CriticalSection::new() };
        if *FOO.borrow(cs).borrow() {
            break
        }
        count += 1;
    }
    let mut black_box = 0;
    unsafe { ptr::write_volatile(&mut black_box as *mut _, count); }
}

#[no_mangle]
pub extern "Rust" fn bar() {
    let cs = unsafe { CriticalSection::new() };
    *FOO.borrow(cs).borrow_mut() = true;
}

It's expected that the value of FOO will be evaluated at each loop pass, right now that doesn't happen, FOO is only evaluated once and if the value is false then the function branchs to an infinite loop:

.LBB1_2:
        b       .LBB1_2

godbolt link (Remove the comment to add the fence and see the difference)

I decided to go for a fence instead of a compiler_fence to cover the cases where the processors have a cache or are out of order. We could change SeqCst for a pair of Acquire and Release, but then I think we would have to put one of them in the Drop implementation of the CriticalSection, but we can't really rely that drop will run (mem::forget), so I went with SeqCst, maybe I'm missing some other way...

It seems that we would need to yank previous versions, unfortunately.

cr1901 commented 4 years ago

@thalesfragoso Note that fences at present do not work on msp430. And I am not certain how to fix this as I don't handle the LLVM side of things for msp430 :/.

Disasm commented 4 years ago

@cr1901 a compiler fence could be implemented/used for MSP430, it's just a type 0 of LLVM AtomicFence that lowers to nothing for x86 and ARM.

andre-richter commented 4 years ago

How are core library mutexes doing it wrt to acquire/release?

Mutexguards should have a similar problem, ain’t they?

thalesfragoso commented 4 years ago

How are core library mutexes doing it wrt to acquire/release?

Mutexguards should have a similar problem, ain’t they?

I think that in the case of the std::sync::MutexGuard they can rely on drop because if the guard doesn't get dropped the mutex will remain locked forever. That might also make sense for us, and I was going to try that today, but I realized that CriticalSection is Copy and because of that we can't implement Drop for it.

thalesfragoso commented 4 years ago

As discussed in the matrix chat, it seems that fence (instead of compiler_fence) is only needed when dealing with a multi master bus, which it's out of scope for this crate, so I changed it to use just a compiler_fence.

I also provided a solution for the msp430 case, but the one I found will only compile on nightly, but that seems to be the case in the msp430 crate too, so it seems okay. CC @cr1901

CC @adamgreig

cr1901 commented 4 years ago

@thalesfragoso I don't have a problem making bare_metal nightly on msp430; we can fix that later. The msp430 crate is indeed nightly for now and will likely be so until asm! stabilizes.

I want a second opinion though: cc: @pftbest and @awygle.

adamgreig commented 4 years ago

Could we just document that CriticalSection::new requires the caller ensure a compiler fence is inserted and not otherwise change the body? That way cortex-m and msp430 will continue to work (since both take actions that result in fences being inserted when creating a CS), and no new code is added here.

That doesn't resolve the get_mut() issue which I am not yet convinced we actually need to change but I'm certainly open to being wrong there. I'm also not sure if riscv effectively inserts a fence when disabling interrupts, as I don't really understand how its register modification works.

pftbest commented 4 years ago

I also agree with @adamgreig, the fences should be added only when CriticalSection is created, not on every borrow. For example, you obtained a critical section and then do 3 borrows in a row:

    let cs = ...
    let a = *FOO.borrow(cs).borrow();
    let b = *BAR.borrow(cs).borrow();
    let c = *FOO.borrow(cs).borrow();

You don't need a fence inserted in between each borrow here, because nothing else can change the values of FOO and BAR while you are holding a critical section token. Adding fences here would be a pessimisation, because compiler will no longer be able to see that a and c is the same exact value and optimize out second access to FOO.

thalesfragoso commented 4 years ago

I also agree with @adamgreig, the fences should be added only when CriticalSection is created, not on every borrow. For example, you obtained a critical section and then do 3 borrows in a row:

That's already the case.

pftbest commented 4 years ago

That's already the case.

I missed that, sorry.

adamgreig commented 4 years ago

To summarise discussion in today's meeting on Matrix: we require a fence at the end of the critical section as well, because otherwise the compiler could re-order a store to after the CS finishes, which could race with another thread. Since the CS might be safely forgotten, there's no way to ensure this inside the CS type. Instead, we should add a safety precondition to the CS documentation to say that users (the architecture crates such as cortex-m, cortex-a, msp430, riscv, etc) must ensure a compiler fence is inserted immediately after entering and before leaving the CS.

Currently this is mostly the case, for example cortex-m uses either an FFI call or inline asm with a memory clobber to enable and disable interrupts, which has the effect of ensuring a suitable fence. We should make these fences explicit (although it's a shame that compiler_fence has a potential panic in debug builds).

So, the update to bare_metal can probably just be a safety documentation on CriticalSection, but the architecture crates will all need updating to ensure suitable fences are used.

I don't think this resolves the Mutex::get_mut() concern yet though. I'm not sure how that should best be resolved. Since it requires &mut self, there shouldn't be any way in safe Rust to trigger this problem anyway...

adamgreig commented 4 years ago

For a little historical context: https://github.com/rust-embedded/cortex-m/commit/a396a68a43e684b4388cfdcaa0caacbe0dcc4b71 and https://github.com/rust-embedded/cortex-m/commit/93c901d7489ceeea4624b680e96c3e8867a9f08a

thalesfragoso commented 4 years ago

As it was discussed in the meeting, it seems that the best approach to this problem is to add one more condition to the CriticalSection::new method and rely on the arch crates to implement it correctly. Moreover, a closure approach seems to deal with this better.

I'm closing this in favor of this new solution, bare-metal should only need to update its documentation.