Closed: thalesfragoso closed this 4 years ago
@thalesfragoso Note that fences
at present do not work on msp430. And I am not certain how to fix this as I don't handle the LLVM side of things for msp430 :/.
@cr1901 A compiler fence could be implemented/used for MSP430; it's just an LLVM `AtomicFence` that lowers to no instructions, the same as it does on x86 and ARM.
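A minimal host-runnable sketch of that idea (the interrupt shims and counter are illustrative, not `bare_metal` API): `compiler_fence` emits no machine code and only constrains how the compiler may reorder memory accesses.

```rust
use std::sync::atomic::{compiler_fence, AtomicU32, Ordering};

// Stand-ins for real interrupt masking (e.g. `dint`/`eint` on MSP430);
// no-ops here so the sketch runs on a host.
fn disable_interrupts() {}
fn enable_interrupts() {}

static COUNTER: AtomicU32 = AtomicU32::new(0);

// Bracket the critical work with compiler fences: they lower to no
// instructions but stop the compiler from moving memory accesses
// across the interrupt-masking boundary.
fn critical_increment() -> u32 {
    disable_interrupts();
    compiler_fence(Ordering::SeqCst);
    let v = COUNTER.fetch_add(1, Ordering::Relaxed) + 1;
    compiler_fence(Ordering::SeqCst);
    enable_interrupts();
    v
}

fn main() {
    let a = critical_increment();
    let b = critical_increment();
    assert_eq!(b, a + 1);
}
```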
How are the standard library mutexes handling this with regard to acquire/release? `MutexGuard`s should have a similar problem, shouldn't they?
> How are core library mutexes doing it wrt to acquire/release? Mutexguards should have a similar problem, ain’t they?
I think that in the case of the `std::sync::MutexGuard` they can rely on drop, because if the guard doesn't get dropped the mutex will remain locked forever. That might also make sense for us, and I was going to try that today, but I realized that `CriticalSection` is `Copy`, and because of that we can't implement `Drop` for it.
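For contrast, here is a sketch of the std-style pattern being described: a non-`Copy` guard can release its lock in `Drop` (Rust forbids `Copy` types from implementing `Drop`, which is why `CriticalSection` can't take this route). The `Lock`/`Guard` names are illustrative, not any crate's API.

```rust
use std::cell::Cell;

// A toy lock whose guard releases it on Drop, like std::sync::MutexGuard.
struct Lock {
    locked: Cell<bool>,
}

struct Guard<'a> {
    lock: &'a Lock,
}

impl Lock {
    fn acquire(&self) -> Guard<'_> {
        assert!(!self.locked.get(), "already locked");
        self.locked.set(true);
        Guard { lock: self }
    }
}

impl Drop for Guard<'_> {
    // Runs when the guard leaves scope, unless the caller mem::forget()s it.
    fn drop(&mut self) {
        self.lock.locked.set(false);
    }
}

// Returns (locked while the guard is alive, locked after it is dropped).
fn demo() -> (bool, bool) {
    let lock = Lock { locked: Cell::new(false) };
    let during;
    {
        let _g = lock.acquire();
        during = lock.locked.get();
    }
    (during, lock.locked.get())
}

fn main() {
    assert_eq!(demo(), (true, false));
}
```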
As discussed in the Matrix chat, it seems that `fence` (instead of `compiler_fence`) is only needed when dealing with a multi-master bus, which is out of scope for this crate, so I changed it to use just a `compiler_fence`.
I also provided a solution for the msp430 case, but the one I found will only compile on nightly. That seems to be the case for the `msp430` crate too, so it seems okay. CC @cr1901
CC @adamgreig
@thalesfragoso I don't have a problem making `bare_metal` nightly on msp430; we can fix that later. The `msp430` crate is indeed nightly for now and will likely remain so until `asm!` stabilizes.
I want a second opinion though: cc: @pftbest and @awygle.
Could we just document that `CriticalSection::new` requires the caller to ensure a compiler fence is inserted, and not otherwise change the body? That way `cortex-m` and `msp430` will continue to work (since both take actions that result in fences being inserted when creating a CS), and no new code is added here.
That doesn't resolve the `get_mut()` issue, which I'm not yet convinced we actually need to change, though I'm certainly open to being wrong there. I'm also not sure whether riscv effectively inserts a fence when disabling interrupts, as I don't really understand how its register modification works.
I also agree with @adamgreig, the fences should be added only when CriticalSection is created, not on every borrow. For example, you obtained a critical section and then do 3 borrows in a row:
```rust
let cs = ...
let a = *FOO.borrow(cs).borrow();
let b = *BAR.borrow(cs).borrow();
let c = *FOO.borrow(cs).borrow();
```
You don't need a fence inserted between each borrow here, because nothing else can change the values of `FOO` and `BAR` while you are holding a critical section token. Adding fences here would be a pessimisation, because the compiler would no longer be able to see that `a` and `c` are the same exact value and optimize out the second access to `FOO`.
> I also agree with @adamgreig, the fences should be added only when CriticalSection is created, not on every borrow. For example, you obtained a critical section and then do 3 borrows in a row:
That's already the case.
> That's already the case.
I missed that, sorry.
To summarise discussion in today's meeting on Matrix: we require a fence at the end of the critical section as well, because otherwise the compiler could re-order a store to after the CS finishes, which could race with another thread. Since the CS might be safely forgotten, there's no way to ensure this inside the CS type. Instead, we should add a safety precondition to the CS documentation to say that users (the architecture crates such as cortex-m, cortex-a, msp430, riscv, etc) must ensure a compiler fence is inserted immediately after entering and before leaving the CS.
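The contract above can be sketched as follows; `mask_interrupts`/`unmask_interrupts` are stand-ins for the real primitives (`cpsid i`/`cpsie i` on Cortex-M, `dint`/`eint` on MSP430) and are no-ops here so the sketch runs on a host.

```rust
use std::sync::atomic::{compiler_fence, Ordering};

// Illustrative interrupt-masking shims; real arch crates use asm or FFI.
fn mask_interrupts() {}
fn unmask_interrupts() {}

// The safety precondition described above: a compiler fence immediately
// after entering the critical section and immediately before leaving it,
// so stores cannot be hoisted out of, or sunk past, the section.
fn with_critical_section<R>(f: impl FnOnce() -> R) -> R {
    mask_interrupts();
    compiler_fence(Ordering::SeqCst);
    let r = f();
    compiler_fence(Ordering::SeqCst);
    unmask_interrupts();
    r
}

fn main() {
    assert_eq!(with_critical_section(|| 2 + 2), 4);
}
```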
Currently this is mostly the case; for example, cortex-m uses either an FFI call or inline asm with a memory clobber to enable and disable interrupts, which has the effect of ensuring a suitable fence. We should make these fences explicit (although it's a shame that `compiler_fence` has a potential panic in debug builds).
So, the update to `bare_metal` can probably just be safety documentation on `CriticalSection`, but the architecture crates will all need updating to ensure suitable fences are used.
I don't think this resolves the `Mutex::get_mut()` concern yet, though. I'm not sure how that should best be resolved. Since it requires `&mut self`, there shouldn't be any way in safe Rust to trigger this problem anyway...
As discussed in the meeting, it seems that the best approach to this problem is to add one more safety condition to the `CriticalSection::new` method and rely on the arch crates to implement it correctly. Moreover, a closure approach seems to handle this better.
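A sketch of what the closure approach looks like: the token is only handed to the closure by reference, so it cannot outlive the section. The names here are illustrative, not the final `bare-metal` API, and the enter/exit steps are elided so this runs on a host.

```rust
use std::marker::PhantomData;

// A zero-sized token proving a critical section is active.
struct CriticalSection<'cs> {
    _lifetime: PhantomData<&'cs ()>,
}

// The closure receives the token by reference, so it cannot leak it out
// of the section and use it after interrupts are re-enabled.
fn free<R>(f: impl for<'cs> FnOnce(&CriticalSection<'cs>) -> R) -> R {
    // enter: mask interrupts + compiler fence (elided in this host sketch)
    let cs = CriticalSection { _lifetime: PhantomData };
    let r = f(&cs);
    // exit: compiler fence + unmask interrupts (elided)
    r
}

fn main() {
    let doubled = free(|_cs| 21 * 2);
    assert_eq!(doubled, 42);
}
```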
I'm closing this in favor of this new solution; `bare-metal` should only need to update its documentation.
Without the fences the Mutex doesn't behave as expected.
Given this example:
It's expected that the value of `FOO` will be evaluated at each loop pass; right now that doesn't happen: `FOO` is only evaluated once, and if the value is false then the function branches to an infinite loop: godbolt link (remove the comment to add the fence and see the difference).
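The shape of that loop can be sketched like this. The original godbolt example polled a plain `bool` behind `Mutex<RefCell<...>>`; here an `AtomicBool` with `Relaxed` loads is used so the host sketch is well-defined, and the "interrupt" is faked from inside the loop so the program terminates.

```rust
use std::sync::atomic::{compiler_fence, AtomicBool, Ordering};

static FOO: AtomicBool = AtomicBool::new(false);

// Stand-in for an ISR that would set the flag asynchronously.
fn pretend_interrupt() {
    FOO.store(true, Ordering::Relaxed);
}

// Poll FOO until it becomes true; the fence tells the compiler that
// memory may have changed on every pass, so the read cannot be hoisted
// out of the loop.
fn wait_for_foo() -> u32 {
    FOO.store(false, Ordering::Relaxed); // reset so host runs are repeatable
    let mut passes = 0;
    while !FOO.load(Ordering::Relaxed) {
        compiler_fence(Ordering::SeqCst);
        passes += 1;
        if passes == 3 {
            pretend_interrupt(); // fake the interrupt so the host run ends
        }
    }
    passes
}

fn main() {
    assert_eq!(wait_for_foo(), 3);
}
```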
I decided to go for a `fence` instead of a `compiler_fence` to cover the cases where the processors have a cache or are out of order. We could change `SeqCst` for a pair of `Acquire` and `Release`, but then I think we would have to put one of them in the `Drop` implementation of the `CriticalSection`, and we can't really rely on drop running (`mem::forget`), so I went with `SeqCst`; maybe I'm missing some other way... It seems that we would need to yank previous versions, unfortunately.
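To illustrate the trade-off being described (all names here are illustrative): if the `Release` half of the pair lives in `Drop`, `mem::forget` skips it silently, which is the argument for a single `SeqCst` fence on entry instead.

```rust
use std::mem;
use std::sync::atomic::{fence, AtomicBool, Ordering};

// Tracks whether the Release fence in Drop ever ran (for the host demo).
static RELEASED: AtomicBool = AtomicBool::new(false);

struct Section;

impl Section {
    fn enter() -> Section {
        fence(Ordering::Acquire); // the Acquire half of the pair
        Section
    }
}

impl Drop for Section {
    fn drop(&mut self) {
        fence(Ordering::Release); // the Release half lives in Drop...
        RELEASED.store(true, Ordering::Relaxed);
    }
}

// ...so mem::forget skips Drop and the Release fence is silently lost.
fn forget_skips_release() -> bool {
    RELEASED.store(false, Ordering::Relaxed);
    let s = Section::enter();
    mem::forget(s);
    !RELEASED.load(Ordering::Relaxed)
}

fn main() {
    assert!(forget_skips_release());
}
```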