Open japaric opened 5 years ago
AFAICT, there's no Cortex-M memory mapped register that returns a "core id"
Yeah, on the homogenous dual Cortex-M4 part of my Beagleboard X15 demo, I found the TI example using the Cortex-M4 Peripheral ID 0 register to determine which core was running. See https://github.com/cambridgeconsultants/rust-beagleboardx15-demo/blob/master/bare-metal/ipu-demo/src/main.rs#L435. Unfortunately it's vendor defined as to what they put in there :/ Perhaps it's something we can bring out in the chip crates, instead of being able to solve it for all Cortex-M cores.
The 'core local take' would work if the lock variable was in a core-specific address space, like the peripheral, right? So, like a thread-local variable. I think the AM5728 had a block of RAM for each core, each mapped to the same address range, but I don't know if that's a universal thing or just an oddity of this implementation. You'd also need to ensure each core initialised its own memory correctly.
I think the current implementations of Send
are already correct. In standard Rust, Send
models transferring data between threads within the same process (which have access to all of its resources), which is akin to moving between interrupt handlers running on the same core, or between cores that share a single set of resources (that is, all memory and peripherals).
Since most (all?) multi-core MCUs have core-local peripherals or memory, transferring objects between the cores is more similar to cross-process communication in standard Rust and should not be modeled with Send
, but a custom trait.
Background
Private Peripheral Bus (PPB)
All the peripherals in the
cortex_m::Peripherals
struct sit on the Private Peripheral Bus and have addresses of the form0xE00x_xxxx
. Most of these peripherals are interfaces to internal resources -- there's one instance of those resources per core. Examples of internal PPB peripherals are theNVIC
,MPU
,SYST
(system timer),ITM
, etc. The rest of peripherals are interfaces to external resources meaning that all cores access the same resources through these interfaces. An example of an external resource is theTPIU
.The bottom line here is that interacting with a singleton like
CPUID
on one core is different from interacting with it from another core even though they are supposed to be the same singleton. For example, theCPUID.base.read()
operation can return different values depending on which core is executed.Multi-core API
In RTFM land we are exploring multi-core applications in two modes: homogeneous mode and heterogeneous mode. Of course, it's very early days so we still don't know what the ecosystem will adopt but these APIs let us show the problems with the peripheral singletons.
Homogeneous
In this mode a binary crate is compiled using a single (compilation) target and a single ELF image is produced. The image contains the entry points for all cores and
static
variables have global visibility (visible to all cores) by default.If this program is compiled for the
thumbv8m.main-none-eabi
then the resulting image can be flashed, for example, on a 2x Cortex-M33 device.Heterogeneous
In this mode a binary crate is compiled for multiple (compilation) targets so multiple ELF images are produced. There's one image for each core and each image contains the entry point and
static
variables for that core. There's a special linker section named.shared
used to sharestatic
variables between cores: make them visible to all cores. One opts into this.shared
section using the#[shared]
attribute; the default is thatstatic
variables are visible only to the core where it was defined.If this program is compiled for the
thumbv7em-none-eabihf
andthumbv6m-none-eabi
targets then the resulting image can be flashed on a Cortex-M4F + Cortex-M0+ device, for example.Issues
A.
Send
is wrongBy definition,
Send
means that it's (memory) safe to transfer ownership of a resource from one thread / core to another. In the case of these peripheral singletons transferring them from one core to another is wrong because that changes the meaning of the value.The other issue with
Send
is that makes it possible to break the singleton invariant: one core can send another instance of e.g.DWT
to a core that already has one such instance.B.
take()
is unsoundThe current implementation of
cortex_m::Peripherals::take
looks like this:This is unsound in homogeneous multi-core mode because
interrupt::free
doesn't synchronize multi-core access tostatic mut
variables; it only synchronizes accesses from the same core.Using
AtomicBool.compare_swap
or a similar API would make this multi-core memory safe but that would not work on ARMv6-M because that CAS API doesn't exist onthumbv6m-none-eabi
.C.
take()
is wrongThe following program panics in homogeneous multi-core mode but should work.
This panics because both
Peripherals::take
are accessing the same guard. However, it is OK for each core to access its own DWT peripheral / cycle counter instance.Potential countermeasures
!Send
To avoid issue (A) we could remove the
Send
implementation from all the peripheral singletons. The downside of this approach is we would also forbid sending a peripheral singleton frommain
or a interrupt handler to another within the same core.LocalSend
Another alternative to avoid (A) is to remove the
Send
implementation from the singletons and instead implement a newLocalSend
trait that means safe to send within execution contexts running on the same core. Then frameworks like RTFM can require theLocalSend
for message passing within one core and theSend
trait for cross-core message passing.The downside of this approach is that it requires using the nightly channel because
auto trait
s, which are required to bridgeSend
andLocalSend
, are unstable.Core-local
take
One way to deal with (B) and (C) is to have one guard
static
variable per core. Assuming a dual core systemtake
would be written as follows:The problem is implementing
core_id
in homogeneous mode. AFAICT, there's no Cortex-M memory mapped register that returns a "core id" (cf. RISC-Vmhartid
); nor there is a processor register that can be used to hold a "core id" (cf. RISC-V registers:x3
(global pointer) andx4
(thread pointer)) -- though one could use the usually unused PSP (Process Stack Pointer) register for this purpose.Global singletons
A completely different approach to peripheral access that avoids the three aforementioned issues is a global singleton API. For example:
The downside of this approach is that because there's no ownership it's hard to build abstractions on top of peripherals. One could add a panicky API to seal peripherals:
Ownership could be emulated by first sealing the peripheral and then having the abstraction access the peripheral exclusively through the
borrow_unchecked
API.Internal vs external resources
In the case of external resources like the
TPIU
I think we want to keep the existing owned singleton /take
API, with the semantics that only one core can take these peripherals, because more than one core should not access these resources in an unsynchronized fashion.