npitre closed this issue 2 years ago
Why not at least describe the issues here? Like, copy them from the PR, since this issue was somehow created from the PR. Why was this created at all?
Fair enough, will do.
Why was this created at all?
Because this is the formal procedure when you want to merge a hotfix into the upcoming release quickly.
A few issues prevent RISC-V from working properly in an SMP configuration.
Fixes for those issues are provided in PR #50679.
riscv: pmp: fix stackguard when used on SMP
The IRQ stack in particular is different on each CPU, and so is its stack guard PMP entry value. This creates 2 issues:
The assertion ensuring the last global PMP address is the same for each CPU fails;
That last global PMP address can't be relied upon to create a single-slot per-thread TOR mapping.
Fix both issues by not remembering the actual address for the last global entry but a dummy address instead that is guaranteed not to match any opportunistic single-slot TOR mapping.
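To illustrate why a dummy address prevents the sharing, here is a minimal sketch, not the code from the PR. It relies on the fact that a TOR entry matches addresses between the previous entry's `pmpaddr` value and its own, so a region needs only one slot when its start equals the previous entry's address. The names `PMP_ADDR`, `PMP_ADDR_DUMMY` and `tor_slots_needed` are hypothetical, and the all-ones dummy value is an assumption chosen for this example.

```c
#include <stdint.h>

/* pmpaddr CSRs hold the physical address shifted right by 2. */
#define PMP_ADDR(addr) ((uintptr_t)(addr) >> 2)

/*
 * Hypothetical dummy value (an assumption for this sketch): all ones.
 * A shifted address always has its two top bits clear, so no real
 * region start can ever compare equal to this value.
 */
#define PMP_ADDR_DUMMY UINTPTR_MAX

/*
 * Number of PMP slots a TOR mapping starting at region_start needs,
 * given the value remembered for the last global entry.
 * One slot suffices only when the region begins exactly where the
 * previous entry ends; otherwise an extra slot must set the lower bound.
 */
static int tor_slots_needed(uintptr_t prev_pmpaddr, uintptr_t region_start)
{
	return (PMP_ADDR(region_start) == prev_pmpaddr) ? 1 : 2;
}
```

With the real per-CPU address remembered, a thread region that happens to start at that address would take the single-slot path on one CPU but be wrong on another. With the dummy value remembered, the two-slot path is always taken.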
riscv: PMP-based stack guard is incompatible with stack sentinel
The software-based stack sentinel writes to the very bottom of the stack area, which triggers the PMP stack protection. The two can't be used together.
riscv: fix crash resulting from touching the initial stack's guard area
The interrupt stack is used as the system stack during kernel initialization while IRQs are not yet enabled. The sp register is set to z_interrupt_stacks + CONFIG_ISR_STACK_SIZE.
CONFIG_ISR_STACK_SIZE only represents the desired usable stack size. This does not take into account the added guard area. The result is a stack whose pointer is much closer to the trigger zone than expected when CONFIG_PMP_STACK_GUARD=y, and the SMP configuration in particular pushes it over the edge during many CI test cases.
Worse: during early init we're not quite ready to handle exceptions yet and complete havoc ensues with no meaningful debugging output.
Make sure the early assembly code locates the actual top of the stack by generating a constant with its true size.
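The arithmetic behind this can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the sizes are made-up values, `isr_stack` stands in for `z_interrupt_stacks`, and the guard is assumed to sit at the low end of the stack object.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real ones come from Kconfig. */
#define ISR_STACK_SIZE   2048u /* stands in for CONFIG_ISR_STACK_SIZE (usable size) */
#define STACK_GUARD_SIZE 1024u /* assumed guard area at the low end of the object */

/* True size of the stack object: guard area plus usable area. */
#define ISR_STACK_TRUE_SIZE (STACK_GUARD_SIZE + ISR_STACK_SIZE)

static uint8_t isr_stack[ISR_STACK_TRUE_SIZE];

/* Initial sp computed from the usable size only: lands below the real top. */
static uint8_t *initial_sp_naive(void)
{
	return isr_stack + ISR_STACK_SIZE;
}

/* Initial sp computed from the true object size: the actual top of the stack. */
static uint8_t *initial_sp_fixed(void)
{
	return isr_stack + ISR_STACK_TRUE_SIZE;
}

/* Bytes available between sp and the upper edge of the guard area. */
static size_t headroom_above_guard(const uint8_t *sp)
{
	return (size_t)(sp - (isr_stack + STACK_GUARD_SIZE));
}
```

With these example numbers, the naive pointer leaves only 1024 bytes before the guard is hit, while the fixed one leaves the full 2048 bytes that the usable size promises.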
tests/semaphore: fix "cpu test took too long" assertion failure
The SMP config for RISC-V on QEMU triggers this:
Looping 10000 times is maybe a bit excessive.
riscv: smp: update the qemu_riscv32/64 configs
No usermode or stack guard CI tests are performed if CONFIG_RISCV_PMP is not set.
In turn, this requires a larger privileged stack on RV64, just like the non-SMP case.