Open jgrivera67 opened 4 months ago
Hi @jgrivera67! We appreciate you submitting your first issue for our open-source project. 🌟
Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙
Hi @jgrivera67, could you please confirm if you needed the attached "ARMv8-R-SMP-bug-fixes-patch.txt" or was setting the VBAR for secondary cores enough to fix the issue?
Hi @jgrivera67, could you please confirm if you needed the attached "ARMv8-R-SMP-bug-fixes-patch.txt" or was setting the VBAR for secondary cores enough to fix the issue?
Hi @wearyzen, the changes from the patch are also necessary. The changes in smp.c are required for correctness as follows:
arm_cpu_boot_params.fn = fn;
arm_cpu_boot_params.arg = arg;
arm_cpu_boot_params.cpu_num = cpu_num;
barrier_dsync_fence_full(); // BUG FIX <--------- TO ENSURE THAT ASSIGNMENT BELOW IS SEEN IN THE RIGHT ORDER BY OTHER CORES
/ store mpid last as this is our synchronization point / arm_cpu_boot_params.mpid = cpu_mpid; @@ -152,6 +153,10 @@ void arch_cpu_start(int cpu_num, k_thread_stack_t stack, int sz, archcpustart / Wait secondary cores up, see arch_secondary_cpu_init */ while (arm_cpu_boot_params.fn) { wfe();
The change in cpu.h is necessary for multi-core platforms that have more than one cluster of cores, to ensure that each core in the system is unambiguously identified.
@jgrivera67 did you test the patch with core caches enabled? Or there are still other issues when caches are enabled? Just asking because I remember something you mentioned about this on discord.
@jgrivera67 did you test the patch with core caches enabled? Or there are still other issues when caches are enabled? Just asking because I remember something you mentioned about this on discord.
Hi @manuargue , booting zephyr on a multi-core non-cache coherent Cortex-R52 platform does not work with caches enabled. The "cache invalidation" change I proposed is more for consistency than to solve the non-cache-coherent problem. Since there was already a "cache invalidate" before the loop, we should have a "cache invalidate" inside the loop as well, as having the "cache invalidate". Even better, this loop that read arm_cpu_boot_params.fn
should use an "atomic load" instead (as shown below), and the code that updates it should use an "atomic store" . This way cache invalidation would not be necessary in the loop. Also, for non-cache coherent SMP support, we should have a special implementation of all atomic primitives that do cache invalidation in the ldaex/stlex loops.
arm_cpu_boot_params.fn = fn;
arm_cpu_boot_params.arg = arg;
arm_cpu_boot_params.cpu_num = cpu_num;
barrier_dsync_fence_full(); // BUG FIX
/* store mpid last as this is our synchronization point */
arm_cpu_boot_params.mpid = cpu_mpid;
barrier_dsync_fence_full();
/*! TODO: Support PSCI
* \todo Support PSCI
*/
/* Wait secondary cores up, see arch_secondary_cpu_init */
while (atomic_load(&arm_cpu_boot_params.fn)) { <-------------------- DO ATOMIC LOAD here
wfe();
}
@jgrivera67, thank you for your quick feedback.
I understood the cache invalidate and affinity changes but I see there is already a call to barrier_dsync_fence_full() after the mpid
is set in the below code so, the call before it is a bit confusing. As I understand since this patch is not actually tested could I assume that the extra call to barrier_dsync_fence_full()
is not needed?
arm_cpu_boot_params.arg = arg;
arm_cpu_boot_params.cpu_num = cpu_num;
barrier_dsync_fence_full(); // BUG FIX ----> Is this needed since there is an Existing call just below
/* store mpid last as this is our synchronization point */
arm_cpu_boot_params.mpid = cpu_mpid;
barrier_dsync_fence_full(); ----> // Existing call
Please correct me if I misunderstood or missed something and thank you again for the explanation and suggestions.
Hi @wearyzen, actually, I think the second (existing) barrier_dsync_fence_full()
could be removed, because the one necessary for cross-core correctness is the first one (new one). Secondary cores execute the following polling loop when they come out of reset:
dmb ld
ldr r2, [r0, #BOOT_PARAM_MPID_OFFSET] <-------- (**) read arm_cpu_boot_params.mpid
cmp r1, r2
bne 1b
A secondary core break out of this loop when the primary core executes the following for the MPID of the secondary core:
/* store mpid last as this is our synchronization point */
arm_cpu_boot_params.mpid = cpu_mpid;
So, we need to make sure that by the time that secondary core does the load at (**), the following assignments executed by the primary core are already committed to memory:
arm_cpu_boot_params.fn = fn;
arm_cpu_boot_params.arg = arg;
arm_cpu_boot_params.cpu_num = cpu_num;
To ensure that we need to add the memory barrier proposed in the patch:
arm_cpu_boot_params.fn = fn;
arm_cpu_boot_params.arg = arg;
arm_cpu_boot_params.cpu_num = cpu_num;
barrier_dsync_fence_full(); // BUG FIX
/* store mpid last as this is our synchronization point */
arm_cpu_boot_params.mpid = cpu_mpid;
Hi @jgrivera67,
I noticed that at smp.c#144, the cache is invalidated but not flushed after the primary core updates the arm_cpu_boot_params.mpid
so, the non-cache coherent secondary cores might not get the updated value. Could you please try the below patch along with your previous changes to set VBAR to see if booting works with caches enabled?
diff --git a/arch/arm/core/cortex_a_r/smp.c b/arch/arm/core/cortex_a_r/smp.c
index 0214b4715f6..d01087122a2 100644
--- a/arch/arm/core/cortex_a_r/smp.c
+++ b/arch/arm/core/cortex_a_r/smp.c
@@ -137,11 +137,12 @@ void arch_cpu_start(int cpu_num, k_thread_stack_t *stack, int sz, arch_cpustart_
arm_cpu_boot_params.arg = arg;
arm_cpu_boot_params.cpu_num = cpu_num;
+ barrier_dsync_fence_full();
+
/* store mpid last as this is our synchronization point */
arm_cpu_boot_params.mpid = cpu_mpid;
- barrier_dsync_fence_full();
- sys_cache_data_invd_range(
+ sys_cache_data_flush_and_invd_range(
(void *)&arm_cpu_boot_params,
sizeof(arm_cpu_boot_params));
Hi @wearyzen, this change does not help because the data cache for Cortex-R52 is "write-through", so doing a "flush & invalidate" instead of just "invalidate" does not make a difference.
Hi @jgrivera67,
I am trying to find to find a setup to try this but it could take a while. Meanwhile could you please help me with below questions?
z_arm_reset
of reset.S?Hi @wearyzen,
z_arm_platform_init
which is invoked from EL1_Reset_Handler
in arch/arm/core/cortex_a_r/reset.S
:
void z_arm_platform_init(void)
{
...
__set_SCTLR(__get_SCTLR() & ~SCTLR_C_Msk);
...
}
2. I don't need to disable the data cache to get past the loop at [here](https://github.com/zephyrproject-rtos/zephyr/blob/c44abdf5b7932b080f31e90cebbec07a8e092068/arch/arm/core/cortex_a_r/reset.S#L217). However the other fixes from the patch I included are necessary. However, if I don't disable the data caches, the secondary core gets stuck in the idle thread, and the primary core executes an application thread once, and then it gets stuck also in the idle thread.
3. Below is the UART output when booting my dual-core cortex-R52 platform when data caches are enabled on both cores:
Booting Zephyr OS build v3.6.0-7683-g1f3e2e55af9c Secondary CPU core 1 (MPID:0x10001) is up thread_a: Hello World from cpu 0
The expected output should be:
Booting Zephyr OS build v3.6.0-7683-g1f3e2e55af9c Secondary CPU core 1 (MPID:0x10001) is up thread_a: Hello World from cpu 0 thread_b: Hello World from cpu 1 thread_a: Hello World from cpu 0 thread_b: Hello World from cpu 1 thread_a: Hello World from cpu 0 thread_b: Hello World from cpu 1 ...
The only way to get this test app to work as expected is by disabling data caches on both cores.
Hi @jgrivera67, just wanted to let you know that I was able to reproduce the scenario and am working on it. I'll let you know if I have any updates on it.
@jgrivera67 after further investigation, it appears that making the Zephyr Kernel work on incoherent-cache systems would require more work than we have bandwidth for.
Currently Zephyr requires all kernel object to be in cache-coherent (or non-cached) area. You may be able to achieve this with dts configuration (marking some kernel region as non-cacheable) but this would need to be specific to your platform.
You may be able to achieve this with dts configuration (marking some kernel region as non-cacheable) but this would need to be specific to your platform.
hi @ithinuel, could you please share which kernel regions must be marked as non-cacheable? Thanks
Hi @manuargue, As far as I understand it is not a specific region. It is any kernel objects (so possibly dynamically allocated too?).
I am not very familiar with Cortex-R52 and don't know your system, so what follows is speculative:
You should be able to enable the MPU background region and its 0x6000_0000-0x7FFF_FFFF
space as a Normal, Shareable, Non-cacheable
space for zephyr's memory.
Then define a specific region that you'd like to be cached.
See the Cortex-R52's TRM
This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.
I'm trying to boot zephyr SMP on a non-cache coherent multi-core Cortex-R52 platform. Even with disabling the data cache for all cores, still zephyr crashes on the secondary cores. It turns out that this happens because zephyr only sets the VBAR (vector table base address) register for the primary core, but not for the secondary cores, which leaves the VBAR for the secondary cores with its reset value: 0. The problem is that if the zephyr image is loaded at an address other than 0x0, this will not work. I did an experiment setting the VBAR for the secondary cores to where the vector table is in my zephyr image and Zephyr booted fine on the secondary cores. So, is this an oversight in the zephyr port for ARMv8-R, or am I missing something?
Also, there are some other minor issues that I think should be fixed as part of correctly supporting SMP on ARMv8-R. Please see the patch file attached. ARMv8-R-SMP-bug-fixes-patch.txt