Open danielappiagyei-bc opened 4 months ago
Hi @danielappiagyei-bc, you said it is easy to reproduce. Do you have a simple example? Maybe it could be included in ostest to catch this kind of issue.
Maybe @davids5 or @patacongo can have some ideas why register/volatile prevents the issue from happening.
Will get back to this in a few weeks. Appreciate the patience, I'm busy with work and personal matters.
In armv7-m/arm_cache.c (and potentially other archs' arm_cache.c files as well, I haven't looked), `up_disable_dcache()` reads some loop variables from memory, disables the D-cache, then cleans and invalidates the cache in a loop. When the cache has been configured to operate in WRITE_BACK mode, I am seeing that these variables, which are set only once before the do-while loops, have their values changed mid-loop from `ways = 3`, `sets = 255` on the imxrt-1064 to gigantic numbers like 1 billion, causing the loop to execute incorrectly and crash. This behavior is consistent and reproducible.

I had to look at the NXP-provided code for disabling dcache to help understand the error.
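For anyone not familiar with set/way maintenance, the loop shape described above can be sketched roughly like this. It is a host-runnable illustration, not the NuttX or NXP source: every `sketch_*` / `SKETCH_*` name is hypothetical, the CCSIDR value is an assumed one picked so that it decodes to `ways = 3` and `sets = 255`, and the write to the clean-and-invalidate-by-set/way register is replaced by a counter.

```c
#include <stdint.h>

/* Assumed CCSIDR value (not read from hardware): NumSets-1 = 255,
 * Associativity-1 = 3, LineSize field = 1 (32-byte lines).
 */
#define SKETCH_CCSIDR 0x001fe019u

static uint32_t g_sketch_ops; /* Number of simulated set/way writes */

/* Hypothetical stand-in for writing the clean+invalidate set/way register */
static void sketch_write_dccisw(uint32_t sw)
{
  (void)sw;
  g_sketch_ops++;
}

/* CCSIDR field decoding: bits [27:13] = sets-1, bits [12:3] = ways-1 */
static uint32_t sketch_sets(uint32_t ccsidr) { return (ccsidr >> 13) & 0x7fff; }
static uint32_t sketch_ways(uint32_t ccsidr) { return (ccsidr >> 3) & 0x3ff; }

/* Portable count-leading-zeros, used to position the way index */
static uint32_t sketch_clz(uint32_t v)
{
  uint32_t n = 0;

  if (v == 0)
    {
      return 32;
    }

  while ((v & 0x80000000u) == 0)
    {
      v <<= 1;
      n++;
    }

  return n;
}

/* Walk every set/way pair once; returns the number of operations issued */
static uint32_t sketch_clean_invalidate_all(uint32_t ccsidr)
{
  uint32_t sets   = sketch_sets(ccsidr);      /* Set once, before the loops */
  uint32_t ways   = sketch_ways(ccsidr);      /* Set once, before the loops */
  uint32_t sshift = (ccsidr & 0x7) + 4;       /* log2(line size in bytes) */
  uint32_t wshift = sketch_clz(ways) & 0x1f;  /* Way index sits in the top bits */

  g_sketch_ops = 0;
  do
    {
      uint32_t tmpways = ways;

      do
        {
          sketch_write_dccisw((tmpways << wshift) | (sets << sshift));
        }
      while (tmpways--);
    }
  while (sets--);

  return g_sketch_ops;
}
```

With the assumed value, the outer loop runs 256 times and the inner loop 4 times, so the walk should issue 1024 operations. If `sets` or `ways` were overwritten partway through, as reported above, the loop bounds and the set/way words would both be wrong.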
They use the `register` keyword when declaring the function's local variables to somehow prevent the invalidation and cleaning of dcache from mangling their values. (My understanding is that the compiler isn't guaranteed to put the value in a register; it's just a hint, which modern compilers often ignore since they're better at optimization. The keyword is also deprecated in C++.)
Anyway, I modified NuttX's `up_disable_dcache()` to declare every local variable as `volatile`, which stops the compiler from optimizing away, caching, or reordering accesses to those variables. For what it's worth, I was compiling with GCC at the `-Os` optimization level. See cppreference on `volatile`.
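To make the shape of that modification concrete, here is a minimal host-runnable sketch with the loop locals declared `volatile`. It is not the actual patch: the `vol_*` names are hypothetical, the register write is replaced by a counter, and the shift values passed in the test are assumptions matching a 4-way cache with 32-byte lines. Note that `volatile` only tells the compiler that each read and write must be performed as written; it does not by itself decide whether the variable lives in a register or on the stack.

```c
#include <stdint.h>

static uint32_t g_vol_ops; /* Number of simulated set/way writes */

/* Hypothetical stand-in for the clean+invalidate set/way register write */
static void vol_write_dccisw(uint32_t sw)
{
  (void)sw;
  g_vol_ops++;
}

/* Same do-while structure as the original loop, but every local is
 * volatile so the compiler may not elide, merge, or reorder its accesses.
 */
static uint32_t vol_clean_invalidate(uint32_t sets, uint32_t ways,
                                     uint32_t sshift, uint32_t wshift)
{
  volatile uint32_t vsets   = sets;
  volatile uint32_t vways   = ways;
  volatile uint32_t vsshift = sshift;
  volatile uint32_t vwshift = wshift;
  volatile uint32_t tmpways;

  g_vol_ops = 0;
  do
    {
      tmpways = vways;

      do
        {
          vol_write_dccisw((tmpways << vwshift) | (vsets << vsshift));
        }
      while (tmpways--);
    }
  while (vsets--);

  return g_vol_ops;
}
```

On a host this behaves exactly like the non-volatile loop; any difference on the target would come from how the compiler places and accesses the locals, which is the part that still needs explaining.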
This change fixed the bug and stopped my app from crashing. Maybe someone more familiar with caches and the C/C++ language can better explain why `register` and `volatile` worked? If it's just about preventing reordering by the compiler, then maybe instead of `volatile`, some `ARM_ISB()` or `ARM_DMB()` calls would work as well? I haven't tested that, but I wanted to let you guys know about this issue and that it may be present in other ARM arch files.
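For discussion, the barrier idea might look roughly like the sketch below. It is untested against the actual bug and makes several assumptions: the `SKETCH_*` macros are hypothetical stand-ins for the barrier macros named above (real `dsb`/`isb` instructions on ARMv7 and later with GCC/Clang, a plain compiler barrier on other GCC-compatible hosts), `g_fake_ccr` simulates the cache control register with its D-cache enable bit assumed at bit 16, and the set/way register write is again a counter.

```c
#include <stdint.h>

/* Hypothetical barrier macros: real instructions on ARMv7+ builds,
 * compiler-only barriers elsewhere so the sketch still runs on a host.
 */
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) && \
    (__ARM_ARCH >= 7)
#  define SKETCH_DSB() __asm__ __volatile__("dsb sy" ::: "memory")
#  define SKETCH_ISB() __asm__ __volatile__("isb sy" ::: "memory")
#elif defined(__GNUC__)
#  define SKETCH_DSB() __asm__ __volatile__("" ::: "memory")
#  define SKETCH_ISB() __asm__ __volatile__("" ::: "memory")
#else
#  define SKETCH_DSB() ((void)0)
#  define SKETCH_ISB() ((void)0)
#endif

#define SKETCH_CCR_DC (1u << 16)            /* Assumed D-cache enable bit */

static uint32_t g_fake_ccr = SKETCH_CCR_DC; /* Simulated control register */
static uint32_t g_bar_ops;                  /* Simulated set/way writes */

static void bar_write_dccisw(uint32_t sw)
{
  (void)sw;
  g_bar_ops++;
}

static uint32_t bar_disable_dcache(uint32_t sets, uint32_t ways,
                                   uint32_t sshift, uint32_t wshift)
{
  /* Complete outstanding accesses before touching the enable bit */
  SKETCH_DSB();
  g_fake_ccr &= ~SKETCH_CCR_DC;
  SKETCH_DSB();
  SKETCH_ISB();

  g_bar_ops = 0;
  do
    {
      uint32_t tmpways = ways;

      do
        {
          bar_write_dccisw((tmpways << wshift) | (sets << sshift));
        }
      while (tmpways--);
    }
  while (sets--);

  /* Make sure the maintenance operations have finished before returning */
  SKETCH_DSB();
  SKETCH_ISB();

  return g_bar_ops;
}
```

Barriers constrain ordering, while `volatile` constrains how the compiler accesses the variables, so the two are not obviously interchangeable here. Someone with the hardware would need to check whether either one actually addresses the corrupted loop variables.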