zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

tests/kernel/common fails at test_atomic on twr_ke18f board #16229

Closed hakehuang closed 5 years ago

hakehuang commented 5 years ago

Describe the bug: test_atomic hits an MPU failure

To Reproduce Steps to reproduce the behavior:

  1. mkdir build; cd build
  2. cmake -DBOARD=twr_ke18f ..
  3. make
  4. See error

Expected behavior: test PASS

Impact: the kernel does not work properly

Screenshots or console output

Running test suite common

===================================================================

starting test - test_verify_bootdelay

PASS - test_verify_bootdelay

===================================================================

starting test - test_irq_offload

PASS - test_irq_offload

===================================================================

starting test - test_byteorder_memcpy_swap

PASS - test_byteorder_memcpy_swap

===================================================================

starting test - test_byteorder_mem_swap

PASS - test_byteorder_mem_swap

===================================================================

starting test - test_atomic

ASSERTION FAIL [0] @ ./arch/arm/core/cortex_m/mpu/nxp_mpu.c:563

    Configuring 3 dynamic MPU regions failed

[00:00:00.560,000] <err> mpu: Failed to allocate new MPU region 8


***** Kernel Panic! *****

Current thread ID = 0x2000077c

Faulting instruction address = 0xda4c

Environment (please complete the following information):

hakehuang commented 5 years ago

@MaureenHelm

ioannisg commented 5 years ago

I can check it on the freedom board - is this similar?

hakehuang commented 5 years ago

@ioannisg Which frdm board? We do not have ke18 frdm support, and I do not see this issue on other Zephyr-enabled frdm boards.

ioannisg commented 5 years ago

> @ioannisg Which frdm board? We do not have ke18 frdm support, and I do not see this issue on other Zephyr-enabled frdm boards.

It seems the error is in the nxp_mpu.c module. I have tested this on frdm_k64f and the test is passing, so I was wondering how similar that board/MCU is to the one I have. Unfortunately, I don't have the twr_ke18f to test this directly; how many MPU regions does that board have?

hakehuang commented 5 years ago

@ioannisg According to the reference manual https://www.nxp.com/docs/en/reference-manual/KE1xFP100M168SF0RM.pdf , the MPU has "8 program-visible 128-bit region descriptors, accessible by four 32-bit words each". I checked the board MPU config and only 4 regions are defined there, but according to the feature settings:

/* @brief Total number of MPU slave. */
#define FSL_FEATURE_SYSMPU_SLAVE_COUNT (4)
/* @brief Total number of MPU master. */
#define FSL_FEATURE_SYSMPU_MASTER_COUNT (3)

Maybe this is the problem.

ioannisg commented 5 years ago

This clearly looks like an issue with the number of MPU regions that are available.

FYI, test_atomic is the only test in the suite that is executed in "user" mode, which comes with a few default application memory definitions. These seem to exceed the available number of regions, causing the test to fail.

@hakehuang could you please "extract" the same information for board frdm_k64f and compare?

If my assumption is right, then we can't do much; we might have to exclude this platform from this test.

hakehuang commented 5 years ago

@ioannisg These are the k64f values:

/* @brief Total number of MPU slave. */
#define FSL_FEATURE_SYSMPU_SLAVE_COUNT (5)
/* @brief Total number of MPU master. */
#define FSL_FEATURE_SYSMPU_MASTER_COUNT (6)

Besides, can we do something to make this case still work on the KE platform? Skipping a case makes me nervous, and many other failures may be related to the same problem, e.g. #16225.

henrikbrixandersen commented 5 years ago

@hakehuang @ioannisg I contributed the NXP KE1xF port, but I think I'll need some help in understanding this issue further.

hakehuang commented 5 years ago

@henrikbrixandersen Per my understanding, user mode requires more MPU regions for a given thread; see z_arch_configure_dynamic_mpu_regions in arch/arm/core/cortex_m/mpu/arm_core_mpu.c. But I cannot figure out how to avoid this. @ioannisg correct me if I am wrong.

henrikbrixandersen commented 5 years ago

@hakehuang Thanks. I have further examined this issue and it is caused by the limited number of MPU regions available on the KE1xF series MPU.

The KE1xF has 8 regions (whereas the K6x has 12). The number of MPU regions is reflected by the FSL_FEATURE_SYSMPU_DESCRIPTOR_COUNT definition.

5 of these regions are statically defined (debugger, two background regions due to the NXP MPU giving priority to granting permission over denying access, flash and SRAM) in soc/arm/nxp_kinetis/ke1xf/nxp_mpu_regions.c.

This leaves us 3 regions for dynamic assignment; one is used for the thread stack (ARM_CORE_MPU_NUM_MPU_REGIONS_FOR_THREAD_STACK), two are used for the stack guard (ARM_CORE_MPU_NUM_MPU_REGIONS_FOR_MPU_STACK_GUARD) due to the same reasons as for the background regions above - this leaves us with zero regions to be used for userspace.

I really do not see any way to work around this.

andrewboie commented 5 years ago

> @hakehuang Thanks. I have further examined this issue and it is caused by the limited number of MPU regions available on the KE1xF series MPU.
>
> The KE1xF has 8 regions (whereas the K6x has 12). The number of MPU regions is reflected by the FSL_FEATURE_SYSMPU_DESCRIPTOR_COUNT definition.
>
> 5 of these regions are statically defined (debugger, two background regions due to the NXP MPU giving priority to granting permission over denying access, flash and SRAM) in soc/arm/nxp_kinetis/ke1xf/nxp_mpu_regions.c.

Does the NXP MPU support the concept of a background mapping? That is to say, a default permission for all memory (supervisor access, user mode no access). 5 seems like a lot. Are there any opportunities to reduce their number? Keep in mind that when it comes to user mode, we are not trying to restrict memory access to any threads running in supervisor mode. For example, why do we need a region for the debugger?

> This leaves us 3 regions for dynamic assignment; one is used for the thread stack (ARM_CORE_MPU_NUM_MPU_REGIONS_FOR_THREAD_STACK), two are used for the stack guard (ARM_CORE_MPU_NUM_MPU_REGIONS_FOR_MPU_STACK_GUARD) due to the same reasons as for the background regions above - this leaves us with zero regions to be used for userspace.
>
> I really do not see any way to work around this.

The thread stack region is required for user mode. The thread stack guard is not and that associated config can be disabled, freeing a region.

henrikbrixandersen commented 5 years ago

> Does the NXP MPU support the concept of a background mapping? That is to say, a default permission for all memory (supervisor access, user mode no access)

Not as far as I can tell from the data sheet. Perhaps @MaureenHelm has more insights on the NXP MPUs and our options.

> 5 seems like a lot. Are there any opportunities to reduce their number?

I have tried, but I was not able to reduce the number of statically allocated regions below 5.

> Keep in mind that when it comes to user mode, we are not trying to restrict memory access to any threads running in supervisor mode. For example, why do we need a region for the debugger?

As far as I understand the data sheet, MPU region 0 is special. At boot time, it grants all bus masters read/write/execute permissions to the entire address space. The CPU core cannot remove those permissions; it can only change which bus masters, in addition to the debugger, are covered by region 0. This is to ensure the debugger will always have access to the entire address space.

> The thread stack region is required for user mode. The thread stack guard is not and that associated config can be disabled, freeing a region.

That would actually free up two regions on the NXP MPU.