zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

tests/lib/mem_alloc/testcase.yaml#libraries.libc.minimal: Bus fault at test_malloc #13465

Closed cinlyooi-intel closed 5 years ago

cinlyooi-intel commented 5 years ago

To Reproduce

Steps to reproduce the behavior:

  1. mkdir build; cd build
  2. cmake -DBOARD=frdm_k64f -DARCH=arm ..
  3. make
  4. make run

Screenshots or console output

******** delaying boot 1000ms (per build configuration) *****
***** Booting Zephyr OS v1.14.0-rc1-248-gd5b2834f58 (delayed boot 1000ms) *****
Running test suite test_c_lib_dynamic_memalloc
===================================================================
starting test - test_malloc
***** BUS FAULT *****
  Precise data bus error
  BFAR Address: 0x20001158
  NXP MPU error, port 3
      Mode: User, Data Address: 0x20001158
      Type: Read, Master: 0, Regions: 0x8800
***** Hardware exception *****
Current thread ID = 0x20000b3c
Faulting instruction address = 0x7ee
Fatal fault in thread 0x20000b3c! Aborting.

Environment (please complete the following information):

ioannisg commented 5 years ago

Thanks, @cinlyooi-intel, I can reproduce this one. Looking at it.

ioannisg commented 5 years ago

@andrewboie, the test is failing because it tries to access the userspace_local_data, which, if I am not mistaken, is set to:

    new_thread->userspace_local_data =
        (struct _thread_userspace_local_data *)
        (K_THREAD_STACK_BUFFER(stack) + stack_size);

but only after stack_size is decremented (to exclude the userspace_local_data).

The test tries to access the data from user mode.

We need to discuss whether the error is:

The error appears now because of the fix for #12688, where we aligned the stack protection for NXP_MPU with that of ARM_MPU. For ARM7_MPU we probably do not see the error because the protected area is rounded up to a power of two.

I also see the error on ARMv8-M which, like NXP MPU, does not require power-of-two alignment.
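To make the rounding argument concrete, here is a minimal sketch in C of a simplified model, not the actual Zephyr MPU code. All names (`struct region`, `exact_region`, `pow2_region`, `region_covers`) and the numbers in the usage note are hypothetical, and the model ignores the alignment and granularity constraints of real MPUs. It only shows why a region sized exactly to the decremented stack_size misses the data placed right after it, while a region rounded up to a power of two can still happen to cover it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one MPU region covering [start, start + size). */
struct region {
    uintptr_t start;
    uint32_t size;
};

/* Round v up to the next power of two; v must be in 1..2^31. */
static uint32_t round_up_pow2(uint32_t v)
{
    assert(v != 0 && v <= 0x80000000u);
    uint32_t p = 1;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Region sized exactly as requested (assumed behavior without power-of-two sizing). */
static struct region exact_region(uintptr_t base, uint32_t stack_size)
{
    struct region r = { base, stack_size };
    return r;
}

/* Region whose size is rounded up to a power of two (assumed power-of-two sizing). */
static struct region pow2_region(uintptr_t base, uint32_t stack_size)
{
    struct region r = { base, round_up_pow2(stack_size) };
    return r;
}

/* True if the len bytes starting at addr lie entirely inside the region. */
static bool region_covers(struct region r, uintptr_t addr, uint32_t len)
{
    uintptr_t off = addr - r.start;
    return addr >= r.start && off <= r.size && len <= r.size - off;
}
```

For example, with a hypothetical 1024-byte stack buffer at 0x20001000 and 4 bytes reserved at the top, stack_size becomes 1020 and the reserved data sits at base + 1020. `region_covers(exact_region(base, 1020), base + 1020, 4)` is false, while `region_covers(pow2_region(base, 1020), base + 1020, 4)` is true because the region is rounded up to 1024 bytes.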

andrewboie commented 5 years ago

We have a problem in how the thread-specific storage area is being reserved. The intent of this region is for thread-specific data that the thread can manipulate directly, without having to make a syscall. Currently, we are using this to store errno.

What's happening is that the room for this area is being subtracted from the stack size being passed to _new_thread. With the NXP MPU, this results in this memory not being covered by the MPU region for the stack as it should be.

I'm still studying the code to determine the best solution. The way that stack memory bounds are accounted for differs in subtle ways across ARM, ARC, and x86, which makes this more confusing. My current thinking is that _new_thread needs an 'offset' parameter instead of subtracting the desired offset from the stack size passed in.
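One way to read the 'offset' idea is sketched below in C. This is an illustration under stated assumptions, not the change that was eventually made: `struct stack_layout` and `layout_with_offset` are hypothetical names, not Zephyr types or APIs, and the sketch ignores architecture-specific stack pointer alignment. The point is only that the caller passes the full stack size plus a separate offset, so the protected range still spans the whole buffer, including the reserved thread-specific data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of laying out a thread stack; not a Zephyr type. */
struct stack_layout {
    uintptr_t protected_start; /* first byte the thread may access */
    size_t protected_size;     /* number of bytes the MPU region should span */
    uintptr_t local_data;      /* address of the thread-specific data */
    uintptr_t initial_sp;      /* initial stack pointer (stack grows down) */
};

/*
 * Hypothetical helper: stack_size is the FULL buffer size, and offset is
 * the number of bytes reserved at the top for thread-specific data.
 */
static struct stack_layout layout_with_offset(uintptr_t base,
                                              size_t stack_size,
                                              size_t offset)
{
    assert(offset <= stack_size);

    struct stack_layout l;
    l.protected_start = base;
    l.protected_size = stack_size;             /* size is not reduced */
    l.local_data = base + stack_size - offset; /* reserved area at the top */
    l.initial_sp = l.local_data;               /* stack starts below it */
    return l;
}
```

With a hypothetical 1024-byte buffer at 0x20001000 and an offset of 8, the thread-specific data lands at 0x200013f8 and still lies inside the 1024-byte protected range, because the size handed to the protection setup was never decremented.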

andrewboie commented 5 years ago

@ioannisg if you want to focus on other stuff I can handle addressing this bug.