zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

tests: arch: arm: fpu: arch.arm.swap.common.fpu_sharing.no_optimizations - Data Access Violation - MPU Fault #47855

Closed nordic-piks closed 2 years ago

nordic-piks commented 2 years ago

Describe the bug: The tests/arch/arm/arm_thread_swap/arch.arm.swap.common.fpu_sharing.no_optimizations scenario fails in on-target testing with a Data Access Violation (MPU Fault).

Observed for

An MPU fault is detected and the system halts; there is no other output.

To Reproduce Steps to reproduce the behavior:

  1. Connect an nrf52840dk.
  2. Go to your zephyr directory.
  3. Run ./scripts/twister -T tests/arch/arm/arm_thread_swap -p nrf52840dk_nrf52840 --device-testing --device-serial /dev/ttyACM0 -v --inline-logs
  4. See the stack trace and the system halting.

Expected behavior: Valid test output

Impact: Not clear

Logs and console output

ERROR   - *** Booting Zephyr OS build zephyr-v3.1.0-2138-g700de003e849  ***
Running TESTSUITE E: ***** MPU FAULT *****
E:   Stacking error (context area might be not valid)
E:   Data Access Violation
E:   MMFAR Address: 0x20003038
E: r0/a1:  0xa327a9db  r1/a2:  0x353fd279  r2/a3:  0xdffefdab
E: r3/a4:  0x01085012 r12/ip:  0xc11ddb4f r14/lr:  0x86ec0d30
E:  xpsr:  0x190d6800
E: s[ 0]:  0x00000000  s[ 1]:  0x00000000  s[ 2]:  0x00000000  s[ 3]:  0x00000000
E: s[ 4]:  0x00000000  s[ 5]:  0x00000000  s[ 6]:  0x00000000  s[ 7]:  0x00000000
E: s[ 8]:  0x00000000  s[ 9]:  0x00000000  s[10]:  0x00000000  s[11]:  0x00000000
E: s[12]:  0x00000000  s[13]:  0x00000000  s[14]:  0x00000002  s[15]:  0x00000000
E: fpscr:  0x20003080
E: Faulting instruction address (r15/pc): 0xfcef4946
E: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
E: Current thread: 0x20000328 (main)
E: Halting system
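
For triage, the key fields of a log like the one above can be pulled out mechanically. This is only a minimal sketch that assumes the log format shown in this report; `FAULT_LOG` and `summarize_fault` are hypothetical names for illustration, not part of Zephyr or twister.

```python
import re

# Lines copied from the fault log in this report.
FAULT_LOG = """\
E: ***** MPU FAULT *****
E:   Stacking error (context area might be not valid)
E:   Data Access Violation
E:   MMFAR Address: 0x20003038
E: >>> ZEPHYR FATAL ERROR 2: Stack overflow on CPU 0
E: Current thread: 0x20000328 (main)
E: Halting system
"""


def summarize_fault(log: str) -> dict:
    """Extract the fatal-error reason, faulting address and thread name from a log."""
    summary = {}
    m = re.search(r"ZEPHYR FATAL ERROR (\d+): (.+?) on CPU (\d+)", log)
    if m:
        summary["reason_code"] = int(m.group(1))
        summary["reason"] = m.group(2)
        summary["cpu"] = int(m.group(3))
    m = re.search(r"MMFAR Address: (0x[0-9a-fA-F]+)", log)
    if m:
        summary["mmfar"] = int(m.group(1), 16)
    m = re.search(r"Current thread: (0x[0-9a-fA-F]+) \((\w+)\)", log)
    if m:
        summary["thread"] = m.group(2)
    return summary


summary = summarize_fault(FAULT_LOG)
print(summary)
```

In this log the reported reason is a stack overflow in the main thread, so the main stack size is a natural first suspect.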

hakehuang commented 2 years ago

This case passes on mimxrt1010_evk:

*** Booting Zephyr OS build zephyr-v3.1.0-2343-g7d7988d1a995  ***
Running TESTSUITE arm_thread_swap
===================================================================
START - test_arm_syscalls
Available IRQ line: 79
USR Thread: IRQ Line: 79
 PASS - test_arm_syscalls in 0.5 seconds
===================================================================
START - test_arm_thread_swap
 PASS - test_arm_thread_swap in 0.1 seconds
===================================================================
START - test_syscall_cpu_scrubs_regs
Writing 0xDEADBEEF values into registers
Exit from system call
 PASS - test_syscall_cpu_scrubs_regs in 0.6 seconds
===================================================================
TESTSUITE arm_thread_swap succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

and on frdmkw41z:

*** Booting Zephyr OS build zephyr-v3.1.0-2343-g7d7988d1a995  ***
Running TESTSUITE arm_thread_swap
===================================================================
START - test_arm_syscalls
 SKIP - test_arm_syscalls in 0.1 seconds
===================================================================
START - test_arm_thread_swap
 PASS - test_arm_thread_swap in 0.1 seconds
===================================================================
START - test_syscall_cpu_scrubs_regs
 SKIP - test_syscall_cpu_scrubs_regs in 0.1 seconds
===================================================================
TESTSUITE arm_thread_swap succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

and on lpcxpresso54114_m4:

*** Booting Zephyr OS build zephyr-v3.1.0-2343-g7d7988d1a995  ***
Running TESTSUITE arm_thread_swap
===================================================================
START - test_arm_syscalls
Available IRQ line: 39
USR Thread: IRQ Line: 39
 PASS - test_arm_syscalls in 0.6 seconds
===================================================================
START - test_arm_thread_swap
 PASS - test_arm_thread_swap in 0.1 seconds
===================================================================
START - test_syscall_cpu_scrubs_regs
Writing 0xDEADBEEF values into registers
Exit from system call
 PASS - test_syscall_cpu_scrubs_regs in 0.7 seconds
===================================================================
TESTSUITE arm_thread_swap succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

microbuilder commented 2 years ago

I did a quick test with the following command, which fails as described on the nRF53 (all I had on hand):

$ zephyr/scripts/twister \
  --test "tests/arch/arm/arm_thread_swap/arch.arm.swap.common.fpu_sharing.no_optimizations" \
  -p nrf5340dk_nrf5340_cpuapp --device-testing --device-serial /dev/tty.usbmodem0009601433755 \
  -v --inline-logs

Increasing stack size to CONFIG_MAIN_STACK_SIZE=2048 causes the test to pass:

INFO    - Using Ninja..
INFO    - Zephyr version: zephyr-v3.1.0-2460-g3f22a97c48bf
INFO    - Using 'zephyr' toolchain.
INFO    - Building initial testsuite list...
INFO    - Writing JSON report /Users/kevintownsend/Linaro/zephyr/twister-out/testplan.json

Device testing on:

| Platform                 | ID   | Serial device                  |
|--------------------------|------|--------------------------------|
| nrf5340dk_nrf5340_cpuapp |      | /dev/tty.usbmodem0009601433755 |

INFO    - JOBS: 10
INFO    - 1 test scenarios (1 configurations) selected, 0 configurations discarded due to filters.
INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue

INFO    - 1 of 1 test configurations passed (100.00%), 0 failed, 0 skipped with 0 warnings in 41.37 seconds
INFO    - In total 1 test cases were executed, 0 skipped on 1 out of total 475 platforms (0.21%)
INFO    - 1 test configurations executed on platforms, 0 test configurations were only built.

Hardware distribution summary:

| Board                    | ID   |   Counter |
|--------------------------|------|-----------|
| nrf5340dk_nrf5340_cpuapp |      |         1 |
INFO    - Saving reports...
INFO    - Writing JSON report /Users/kevintownsend/Linaro/zephyr/twister-out/twister.json
INFO    - Writing xunit report /Users/kevintownsend/Linaro/zephyr/twister-out/twister.xml...
INFO    - Writing xunit report /Users/kevintownsend/Linaro/zephyr/twister-out/twister_report.xml...
INFO    - Run completed

microbuilder commented 2 years ago

@nordic-piks Any feedback? Seems trivial to fix all three of these related issues, but not sure where Nordic wants to increase stack size in a manageable way for these tests and across platforms. I'll just make a PR myself to close them if there isn't a reply, and presumably someone will comment there on placement of the stack size adjustments.

cc @carlescufi @joerchan

nordic-piks commented 2 years ago

@microbuilder I can verify such a PR (increasing the stack size) on all platforms mentioned in the issue. I will leave the decision on whether this should be the solution to @carlescufi and @joerchan. I think another solution is still possible: filter these test cases based on the platform's stack size, so that they are not executed on platforms that do not have enough stack available. Can such a value be calculated for all platforms?

joerchan commented 2 years ago

@microbuilder The issue appears to be the NO_OPTIMIZATIONS configuration, which causes a very significant increase in stack usage. I could reproduce this on the mps2_an521_ns board as well, so it is not just a Nordic platform issue.

Here you can see the impact this configuration has:

# mps2_an521_ns:
main                : STACK: unused 1380 usage 668 / 2048 (32 %);
# mps2_an521_ns no optimizations:
main                : STACK: unused 915 usage 1133 / 2048 (55 %);

# nrf5340dk_nrf5340_ns:
main                : STACK: unused 1308 usage 740 / 2048 (36 %);
# nrf5340dk_nrf5340_ns no optimizations:
main                : STACK: unused 756 usage 1292 / 2048 (63 %);
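
To put a number on the increase, the four thread analyzer lines above can be compared directly. This is a rough sketch: the helper names are made up for illustration, and the AN521 entries are keyed by the board name used in the west build command in this comment.

```python
import re

# Thread analyzer output quoted in this comment, keyed by (board, build variant).
# "mps2_an521_ns" is the board name from the west build command in this comment.
LINES = {
    ("mps2_an521_ns", "default"): "main                : STACK: unused 1380 usage 668 / 2048 (32 %);",
    ("mps2_an521_ns", "no_opt"): "main                : STACK: unused 915 usage 1133 / 2048 (55 %);",
    ("nrf5340dk_nrf5340_ns", "default"): "main                : STACK: unused 1308 usage 740 / 2048 (36 %);",
    ("nrf5340dk_nrf5340_ns", "no_opt"): "main                : STACK: unused 756 usage 1292 / 2048 (63 %);",
}

STACK_RE = re.compile(r"STACK: unused (\d+) usage (\d+) / (\d+)")


def stack_usage(line: str) -> int:
    """Return the bytes of stack used according to one thread analyzer line."""
    unused, usage, size = (int(v) for v in STACK_RE.search(line).groups())
    if unused + usage != size:
        raise ValueError(f"inconsistent line: {line!r}")
    return usage


def extra_without_optimizations(board: str) -> int:
    """Extra main-stack bytes used when optimizations are turned off."""
    return stack_usage(LINES[(board, "no_opt")]) - stack_usage(LINES[(board, "default")])


for board in ("mps2_an521_ns", "nrf5340dk_nrf5340_ns"):
    base = stack_usage(LINES[(board, "default")])
    extra = extra_without_optimizations(board)
    print(f"{board}: +{extra} bytes ({100 * extra / base:.0f} % over the default build)")
```

On these numbers the unoptimized build uses 465 and 552 bytes more main stack on the two boards, with FPU disabled in both cases.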

I used this diff to get the numbers. I had to disable the FPU, though, since the impact on flash was huge and it no longer compiled.

--- a/tests/arch/arm/arm_thread_swap/src/main.c
+++ b/tests/arch/arm/arm_thread_swap/src/main.c
@@ -5,5 +5,10 @@
  */

 #include <ztest.h>
+#include <debug/thread_analyzer.h>
+static void teardown(void *fixture)
+{
+        thread_analyzer_print();
+}

-ZTEST_SUITE(arm_thread_swap, NULL, NULL, NULL, NULL, NULL);
+ZTEST_SUITE(arm_thread_swap, NULL, NULL, NULL, NULL, teardown);

nRF53: west build tests/arch/arm/arm_thread_swap/ -- -DCONFIG_FPU=n -DCONFIG_FPU_SHARING=n -DCONFIG_NO_OPTIMIZATIONS=y -DCONFIG_IDLE_STACK_SIZE=512 -DCONFIG_MAIN_STACK_SIZE=2048 -DCONFIG_THREAD_ANALYZER=y

MPS2: west build -b mps2_an521_ns tests/arch/arm/arm_thread_swap -DEMU_PLATFORM=qemu -DCONFIG_FPU=n -DCONFIG_FPU_SHARING=n -DCONFIG_NO_OPTIMIZATIONS=n -DCONFIG_IDLE_STACK_SIZE=512 -DCONFIG_MAIN_STACK_SIZE=2048 -DCONFIG_THREAD_ANALYZER=y -t run

nordic-piks commented 2 years ago

#48776 fixes the issue, thanks for the help.