zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.84k stars 6.6k forks source link

tests: arch: arm: arm_thread_swap fails on stm32g0 or stm32l0 #50709

Closed FRASTM closed 2 years ago

FRASTM commented 2 years ago

Describe the bug running the tests/arch/arm/arm_thread_swap on the nucleo_g071rb or nucleo_l073rz will fail (cortex M0+)

Please also mention any information which could help others to understand the problem you're facing: In the tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c the hard fault occurs after
line 326 z_move_thread_to_end_of_prio_q(_current); in
#if defined(CONFIG_ARMV6_M_ARMV8_M_BASELINE)

To Reproduce Steps to reproduce the behavior:

  1. west build -p auto -b nucleo_g071rb tests/arch/arm/arm_thread_swap
  2. west flash
  3. See error

Logs and console output

*** Booting Zephyr OS build v3.2.0-rc2-233-g88705be89b8d  ***
Running TESTSUITE arm_thread_swap
===================================================================
START - test_arm_syscalls
 PASS - test_arm_syscalls in 0.001 seconds
===================================================================
START - test_arm_thread_swap
E: ***** HARD FAULT *****
E: r0/a1:  0x00000000  r1/a2:  0x20000504  r2/a3:  0x200001b8
E: r3/a4:  0x20000718 r12/ip:  0x89abcdef r14/lr:  0x08007203
E:  xpsr:  0x41000000
E: Faulting instruction address (r15/pc): 0x080009d8
E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
E: Current thread: 0x200001b8 (unknown)
E: Halting system

Environment (please complete the following information):

Additional context looks like https://github.com/zephyrproject-rtos/zephyr/issues/50146

mmahadevan108 commented 2 years ago

@FRASTM can you help narrow down to the commit that introduced the failure.

FRASTM commented 2 years ago

I can identify sha1 e7591615724894a2319ea2c09d7eb21b4263fbf0 where test PASSED (with stm32g071rb and stml32073rz nucleo boards)
and from sha1 6edcee886cf1f4231cc0c826fe0cf9fedf324ba7 the test FAILED (on both boards) (In between it seems that python version changed as on that sha1 we had a build error)

FRASTM commented 2 years ago

Code disassembly at address of hard fault :

     0x080009c0 <+148>: bl  0x80071b8 <z_move_thread_to_end_of_prio_q>
     0x080009c4 <+152>: push    {r4, r5, r6, r7}
     0x080009c6 <+154>: mov r4, r8
     0x080009c8 <+156>: mov r5, r9
     0x080009ca <+158>: push    {r4, r5}
     0x080009cc <+160>: mov r4, r10
     0x080009ce <+162>: mov r5, r11
     0x080009d0 <+164>: push    {r4, r5}
     0x080009d2 <+166>: push    {r0, r1}
     0x080009d4 <+168>: adds    r1, r7, #0
     0x080009d6 <+170>: adds    r0, r5, #0
=> 0x080009d8 <+172>:   ldmia   r0!, {r4, r5, r6, r7}
     0x080009da <+174>: ldmia   r0!, {r4, r5, r6, r7}
     0x080009dc <+176>: mov r8, r4
     0x080009de <+178>: mov r9, r5
     0x080009e0 <+180>: mov r10, r6
     0x080009e2 <+182>: mov r11, r7
     0x080009e4 <+184>: adds    r7, r1, #0
     0x080009e6 <+186>: pop {r0, r1}
     0x080009e8 <+188>: dmb sy

with info reg at this line :

r0             0x0                 0
r1             0x20000504          536872196
r2             0x200001b8          536871352
r3             0x20000718          536872728
r4             0x0                 0
r5             0x0                 0
r6             0x800d7da           134272986
r7             0x20000504          536872196

sp             0x20001f98          0x20001f98 <alt_thread_stack+920>
lr             0x80071ff           134246911
pc             0x80009d8           0x80009d8 <alt_thread_entry+172>
FRASTM commented 2 years ago

It seems that one instruction is resetting R0 ( "mov r0, %0;\n\t") followed by (2 identical) multiple load of 4 registers at address R0 : Mem. address 0x0 (R0) cannot store R4, R5, R6, R7

Suggested patch:

diff --git a/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c b/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c
index c13926e104..022bbbad8f 100644
--- a/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c
+++ b/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c
@@ -344,8 +344,7 @@ static void alt_thread_entry(void)
        "push {r0, r1};\n\t"
        "mov r1, r7;\n\t"
        "mov r0, %0;\n\t"
-       "ldmia r0!, {r4-r7};\n\t"
-       "ldmia r0!, {r4-r7};\n\t"
+       "ldmia r1!, {r4-r7};\n\t"
        "mov r8, r4;\n\t"
        "mov r9, r5;\n\t"
        "mov r10, r6;\n\t"
FRASTM commented 2 years ago

@microbuilder Could you please have a look and give any advice

microbuilder commented 2 years ago

@microbuilder Could you please have a look and give any advice

I don't have that specific board, but I'll try to test this on another M0+.

Looks like that code was added in a4049773ac445cd963faffdc46cd0772aa255d10:

$ git log -L347,+2:'tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c'
commit a4049773ac445cd963faffdc46cd0772aa255d10
Author: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>
Date:   Thu Oct 3 11:17:32 2019 +0200

    tests: arch: arm_thread_swap: extend the test for Cortex-M Baseline

    We rework the arm_thread_swap test, so it can build
    and run for Cortex-M Basline architecture (Cortex-M0,
    Cortex-M0+, and Cortex-M23). In most cases, this rework
    involved re-implementing the code blocks for storing
    and loading the callee-saved registers to and from
    memory. In addition, we skip the verification of
    BASEPRI, and replace it with verifying PRIMASK.

    Signed-off-by: Ioannis Glaropoulos <Ioannis.Glaropoulos@nordicsemi.no>

diff --git a/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c b/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c
--- a/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c
+++ b/tests/arch/arm/arm_thread_swap/src/arm_thread_arch.c
@@ -262,0 +288,2 @@
+               "ldmia r0!, {r4-r7};\n\t"
+               "ldmia r0!, {r4-r7};\n\t"
microbuilder commented 2 years ago

Haven't been able to reproduce in qemu on the M0, so ordered a board to be able to reliably reproduce.

microbuilder commented 2 years ago

@FRASTM @erwango Just a FYI as something to be aware of on ST's part, but it seems the latest release of openocd is broken with STLink due to a changes in libusb: https://github.com/libusb/libusb/issues/928

With openocd 0.11, you will get a segfault trying to run west debugserver:

The command below is what west debugserver runs beneath the hood

$ /opt/homebrew/bin/openocd -s /Users/kevintownsend/Linaro/zephyr/zephyr/boards/arm/nucleo_g071rb/support -s /Users/kevintownsend/zephyr-sdk-0.15.0/sysroots/arm64-pokysdk-linux/usr/share/openocd/scripts -f /Users/kevintownsend/Linaro/zephyr/zephyr/boards/arm/nucleo_g071rb/support/openocd.cfg -c 'tcl_port 6333' -c 'telnet_port 4444' -c 'gdb_port 3333' '-c init' '-c targets' -c 'reset init'
Open On-Chip Debugger 0.11.0
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
srst_only separate srst_nogate srst_open_drain connect_deassert_srst

Info : clock speed 2000 kHz
zsh: segmentation fault  /opt/homebrew/bin/openocd -s  -s  -f  -c 'tcl_port 6333' -c 'telnet_port 4444

You need to build the latest openocd, such as the following on MacOS:

$ brew unlink open-ocd
$ brew install openocd --HEAD
microbuilder commented 2 years ago

Still debugging this, but was surprised to see different assembly output from GDB depending on if you are looking at C or ASM source (using Zephyr SDK 0.15.0 here). Looking at the patch you propsoed above, it's the same for you.

If you run:

$ west debugserver

Then in another terminal window:

$ arm-zephyr-eabi-gdb -s build/zephyr/zephyr.elf -tui \
  -ex "target remote tcp:localhost:3333" \
  -ex "break arm_thread_arch.c:326" \
  -ex "c" \
  -ex "layout split" 

You can see different assembly before ldmia between the C and ASM views:

Screenshot 2022-10-17 at 16 17 52

Doesn't explain the ldmia issue, but GCC shouldn't be doing anything to optimise inline asm code (as I understand it), although this may GDB misinterpreting the instructions as well.

Update

Functionally, the adds and mov are the same here, so it may just be an artifact of the way the obj code is being decompiled.

I compared the .s and .c.obj outputs as follows, and see the same thing:

$ west build -p auto -b nucleo_g071rb zephyr/tests/arch/arm/arm_thread_swap -- -DCMAKE_C_FLAGS="-save-temps"
$ arm-none-eabi-objdump -d build/CMakeFiles/app.dir/src/arm_thread_arch.c.obj > arm_thread_arch.asm

The c->asm .s file has the two mov instructions, but the asm->obj decompiled .c.obj output has adds.

microbuilder commented 2 years ago

@FRASTM The issue here is that the ldmia instruction is operating on %0:

"mov r0, %0;\n\t"
"ldmia r0!, {r4-r7};\n\t"
"ldmia r0!, {r4-r7};\n\t"

... which is resolving to r5 in your case, which got clobbered at some point and has a value of 0 assigned to it. This needs to be a valid address in memory that we can modify/increment. With the code as is, %0/r5 is set to 0, which causes a hard fault when ldmia tries to execute, writing to the address at %0/r5. The inline code itself is sort of broken, depending on getting lucky that r5 hasn't been clobbered by the time we get here.

The M3/M4/M7, etc., version has the same issue, since it uses %0 as well and there is a lot of code before this inline assembly fires.

Trying to figure out how to rework this, but the test feels broken as is.

microbuilder commented 2 years ago

@FRASTM Please let me know if this works for you: https://github.com/zephyrproject-rtos/zephyr/pull/51366

FRASTM commented 2 years ago

Great ! That fixes the pb for stm32g0 and stm32l0 (arm M0+ ) : successfully tested on nucleo_g071rb and nucleo_l073rz

*** Booting Zephyr OS build zephyr-v3.2.0-709-g16d7d2e28651  ***                
Running TESTSUITE arm_thread_swap                                               
===================================================================             
START - test_arm_syscalls                                                       
 PASS - test_arm_syscalls in 0.001 seconds                                      
===================================================================             
START - test_arm_thread_swap                                                    
 PASS - test_arm_thread_swap in 0.001 seconds                                   
===================================================================             
START - test_syscall_cpu_scrubs_regs                                            
Writing 0xDEADBEEF values into registers                                        
Exit from system call                                                           
 PASS - test_syscall_cpu_scrubs_regs in 0.006 seconds                           
===================================================================             
TESTSUITE arm_thread_swap succeeded                                             

------ TESTSUITE SUMMARY START ------                                           

SUITE PASS - 100.00% [arm_thread_swap]: pass = 3, fail = 0, skip = 0, total = 3s
 - PASS - [arm_thread_swap.test_arm_syscalls] duration = 0.001 seconds          
 - PASS - [arm_thread_swap.test_arm_thread_swap] duration = 0.001 seconds       
 - PASS - [arm_thread_swap.test_syscall_cpu_scrubs_regs] duration = 0.006 secons

------ TESTSUITE SUMMARY END ------                                             

===================================================================             
PROJECT EXECUTION SUCCESSFUL      
gromero commented 2 years ago

@FRASTM Kevin pushed a freshest code to #51366 in case you would like to re-test it.

FRASTM commented 2 years ago

same result : that fixes the pb for stm32g0 and stm32l0 (arm M0+ ) : successfully tested on nucleo_g071rb and nucleo_l073rz