zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

zephyr building TF-M gives Segmentation fault on b_u585i_iot02a/stm32u585xx/ns #80932

Open FRASTM opened 1 day ago

FRASTM commented 1 day ago

Zephyr CI testing reports a build error when building tests/benchmarks/sched_queues for b_u585i_iot02a/stm32u585xx/ns (the STM32U585 Discovery kit). It was seen during the test campaign for https://github.com/zephyrproject-rtos/zephyr/pull/79797 and occurs with either -DCONFIG_SCHED_DUMB=y or -DCONFIG_SCHED_SCALABLE=y.


FAILED: platform/CMakeFiles/platform_s.dir/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.o 
...
/__w/zephyr/modules/tee/tf-m/trusted-firmware-m/platform/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.c: In function 'HAL_DMAEx_List_ReplaceNode_Head':
/__w/zephyr/modules/tee/tf-m/trusted-firmware-m/platform/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.c:4719:1: internal compiler error: Segmentation fault
 4719 | }
      | ^

Zephyr version: 4.0.0-rc2
Toolchain: zephyr 0.16.8
CMake version: 3.22.1

trusted-firmware-m : 8134106ef9cb3df60e8bd22b172532558e936bd2 or a11cd27905aecc4416cfc85552bfc3b997375056

To Reproduce

Expected behavior

Impact

CI failure

Logs and console output

FAILED: platform/CMakeFiles/platform_s.dir/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.o 
...
~/zephyrproject/modules/tee/tf-m/trusted-firmware-m/platform/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.c: In function 'HAL_DMAEx_List_ReplaceNode_Head':
~/zephyrproject/modules/tee/tf-m/trusted-firmware-m/platform/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.c:4719:1: internal compiler error: Segmentation fault
 4719 | }
      | ^
0x7b24d824251f ???
    ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x7b24d8229d8f __libc_start_call_main
    ../sysdeps/nptl/libc_start_call_main.h:58
0x7b24d8229e3f __libc_start_main_impl
    ../csu/libc-start.c:392

mathieuchopstm commented 1 day ago

This is missing one line of context:

during GIMPLE pass: evrp
~/zephyrproject/modules/tee/tf-m/trusted-firmware-m/platform/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.c: In function 'HAL_DMAEx_List_ReplaceNode_Head':
~/zephyrproject/modules/tee/tf-m/trusted-firmware-m/platform/ext/target/stm/common/stm32u5xx/hal/Src/stm32u5xx_hal_dma_ex.c:4719:1: internal compiler error: Segmentation fault
 4719 | }
      | ^
0x72996f44251f ???
        ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x72996f429d8f __libc_start_call_main
        ../sysdeps/nptl/libc_start_call_main.h:58
0x72996f429e3f __libc_start_main_impl
        ../csu/libc-start.c:392
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://github.com/zephyrproject-rtos/sdk-ng/issues> for instructions.

This error is actually a red herring; the true problem seems to be related to DMA_List_CheckNodesBaseAddresses.

FRASTM commented 10 hours ago

@ahmadstm can you please have a look

mathieuchopstm commented 8 hours ago

The following can be used to reproduce the issue:

Minimal reproduction code derived from HAL

```c
#include <stdint.h>
#include <stddef.h>

/**
 * Any value other than 0 and 0xFFFFFFFF causes ICE.
 */
#define MASK (0xA5A5A5A5)

/**
 * Not marking the function as static prevents it from being inlined.
 * Without inlining, the ICE doesn't seem to trigger.
 */
static uint32_t callee(void *a1, void *a2)
{
    uint32_t temp = (((uint32_t)a1 | (uint32_t)a2) & MASK);
    uint32_t ref = 0U;

    if (a1 != NULL) {
        ref = (uint32_t)a1;
    } else if (a2 != NULL) {
        ref = (uint32_t)a2;
    } /* else: ref = 0U */

    if (temp != (ref & MASK)) { /* Removing `& MASK` here prevents ICE */
        return 1U;
    }

    return 0U;
}

int caller(void *a1, void *a2)
{
    if ((a1 == NULL) || (a2 == NULL)) {
        return 0;
    }

    if (callee(a1, a2) != 0U) {
        return 1;
    }

    return 0;
}
```

This code builds with GCC 10.5 and 13.1, but hangs with any version of GCC 11 and 12: see https://godbolt.org/z/hcv7zhhc3

Note: using -O merely makes the segfault happen during the dom pass instead of evrp.

This is a known GCC bug - see the following GCC BZ tickets:

(Ironically, three of these reports are 2+ years old but use the same HAL file to trigger the ICE...)

ahmadstm commented 7 hours ago

The HALs in TF-M 2.1.1 are from an old Cube FW version, and even the latest HAL version, from Cube FW 1.6.1, does not fix the problem: https://github.com/STMicroelectronics/stm32u5xx-hal-driver

@FRASTM @mathieuchopstm you need to file a ticket with the HAL team so they can make the change.

erwango commented 7 hours ago

Summary:

The issue is linked to compiling stm32u5xx_hal_dma_ex.c (function HAL_DMAEx_List_ReplaceNode_Head) with GCC 11.x or 12.x. It is seen only with the TF-M build config RelWithDebInfo (with the MinSizeRel config, this file is not built).

The issue could be fixed by adding `__attribute__((noinline))` to DMA_List_CheckNodesBaseAddresses, which allows the build to pass; see the analysis above.
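
For illustration, here is a minimal sketch of how the no-inline workaround might look when applied to the reduced reproduction from earlier in the thread, rather than to the real HAL function. The `REPRO_NOINLINE` macro is a hypothetical name introduced here, and the casts go through `uintptr_t` (an assumption, so the snippet also compiles cleanly on 64-bit hosts). Whether this avoids the ICE on a particular GCC 11 or 12 build has not been verified here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro (not part of the HAL): asks GCC not to inline
 * the function it decorates. */
#define REPRO_NOINLINE __attribute__((noinline))

#define MASK (0xA5A5A5A5U)

/* Same logic as the reduced reproduction, but kept out of line so the
 * caller's NULL checks are not propagated into this body by inlining. */
static REPRO_NOINLINE uint32_t callee(void *a1, void *a2)
{
    uint32_t v1 = (uint32_t)(uintptr_t)a1;
    uint32_t v2 = (uint32_t)(uintptr_t)a2;
    uint32_t temp = (v1 | v2) & MASK;
    uint32_t ref = 0U;

    if (a1 != NULL) {
        ref = v1;
    } else if (a2 != NULL) {
        ref = v2;
    } /* else: ref stays 0U */

    return (temp != (ref & MASK)) ? 1U : 0U;
}

int caller(void *a1, void *a2)
{
    if ((a1 == NULL) || (a2 == NULL)) {
        return 0;
    }

    return (callee(a1, a2) != 0U) ? 1 : 0;
}
```

The attribute only changes how the function is compiled, not what it computes, so the observable behaviour of `caller` should be the same as in the original reproduction.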