zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

stm32: Shell module sample doesn't work on nucleo_l152re #22078

Closed by takumiando 4 years ago

takumiando commented 4 years ago

The shell module sample doesn't work on nucleo_l152re with the latest Zephyr.

$ west build -p auto -b nucleo_l152re zephyr/samples/subsys/shell/shell_module
$ west flash
[00:00:00.000,000] <err> os: ***** MPU FAULT *****
[00:00:00.000,000] <err> os:   Instruction Access Violation
[00:00:00.000,000] <err> os: r0/a1:  0x00000000  r1/a2:  0x20000580  r2/a3:  0x00000000
[00:00:00.000,000] <err> os: r3/a4:  0x00000000 r12/ip:  0xaaaaaaaa r14/lr:  0xfffffffd
[00:00:00.000,000] <err> os:  xpsr:  0x61000036
[00:00:00.000,000] <err> os: Faulting instruction address (r15/pc): 0xe7ecfc56
[00:00:00.001,000] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:00:00.001,000] <err> os: Fault during interrupt handling

[00:00:00.001,000] <err> os: Current thread: 0x200005d0 (idle)
[00:00:00.070,000] <err> os: Halting system

Environment

takumiando commented 4 years ago

The integration sample also doesn't work. Does this bug depend on board-specific implementations?

*** Booting Zephyr OS build zephyr-v2.1.0-1246-g68a235932f93  ***
Running test suite framework_tests
===================================================================
starting test - test_assert
PASS - test_assert
===================================================================
Test suite framework_tests succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL
E: ***** MPU FAULT *****
E:   Instruction Access Violation
E: r0/a1:  0x00000000  r1/a2:  0x00000000  r2/a3:  0x20000184
E: r3/a4:  0x00000000 r12/ip:  0x00000000 r14/lr:  0xfffffffd
E:  xpsr:  0x6100000f
E: Faulting instruction address (r15/pc): 0xbf0306da
E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
E: Fault during interrupt handling

E: Current thread: 0x20000184 (idle)
E: Halting system
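
For readers decoding the two register dumps above, here is a minimal sketch in C, assuming the standard ARMv7-M layout (the active exception number sits in the low 9 bits of xPSR, and external interrupts start at exception 16). The helper names are hypothetical and are not part of Zephyr.

```c
#include <stdint.h>

/* IPSR is the low 9 bits of xPSR: the number of the active exception
 * (0 means thread mode). */
static unsigned xpsr_exception_number(uint32_t xpsr)
{
    return xpsr & 0x1FFu;
}

/* Exception numbers 16 and above are external interrupts (IRQ0 is 16).
 * Returns -1 for thread mode and for system exceptions such as SysTick (15). */
static int exception_to_irq(unsigned exc)
{
    return exc >= 16u ? (int)exc - 16 : -1;
}

/* Bit 24 of xPSR is the Thumb state bit; it is expected to be 1 on Cortex-M. */
static int xpsr_thumb_bit(uint32_t xpsr)
{
    return (int)((xpsr >> 24) & 1u);
}

/* Meaning of the basic (non-FPU) EXC_RETURN values that can appear in LR. */
static const char *exc_return_name(uint32_t lr)
{
    switch (lr) {
    case 0xFFFFFFF1u: return "handler mode, main stack";
    case 0xFFFFFFF9u: return "thread mode, main stack";
    case 0xFFFFFFFDu: return "thread mode, process stack";
    default:          return "not a basic EXC_RETURN value";
    }
}
```

With the values from the logs, xpsr 0x61000036 gives exception 54 (IRQ 38) and xpsr 0x6100000f gives exception 15 (SysTick). lr 0xfffffffd is the EXC_RETURN value for "return to thread mode on the process stack". Both readings are consistent with the "Fault during interrupt handling" line while the idle thread was current.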

takumiando commented 4 years ago

It also doesn't work at f144e39ced0afb06375d388a936650f33e9b1ac3. Of course, I added CONFIG_EEPROM_SHELL=y to prj.conf as that commit message describes...

erwango commented 4 years ago

@takumiando, I suspect an issue with the clock_control driver. Can you try using the following clock settings:

    CONFIG_CLOCK_STM32_SYSCLK_SRC_HSI=y
    CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=16000000

maxschuh commented 4 years ago

> @takumiando, I suspect an issue with the clock_control driver. Can you try using following clock settings:
>
>     CONFIG_CLOCK_STM32_SYSCLK_SRC_HSI=y
>     CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=16000000

Yes, this seems to solve the problem on my side. Both 'samples/synchronization' and 'samples/philosophers' work with these settings.

erwango commented 4 years ago

Further testing shows that we're facing the same issue on nucleo_l053r8. The issue seems linked to setting SYSCLK to 32 MHz, whatever the clock source (PLL from HSE/HSI, or direct HSI).

frantony commented 4 years ago

@takumiando

> E: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
> E: Fault during interrupt handling
>
> E: Current thread: 0x20000184 (idle)
> E: Halting system

Please try turning the board power off and then on again just after flashing the Zephyr image. The sample should run successfully after the power cycle.

erwango commented 4 years ago

> Please try to turn the board power off and then on just after burning zephyr image. The sample should run successfully after power cycle.

@frantony thanks for the hint, but this is not how it is supposed to work. Any idea of the possible root cause?

takumiando commented 4 years ago

@erwango, thanks for the workaround idea! It works.

erwango commented 4 years ago

@takumiando, I've replaced the Zephyr code with code from STM32Cube that uses the LL API to set up the clock, and I faced the same issue. So, while it looks correlated with clock control, it is not an issue in the Zephyr clock control driver. Either this is an SoC issue, or it is linked to another part of the Zephyr code. This will require deeper investigation.

tagunil commented 4 years ago

These 32 MHz faults go away after commenting out the WFI instruction in arch_cpu_idle(), so I guess it is some weird kind of interaction between sleep mode and debugging. It looks similar to the unanswered question here: https://community.st.com/s/question/0D50X00009XkXgd/stm32l1-hardfault-when-returning-from-wfi-only-when-debugger-attached

tagunil commented 4 years ago

Replacing the WFI instruction with the sequence CPSID i; WFI; CPSIE i also helps somehow, but I have no idea why.
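
A minimal sketch of that three-instruction sequence in C, assuming GCC-style inline assembly on a Cortex-M target. The function name idle_wfi_masked and the EMIT macro are hypothetical, and this is not the code that was merged into Zephyr. On a non-ARM host the macro only records the instruction order so the sketch can be compiled and exercised off-target.

```c
#include <stddef.h>

#if defined(__arm__) && defined(__thumb__)
/* On a Cortex-M target, emit the real instruction. */
#define EMIT(insn) __asm__ volatile (insn ::: "memory")
#else
/* On a host build, just record the order in which instructions would run. */
static const char *trace[8];
static size_t trace_len;
#define EMIT(insn) (trace[trace_len++ % 8] = (insn))
#endif

/* Hypothetical replacement for a bare WFI in the idle path. */
static void idle_wfi_masked(void)
{
    EMIT("cpsid i"); /* set PRIMASK: interrupts can pend but are not taken */
    EMIT("wfi");     /* sleep; a pending interrupt still wakes the core */
    EMIT("cpsie i"); /* clear PRIMASK: the pending ISR is taken here */
}
```

The visible difference from a bare WFI is ordering: the core resumes at the instruction after WFI first, and the interrupt handler only runs once CPSIE i executes.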

tagunil commented 4 years ago

Looks like FreeRTOS Cortex-M ports actually mask interrupts before going to sleep (e.g. https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/master/portable/GCC/ARM_CM3/port.c#L476). Maybe that should be discussed with ARM maintainers.

erwango commented 4 years ago

> Looks like FreeRTOS Cortex-M ports actually mask interrupts before going to sleep (e.g. https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/master/portable/GCC/ARM_CM3/port.c#L476). Maybe that should be discussed with ARM maintainers.

@ioannisg any opinion on this?

ioannisg commented 4 years ago

@tagunil (FYI @erwango) The interrupt masking you proposed above has the following effect: the system wakes up before the ISR is executed. This is needed, in general, if we wish to execute code after the system is woken up by the interrupt but before the ISR runs. In the case of Zephyr ARM's arch_cpu_idle() nothing is executed after WFI (the function simply returns), so I don't think this would play any significant role in your case.

What is indeed missing here is a __DSB() instruction before calling WFI. I wonder if you can try that and report your findings.
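
A minimal sketch of the suggested ordering, again assuming GCC-style inline assembly on a Cortex-M target. The helper names are hypothetical stand-ins for the CMSIS __DSB() and __WFI() intrinsics, and the host branch exists only so the sketch compiles off-target.

```c
#if defined(__arm__) && defined(__thumb__)
/* Target build: real barrier and sleep instructions. */
static inline void data_sync_barrier(void) { __asm__ volatile ("dsb" ::: "memory"); }
static inline void wait_for_interrupt(void) { __asm__ volatile ("wfi" ::: "memory"); }
#else
/* Host stand-ins: they only log the call order ('D' for DSB, 'W' for WFI). */
static char order[3];
static int order_len;
static inline void data_sync_barrier(void) { if (order_len < 2) order[order_len++] = 'D'; }
static inline void wait_for_interrupt(void) { if (order_len < 2) order[order_len++] = 'W'; }
#endif

/* Hypothetical idle routine: drain outstanding memory accesses, then sleep. */
static void idle_with_barrier(void)
{
    data_sync_barrier();  /* complete all pending memory accesses first */
    wait_for_interrupt(); /* only then enter sleep */
}
```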

tagunil commented 4 years ago

@ioannisg I've tested your barriers-related commit from #23436. Unfortunately, that didn't fix the problem.

stephanosio commented 4 years ago

@tagunil Could you check if https://github.com/zephyrproject-rtos/zephyr/pull/23511 fixes the problem?

tagunil commented 4 years ago

@stephanosio Yeah, #23511 works, although I still can't understand why.

ioannisg commented 4 years ago

@takumiando can you check, now, if this can be closed? #23511 is merged.

takumiando commented 4 years ago

@ioannisg It works correctly on my nucleo_l152re ;)

erwango commented 4 years ago

@takumiando could you create an issue to get nucleo_l152re working at full speed now? (reverting #22308)

takumiando commented 4 years ago

@erwango Okay https://github.com/zephyrproject-rtos/zephyr/issues/23762

LQchengdu commented 3 years ago

For me, disabling debug mode through the LL API also works fine:

    LL_DBGMCU_DisableDBGSleepMode();
    LL_DBGMCU_DisableDBGStandbyMode();
    LL_DBGMCU_DisableDBGStopMode();
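
For context, a minimal sketch of what those three calls amount to at the register level, assuming the usual STM32 DBGMCU_CR layout (DBG_SLEEP in bit 0, DBG_STOP in bit 1, DBG_STANDBY in bit 2). Check the bit positions against the reference manual for your part. The macro and function names below are hypothetical, and the function works on a plain value rather than the real register so it can run anywhere.

```c
#include <stdint.h>

/* Assumed DBGMCU_CR bit positions; verify against the reference manual. */
#define DBG_SLEEP_BIT   (1u << 0) /* keep debug connection alive in Sleep mode */
#define DBG_STOP_BIT    (1u << 1) /* keep debug connection alive in Stop mode */
#define DBG_STANDBY_BIT (1u << 2) /* keep debug connection alive in Standby mode */

/* Return the register value with all three low-power debug bits cleared,
 * leaving every other bit untouched. */
static uint32_t dbgmcu_disable_low_power_debug(uint32_t cr)
{
    return cr & ~(DBG_SLEEP_BIT | DBG_STOP_BIT | DBG_STANDBY_BIT);
}
```

On hardware the result would be written back to DBGMCU->CR. With these bits cleared, a debugger usually loses its connection whenever the core enters a low-power mode.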