zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.85k stars 6.61k forks source link

drivers: clock_control: stm32h7 cannot choose system frequency higher than 400MHz #27212

Closed lochej closed 4 years ago

lochej commented 4 years ago

Describe the bug It is not possible to configure a system clock higher than 400MHz on nucleo_h743zi board. It can also be the case for other H7 boards. To achieve 400MHz+ frequencies, you need to use the PLL and set the Kconfig PLL parameters for the clock_control driver.

Moreover OpenOCD fails to complete flashing the device with status 1 and hello world is not shown in the console. In order to flash the device with other clock settings, it is necessary to spam the reset button while calling west flash to connect to the MCU under reset. See why this could happens below.

Fixing PR: #27213

To Reproduce Steps to reproduce the behavior:

  1. Having a nucleo_h7 series board nucleo_h743zi preferably
  2. Create a custom board named nucleo_h743zi_custom:
  3. Modify the nucleo_h743zi_custom_defconfig file to change clock settings: Comment existing settings and override them with:
    # CSI PLL sysclk 480MHz
    CONFIG_CLOCK_STM32_PLL_SRC_CSI=y
    CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=480000000
    CONFIG_CLOCK_STM32_SYSCLK_SRC_PLL=y
    CONFIG_CLOCK_STM32_PLL_M_DIVISOR=1
    CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER=240
    CONFIG_CLOCK_STM32_PLL_P_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_Q_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_R_DIVISOR=2
    CONFIG_CLOCK_STM32_D1CPRE=1
    CONFIG_CLOCK_STM32_HPRE=2
    CONFIG_CLOCK_STM32_D2PPRE1=2
    CONFIG_CLOCK_STM32_D2PPRE2=2
    CONFIG_CLOCK_STM32_D1PPRE=2
    CONFIG_CLOCK_STM32_D3PPRE=2

    or

    # HSI PLL sysclk 480MHz
    CONFIG_CLOCK_STM32_PLL_SRC_HSI=y
    CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=480000000
    CONFIG_CLOCK_STM32_SYSCLK_SRC_PLL=y
    CONFIG_CLOCK_STM32_PLL_M_DIVISOR=4
    CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER=60
    CONFIG_CLOCK_STM32_PLL_P_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_Q_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_R_DIVISOR=2
    CONFIG_CLOCK_STM32_D1CPRE=1
    CONFIG_CLOCK_STM32_HPRE=2
    CONFIG_CLOCK_STM32_D2PPRE1=2
    CONFIG_CLOCK_STM32_D2PPRE2=2
    CONFIG_CLOCK_STM32_D1PPRE=2
    CONFIG_CLOCK_STM32_D3PPRE=2

    or

    # HSE PLL sysclk 480MHz
    CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=480000000
    CONFIG_CLOCK_STM32_PLL_SRC_HSE=y
    CONFIG_CLOCK_STM32_HSE_CLOCK=8000000
    CONFIG_CLOCK_STM32_SYSCLK_SRC_PLL=y
    CONFIG_CLOCK_STM32_HSE_BYPASS=y
    CONFIG_CLOCK_STM32_PLL_M_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER=240
    CONFIG_CLOCK_STM32_PLL_P_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_Q_DIVISOR=2
    CONFIG_CLOCK_STM32_PLL_R_DIVISOR=2
    CONFIG_CLOCK_STM32_HPRE=2
    CONFIG_CLOCK_STM32_D1CPRE=1
    CONFIG_CLOCK_STM32_D2PPRE1=2
    CONFIG_CLOCK_STM32_D2PPRE2=2
    CONFIG_CLOCK_STM32_D1PPRE=2
    CONFIG_CLOCK_STM32_D3PPRE=2
  4. cd ~/zephyrproject/zephyr/samples/hello_world
  5. west build -p always -b nucleo_h743zi_custom
  6. west flash
  7. See error

Expected behavior Board should be flashed without any issues, Hello World should appear in the console and system clock should be set to 480MHz.

Impact Prevents the user from setting whatever clock setting he wants. Could also break some h7 board like mentionned in #26980 .

Logs and console output OpenOCD crash output:

[0/1] Flashing nucleo_h743zi_custom
-- west flash: using runner openocd
-- runners.openocd: Flashing file: /home/dev/eclipse-workspace/hello_world/build/zephyr/zephyr.hex
Open On-Chip Debugger 0.10.0+dev-01341-g580d06d9d-dirty (2020-05-16-15:41)
Licensed under GNU GPL v2
For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : clock speed 1800 kHz
Info : STLINK V2J29M18 (API v2) VID:PID 0483:374B
Info : Target voltage: 0.021961
Error: target voltage may be too low for reliable debugging
Info : stm32h7x.cpu0: hardware has 8 breakpoints, 4 watchpoints
Info : Listening on port 3333 for gdb connections
    TargetName         Type       Endian TapName            State       
--  ------------------ ---------- ------ ------------------ ------------
 0* stm32h7x.cpu0      hla_target little stm32h7x.cpu       running

target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x08001248 msp: 0x240007d8
Info : Device: STM32H74x/75x
Info : flash size probed value 2048
Info : STM32H flash has dual banks. Bank (0) size is 1024kb, base address is 0x8000000
auto erase enabled
wrote 131072 bytes from file /home/dev/eclipse-workspace/hello_world/build/zephyr/zephyr.hex in 3.093589s (41.376 KiB/s)

target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x08001214 msp: 0x240007d8
verified 13852 bytes in 0.333224s (40.595 KiB/s)

FATAL ERROR: command exited with status 1: /home/dev/zephyr-sdk-0.11.3/sysroots/x86_64-pokysdk-linux/usr/bin/openocd -s /home/dev/zephyrproject/zephyr/boards/arm/nucleo_h743zi_custom/support -s /home/dev/zephyr-sdk-0.11.3/sysroots/x86_64-pokysdk-linux/usr/share/openocd/scripts -f /home/dev/zephyrproject/zephyr/boards/arm/nucleo_h743zi_custom/support/openocd.cfg -c init -c targets -c 'reset halt' -c 'flash write_image erase /home/dev/eclipse-workspace/hello_world/build/zephyr/zephyr.hex' -c 'reset halt' -c 'verify_image /home/dev/eclipse-workspace/hello_world/build/zephyr/zephyr.hex' -c 'reset run' -c shutdown
FAILED: zephyr/cmake/flash/CMakeFiles/flash 
cd /home/dev/eclipse-workspace/hello_world/build && /usr/bin/cmake -E env /home/dev/.local/bin/west flash --skip-rebuild
ninja: build stopped: subcommand failed.

Environment (please complete the following information):

Why this bug and Partial fix This bug occurs because some of the APB clocks can exceed the maximum allowed frequencies when PLL is being set as system source clock. In fact, the PLL dividers should be set to maximum prior to setting the PLL as system clock source and then correctly configured to prevent the frequencies to go out of the limits.

About OpenOCD not being able to complete: That must be due to the clock tree being stuck in an unspecified state preventing OpenOCD to complete all of its commands and exit with shutdown command invoked. Because the flash writing operation succeeds in the first place the MCU must have booted the wrong clock configuration and frozen some of its clock buses (APBx for instance) preventing OpenOCD to send commands like reset halt or reset run.

I have a fix that allows the clock config from HSI/CSI/HSE + PLL to be flashed and used at 480MHz without any problems on nucleo_h743zi.

Additional context This problem appear for @Nukersson on his nucleo_h745zi_q board when trying to test #27188 .

KozhinovAlexander commented 4 years ago

Thank you for opening this bug.

I would leave for the first the OpenOCD flash problem outside of this bug:

But OpenOCD fails to flash the device with status 1 and hello world is not shown in the console.
In order to flash the device with other clock settings, it is necessary to spam the reset button
while calling west flash to connect to the MCU under reset.
lochej commented 4 years ago

Thank you for opening this bug.

I would leave for the first the OpenOCD flash problem outside of this bug:

But OpenOCD fails to flash the device with status 1 and hello world is not shown in the console.
In order to flash the device with other clock settings, it is necessary to spam the reset button
while calling west flash to connect to the MCU under reset.

@Nukersson In my case, OpenOCD fails to succed with shutdown command invoked on my board because of the bugged clock configuration which prevents OpenOCD to communicate with the chip once it has executed the bugged code section which can freeze the clock tree in an unspecified state. That could also explain why I have to connect the device under reset to be able to flash another binary program.

So it is not OpenOCD to blame in the first place but it is there to identify that we get the bug.

KozhinovAlexander commented 4 years ago

Thank you for opening this bug. I would leave for the first the OpenOCD flash problem outside of this bug:

But OpenOCD fails to flash the device with status 1 and hello world is not shown in the console.
In order to flash the device with other clock settings, it is necessary to spam the reset button
while calling west flash to connect to the MCU under reset.

@Nukersson In my case, OpenOCD fails to succed with shutdown command invoked on my board because of the bugged clock configuration which prevents OpenOCD to communicate with the chip once it has executed the bugged code section which can freeze the clock tree in an unspecified state. That could also explain why I have to connect the device under reset to be able to flash another binary program.

So it is not OpenOCD to blame in the first place but it is there to identify that we get the bug.

Then it is even better, if we focus on fixing the clock bug. Afterwards we can investigate if OpenOCD/Flashing problem disappears.

lochej commented 4 years ago

Then it is even better, if we focus on fixing the clock bug. Afterwards we can investigate if OpenOCD/Flashing problem disappears.

Exactly, OpenOCD will be part of our tests steps to validate the fixing PR 😄 I have updated the bug description to better show why OpenOCD could fail due to a bad clock configuration.

KozhinovAlexander commented 4 years ago

I am facing same bug with my NUCLEO-H745ZI-Q board. This board is currently placed in master branch of Zephyr's repository. Per default this board uses PLL with HSE and 240MHz/480MHz clock for M4/M7 cores respectively. See the settings in nucleo_h745zi_q board folder.

My observations are:

I. Investigating the buggy merge number:

  1. Compiling aec5f0a3efa41b0102f361c0f66be55daddbf294 from master branch:

    • Compiling samples/hello_world project for M7 core with above settings (PLL, HSE, 480MHz) works properly
    • Flashing the board with OpenOCD works properly
    • Minicom/UART console shows Zephyr's boot message and Hello World! nucleo_h745zi_q_m7 as expected
  2. Compiling 33abbbfd858c9835ae2f9f905a1132be4d187715 from master branch:

    • Compiling samples/hello_world project for M7 core with above settings (PLL, HSE, 480MHz) works properly
    • Flashing the board with OpenOCD works properly
    • Minicom/UART console DOES NOT shows Zephyr's boot message and Hello World! nucleo_h745zi_q_m7 as expected
    • Moreover, changing the settings to the following one by increasing of CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER:
      
      # HSE PLL sysclk 376MHz
      CONFIG_CLOCK_STM32_PLL_SRC_HSE=y
      CONFIG_CLOCK_STM32_HSE_BYPASS=y

CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC=376000000 CONFIG_CLOCK_STM32_SYSCLK_SRC_PLL=y CONFIG_CLOCK_STM32_HSE_CLOCK=8000000

CONFIG_CLOCK_STM32_PLL_M_DIVISOR=1 CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER=94 CONFIG_CLOCK_STM32_PLL_P_DIVISOR=2 CONFIG_CLOCK_STM32_PLL_Q_DIVISOR=2 CONFIG_CLOCK_STM32_PLL_R_DIVISOR=2

CONFIG_CLOCK_STM32_D1CPRE=1 CONFIG_CLOCK_STM32_HPRE=2 CONFIG_CLOCK_STM32_D2PPRE1=2 CONFIG_CLOCK_STM32_D2PPRE2=2 CONFIG_CLOCK_STM32_D1PPRE=2 CONFIG_CLOCK_STM32_D3PPRE=2



lets the system work properly. Further increasing of `CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER` above the value **94** shows proposed bug.

### II. Summary

1. The commit 33abbbfd858c9835ae2f9f905a1132be4d187715 destroys clock for certain values.
2. Due to `CONFIG_CLOCK_STM32_PLL_N_MULTIPLIER` value investigation - we may face some clock value calculation problems in the corresponding driver or as pointed by @lochej reaching not allowed clock settings. 

### III. Proposed solution fo further avoiding of clock bugs

It would be useful to calculate the clock settings at compile time and provide compile time asserts if values are not matching. My suggestion would be use of preprocessor (may be tricky) or some python script. What are yours?
lochej commented 4 years ago

@Nukersson Thanks for your feedback. Looks like you are facing the same issue as I showing that reducing the PLL output clock allows the system to boot the sample correctly. I suggest you try the fixing PR #27213 version to see if it solves this issue.

I agree on having more clock setting asserts. Most of them should be doable by preprocessor, but we won't get as much information as what STM32CubeMX is able to give us. You are basically proposing to recreate the CubeMX clock assertion system in Zephyr. We could do some basic checks and redirect the user to STM32CubeMX for specific clock tree configurations.

KozhinovAlexander commented 4 years ago

@lochej If #27213 is the fix for this bug, please mention it in the bug description above.

lochej commented 4 years ago

Initial implementation of the clock_control for H7 series (prior to #26980) was setting the APBx dividers BEFORE setting the PLL as SYSCLK which prevented the clocks to go above specification once SYSCLK was set to PLL.

This explains why @Nukersson had a working clock tree working prior to #26980 .

That is all my mistake. I have changed in #26980 where the PLL is set as SYSCLK source prior to setting the correct APBx buses. I didn't notice any bug for low frequencies when clocking the system directly from HSI/CSI or HSE and for low frequencies which didn't freeze the buses.

This should be OK in #27213 where I preselect dividers for the APBx buses preventing going over the limits. I could also just set the bus prescalers prior to chosing the SYSCLK clock to get closer to the initial implementation.

xhpohanka commented 4 years ago

I have also faced some problems related with the latest patch for clock settings. If I'm looking right I think that there is overlapped logic in first control assert, isn't it @lochej ? :)

static int32_t get_vco_input_range(uint32_t pllsrc_clock, uint32_t divm)
{
    const uint32_t input_freq = pllsrc_clock/divm;

    __ASSERT(input_freq < 1000000UL || input_freq > 16000000UL,
            "PLL1 VCO frequency input range out of range");

...
lochej commented 4 years ago

I have also faced some problems related with the latest patch for clock settings. If I'm looking right I think that there is overlapped logic in first control assert, isn't it @lochej ? :)

static int32_t get_vco_input_range(uint32_t pllsrc_clock, uint32_t divm)
{
  const uint32_t input_freq = pllsrc_clock/divm;

  __ASSERT(input_freq < 1000000UL || input_freq > 16000000UL,
          "PLL1 VCO frequency input range out of range");

...

Thanks for your comment, Could you be more precise about what's wrong with this assertion ?

PLL input range is [1MHz;16MHz], so everything below 1MHz or above 16MHz is not a valid input.

xhpohanka commented 4 years ago
#define __ASSERT(test, fmt, ...)                                          \
    do {                                                              \
        if (!(test)) {                                            \
            __ASSERT_LOC(test);                               \
            __ASSERT_MSG_INFO(fmt, ##__VA_ARGS__);            \
            __ASSERT_POST_ACTION();                           \
        }                                                         \
    } while (false)

we should assert that we are inside the required range so I think __ASSERT(input_freq >= 1000000UL && input_freq <= 16000000UL, "") would be right. The same applies for control of return value of this function.

lochej commented 4 years ago
#define __ASSERT(test, fmt, ...)                                          \
  do {                                                              \
      if (!(test)) {                                            \
          __ASSERT_LOC(test);                               \
          __ASSERT_MSG_INFO(fmt, ##__VA_ARGS__);            \
          __ASSERT_POST_ACTION();                           \
      }                                                         \
  } while (false)

we should assert that we are inside the required range so I think __ASSERT(input_freq >= 1000000UL && input_freq <= 16000000UL, "") would be right. The same applies for control of return value of this function.

You are absolutely right, that's me not knowing how to use assertions 🙃 There also is another assertion:

vco_input_range = get_vco_input_range(
            pllsrc_clock,
            CONFIG_CLOCK_STM32_PLL_M_DIVISOR);

    __ASSERT(vco_input_range == -ERANGE, "PLL VCO input frequency out of range. Should be from 1 to 16 MHz");

that should be changed to:

__ASSERT(vco_input_range != -ERANGE, "PLL VCO input frequency out of range. Should be from 1 to 16 MHz");

I will try to embed this correction in #27213

lochej commented 4 years ago

III. Proposed solution fo further avoiding of clock bugs

It would be useful to calculate the clock settings at compile time and provide compile time asserts if values are not matching. My suggestion would be use of preprocessor (may be tricky) or some python script. What are yours?

@Nukersson Your idea is great and I am trying to implement it using only preprocessor errors. I'm almost done with them and I'll try to provide this detection to #27213 once it is done.

I'm just facing preprocessor errors due to:

#define HSI_VALUE    ((uint32_t)64000000) /*!< Value of the Internal oscillator in Hz*/
#define CSI_VALUE    ((uint32_t)4000000) /*!< Value of the Internal oscillator in Hz*/

Which prevents the preprocessor from doing any comparison from these defines (break with error: missing binary operator before token). Defines should be for it to work:

#define HSI_VALUE    (64000000UL) /*!< Value of the Internal oscillator in Hz*/
#define CSI_VALUE    (4000000UL) /*!< Value of the Internal oscillator in Hz*/

There are no problems with HSE_VALUE as it is redefined by the build system as @erwango told me.