zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Program execution takes more time in zephyr 3.6.0 on stm32-h743 controller #75487

Open ashutoshpandey-eaton opened 1 week ago

ashutoshpandey-eaton commented 1 week ago

PROBLEM: Program execution takes more time in Zephyr 3.6.0 on the stm32-h743 controller.

Elaboration: The same program takes longer to execute on Zephyr 3.6.0 than on Zephyr 3.2.0 on the stm32-h743 controller with the same default system clock (96 MHz), whereas on the stm32-u575 the timing is unchanged.

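To Reproduce: Replace the blinky code in main.c under zephyr\samples\basic\blinky\src with the code below:

#include <zephyr/kernel.h>
#include <zephyr/sys/printk.h>

volatile uint32_t current_time_uSec1 = 0;
volatile uint32_t current_time_uSec2 = 0;

int main(void)
{
    while (1) {
        /* Timestamp before the busy loop, converted from cycles to microseconds. */
        current_time_uSec1 = k_cyc_to_us_near32(k_cycle_get_32());
        /* Busy loop under test; the volatile counter keeps it from being optimized away. */
        for (volatile uint32_t i = 0; i < 48; i++) {
        }
        current_time_uSec2 = k_cyc_to_us_near32(k_cycle_get_32());

        /* Both values are unsigned, so print the difference with %u. */
        printk("time diff is %u\n", current_time_uSec2 - current_time_uSec1);

        k_msleep(1000);
    }
    return 0;
}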

Build code using: west build -b nucleo_h743zi ./ --pristine

OBSERVATION:
1. On the stm32-h743 controller (default system clock 96 MHz), the same code prints 13 usec with Zephyr 3.6.0 [attached: H7_3_6.PNG] but 4 usec with Zephyr 3.2.0 [attached: H7_3_2.PNG]. Why is there such a large difference (13 usec vs. 4 usec)?
2. On the stm32-u575 controller (default system clock 160 MHz), the same code prints 3 usec with both Zephyr 3.6.0 and Zephyr 3.2.0.
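
To put the reported numbers in perspective, the measured times can be converted into an approximate cycle cost per loop iteration. The following is a minimal sketch in plain C, assuming the cycle counter behind k_cycle_get_32() runs at the stated system clock (96 MHz on the H7, 160 MHz on the U5). The helper names are hypothetical, and the result also includes the overhead of reading and converting the timestamps, so it is an upper bound rather than an exact figure.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a duration in microseconds to clock cycles.
 * clock_mhz is the counter frequency in MHz, i.e. cycles per microsecond
 * (assumed here to equal the system clock).
 */
uint32_t us_to_cycles(uint32_t duration_us, uint32_t clock_mhz)
{
    return duration_us * clock_mhz;
}

/* Average cycles spent per loop iteration, measurement overhead included. */
uint32_t cycles_per_iteration(uint32_t duration_us, uint32_t clock_mhz,
                              uint32_t iterations)
{
    assert(iterations > 0);
    return us_to_cycles(duration_us, clock_mhz) / iterations;
}
```

For the 48-iteration loop, cycles_per_iteration(13, 96, 48) gives 26 cycles per iteration on Zephyr 3.6.0 and cycles_per_iteration(4, 96, 48) gives 8 on Zephyr 3.2.0, while the U5 figure cycles_per_iteration(3, 160, 48) is 10 on both versions. Under the stated assumption, each iteration costs roughly three times as many cycles on the H7 with Zephyr 3.6.0.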

Expected behavior: The same code should not differ this much in execution time between two Zephyr versions.

Impact: Performance.

Logs and console output: H7_3_6, H7_3_2 (attached screenshots)

Environment: zephyr-sdk-0.16.5-1 with Zephyr 3.6.0; zephyr-sdk-0.15.2 with Zephyr 3.2.0

erwango commented 1 week ago

This is likely due to https://github.com/zephyrproject-rtos/zephyr/pull/66524: since v3.6.0, cache activation on STM32H7 and STM32F7 depends on CONFIG_CACHE_MANAGEMENT, which defaults to n. You can enable it in your application. Note, though, that if your drivers use DMA (SPI, UART, ADC, ...), enabling CONFIG_CACHE_MANAGEMENT requires special care: you need to handle data cache coherency. See https://github.com/zephyrproject-rtos/zephyr/pull/70503 for an example.
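
One practical aspect of data cache coherency with DMA is that cache maintenance works on whole cache lines, so a DMA buffer that shares a line with unrelated data can corrupt that data when the line is invalidated. The sketch below is plain C with hypothetical helper names, assuming the 32-byte data cache line of the Cortex-M7. It only shows the alignment arithmetic. In a Zephyr application the actual maintenance would go through the cache API in <zephyr/cache.h> (for example sys_cache_data_flush_range() before a DMA transmit and sys_cache_data_invd_range() after a DMA receive), which is not called here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte data cache lines, as on the Cortex-M7. */
#define DCACHE_LINE_SIZE 32u

/* Round an address down to the start of its cache line. */
uintptr_t cache_line_floor(uintptr_t addr)
{
    return addr & ~((uintptr_t)DCACHE_LINE_SIZE - 1u);
}

/* Round an address up to the next cache line boundary. */
uintptr_t cache_line_ceil(uintptr_t addr)
{
    return cache_line_floor(addr + DCACHE_LINE_SIZE - 1u);
}

/* Bytes actually touched by a maintenance operation on [addr, addr + size). */
size_t cache_span_bytes(uintptr_t addr, size_t size)
{
    assert(size > 0);
    return (size_t)(cache_line_ceil(addr + size) - cache_line_floor(addr));
}

/* Nonzero if the buffer starts and ends on cache line boundaries,
 * so it shares no cache line with neighboring data.
 */
int buffer_is_cache_safe(uintptr_t addr, size_t size)
{
    return (addr % DCACHE_LINE_SIZE == 0u) && (size % DCACHE_LINE_SIZE == 0u);
}
```

For example, a 16-byte buffer at address 0x24000014 straddles two lines, so cache_span_bytes(0x24000014, 16) is 64 and buffer_is_cache_safe() returns 0 for it. Aligning DMA buffers to the line size and padding them to a multiple of it avoids that overlap.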

erwango commented 1 week ago

@FRASTM We should document this in doc/releases/migration-guide-3.6.rst, since this change was made in v3.6.0 and other people might hit it.

prayassamriya commented 1 week ago

Do you mean that Zephyr adopters have to take care of data cache coherency when both cache and DMA are enabled? I thought that both cases should be handled by the respective drivers.

ashutoshpandey-eaton commented 1 week ago

@benediktibk @FRASTM Why is CONFIG_CACHE_MANAGEMENT implemented only for the M7 and not for other architectures, for example the Arm® Cortex®-M33? I can see that in stm32u5_init(), LL_ICACHE_Enable() is called directly.

benediktibk commented 1 week ago

Probably because nobody has implemented it yet. The previously mentioned #66524 does not cover all architectures; it focused on direct calls to SCB_EnableICache and SCB_EnableDCache. But feel free to open a PR for this; some additional clean-up is definitely welcome.