zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.44k stars 6.4k forks source link

timer: cortex_m_systick: uptime drifting in tickless mode #35073

Closed simonguinot closed 3 years ago

simonguinot commented 3 years ago

Abnormal uptime drifting can been observed in Zephyr when using the cortex_m_systick driver in tickless mode.

The following application code can be used to reproduce and highlight the issue. It runs a "no op" timer at 1000 Hz and writes the system uptime on console every 10 seconds. By printing the console messages (and some timestamps) on a host machine connected through serial, then it is possible to compare the Zephyr uptime with the host machine's clock. Although unreliable, this method is good enough to observe the drifting.

#include <zephyr.h>

#define TIMER1_PERIOD_MS 1
#define TIMER2_PERIOD_MS 10000 /* 10 secs */

static struct k_timer timer1;
static struct k_timer timer2;

static struct k_sem sem;

static void timer1_cb(struct k_timer *timer)
{
        /* No op. */
}

static void timer2_cb(struct k_timer *timer)
{
        k_sem_give(&sem);
}

void main(void)
{
        k_sem_init(&sem, 0, 1);

        k_timer_init(&timer1, timer1_cb, NULL);
        k_timer_init(&timer2, timer2_cb, NULL);

        k_timer_start(&timer1, K_NO_WAIT, K_MSEC(TIMER1_PERIOD_MS));
        k_timer_start(&timer2, K_NO_WAIT, K_MSEC(TIMER2_PERIOD_MS));

        do {
                k_sem_take(&sem, K_FOREVER);
                printk("%d\n", k_uptime_get_32());
        } while (1);
}

Here are the logs when running this application on a board embedding a STM32F070 MCU and an external oscillator / crystal. The console messages are sent over UART. And on the host machine side the minicom program (with the extended timestamps enabled) is used to print them.

[2021-05-10 18:00:36.615] 6
[2021-05-10 18:00:46.680] 10006
[2021-05-10 18:00:56.761] 20006
[2021-05-10 18:01:06.842] 30006
[2021-05-10 18:01:16.906] 40006
[2021-05-10 18:01:26.987] 50006
[2021-05-10 18:01:37.068] 60006
[2021-05-10 18:01:47.149] 70006
[2021-05-10 18:01:57.214] 80006
[2021-05-10 18:02:07.294] 90006
[2021-05-10 18:02:17.375] 100006
... snip ...
[2021-05-10 18:09:50.799] 550006
[2021-05-10 18:10:00.879] 560006
[2021-05-10 18:10:10.960] 570006
[2021-05-10 18:10:21.041] 580006
[2021-05-10 18:10:31.106] 590006
[2021-05-10 18:10:41.187] 600006

After 10 minutes, the Zephyr uptime is approximately 4.5 seconds behind the host machine clock, which is too much given the accuracy of the external oscillator (less than 150 PPM at 24 MHz).

And if I run the same application in the same conditions but with the TICKLESS_KERNEL option disabled and SYS_CLOCK_TICKS_PER_SEC set to 1000, then I get the following output:

[2021-05-10 18:32:38.923] 10
[2021-05-10 18:32:48.924] 10010
[2021-05-10 18:32:58.925] 20010
[2021-05-10 18:33:08.925] 30010
[2021-05-10 18:33:18.910] 40010
[2021-05-10 18:33:28.911] 50010
[2021-05-10 18:33:38.912] 60010
[2021-05-10 18:33:48.913] 70010
[2021-05-10 18:33:58.914] 80010
[2021-05-10 18:34:08.915] 90010
[2021-05-10 18:34:18.915] 100010
[2021-05-10 18:34:28.916] 110010
[2021-05-10 18:34:38.917] 120010
[2021-05-10 18:34:48.918] 130010
[2021-05-10 18:34:58.919] 140010
... snip ...
[2021-05-10 18:42:08.892] 570010
[2021-05-10 18:42:18.893] 580010
[2021-05-10 18:42:28.894] 590010
[2021-05-10 18:42:38.894] 600010

This time, after 10 minutes, the Zephyr uptime is looking much better. We can't conclude to any drifting.

simonguinot commented 3 years ago

He are the logs (STM32F070 MCU, external oscillator, tickless mode) of the test application with the patch proposed in PR #35062 applied:

[2021-05-10 19:08:36.819] 6
[2021-05-10 19:08:46.828] 10006
[2021-05-10 19:08:56.846] 20006
[2021-05-10 19:09:06.863] 30006
[2021-05-10 19:09:16.880] 40006
[2021-05-10 19:09:26.897] 50006
[2021-05-10 19:09:36.914] 60006
[2021-05-10 19:09:46.931] 70006
[2021-05-10 19:09:56.948] 80006
[2021-05-10 19:10:06.965] 90006
[2021-05-10 19:10:16.982] 100006
... snip ...
[2021-05-10 19:17:57.758] 560006
[2021-05-10 19:18:07.791] 570006
[2021-05-10 19:18:17.808] 580006
[2021-05-10 19:18:27.825] 590006
[2021-05-10 19:18:37.842] 600006

After 10 minutes the Zephyr uptime is now 1 second behind the host machine clock (against 4.5 seconds without the patch).

alexanderwachter commented 3 years ago

Hi, I made some tests on an stm32g4. I can observe a similar behavior there.

*** Booting Zephyr OS build zephyr-v2.5.0-3876-g21d1ad376230  ***                                                                                                                                                                           
[00:00:00.005,000] <inf> main: Init done                                                                                                                                                                                                    
[00:00:00.005,000] <inf> main: Timer ISR 0 Timer: 5                                                                                                                                                                                         
[00:00:00.005,000] <inf> main: Sys Timer ISR. Timer: 1000 Uptime: 5                                                                                                                                                                         
[00:00:01.005,000] <inf> main: Timer ISR 1 Timer: 1005                                                                                                                                                                                      
[00:00:01.005,000] <inf> main: Sys Timer ISR. Timer: 2000 Uptime: 1005                                                                                                                                                                      
[00:00:02.005,000] <inf> main: Timer ISR 2 Timer: 2005                                                                                                                                                                                      
[00:00:02.005,000] <inf> main: Sys Timer ISR. Timer: 3000 Uptime: 2005                                                                                                                                                                      
[00:00:03.005,000] <inf> main: Timer ISR 3 Timer: 3005                                                                                                                                                                                      
[00:00:03.005,000] <inf> main: Sys Timer ISR. Timer: 4000 Uptime: 3005                                                                                                                                                                      
[00:00:04.005,000] <inf> main: Timer ISR 4 Timer: 4005                                                                                                                                                                                      
[00:00:04.005,000] <inf> main: Sys Timer ISR. Timer: 5000 Uptime: 4005

....

[00:04:11.960,000] <inf> main: Timer ISR 252 Timer: 251960
[00:04:12.005,000] <inf> main: Sys Timer ISR. Timer: 253045 Uptime: 252005
[00:04:12.960,000] <inf> main: Timer ISR 253 Timer: 252960
[00:04:13.005,000] <inf> main: Sys Timer ISR. Timer: 254045 Uptime: 253005
[00:04:13.959,000] <inf> main: Timer ISR 254 Timer: 253959
[00:04:14.005,000] <inf> main: Sys Timer ISR. Timer: 255045 Uptime: 254005

Code: It uses a HW rimer module directly that definitly counts cycles:

/*
 * Copyright (c) 2020 Alexander Wachter
 *
 * SPDX-License-Identifier: Apache-2.0
 */

#include <zephyr.h>
#include <soc.h>
#include <device.h>
#include <devicetree.h>
#include <irq.h>
#include <stm32_ll_rcc.h>
#include <stm32_ll_tim.h>
#include <drivers/clock_control/stm32_clock_control.h>
#include <drivers/clock_control.h>

#include <logging/log.h>
LOG_MODULE_REGISTER(main);

static uint32_t timer_cnt;

void timer_isr(void* unused)
{
    if(LL_TIM_IsActiveFlag_UPDATE(TIM2)==SET)   //Overflow interrupt
    {
        const uint32_t uptime = k_uptime_get_32();
        LOG_INF("Timer ISR %d Timer: %d",timer_cnt, uptime);
        timer_cnt += 1;
    }
    LL_TIM_ClearFlag_UPDATE(TIM2);  //Clear interrupt flag 
}

void sys_timer_isr(struct k_timer* unused)
{
    const uint32_t uptime = k_uptime_get_32();
    uint32_t time = (uint32_t)(((uint64_t)timer_cnt * 1000000ULL + (uint64_t)LL_TIM_GetCounter(TIM2)) / 1000ULL);
    LOG_INF("Sys Timer ISR. Timer: %d Uptime: %d", time ,uptime);
}

void main(void)
{
    const uint32_t tim_clk = DT_PROP(DT_NODELABEL(rcc), clock_frequency);
    LL_TIM_InitTypeDef init = {
        .Prescaler =  __LL_TIM_CALC_PSC(tim_clk, 1000000U),
        .CounterMode = LL_TIM_COUNTERMODE_UP,
        .Autoreload = 1000000U,
        .ClockDivision = LL_TIM_CLOCKDIVISION_DIV1
    };
    const struct stm32_pclken pclken = {
        .enr = DT_CLOCKS_CELL(DT_NODELABEL(timers2), bits),
        .bus = DT_CLOCKS_CELL(DT_NODELABEL(timers2), bus),
    };
    const struct device *clock = DEVICE_DT_GET(STM32_CLOCK_CONTROL_NODE);
    struct k_timer timer;
    int ret;

    ret = clock_control_on(clock, (clock_control_subsys_t *)&pclken);
    if (ret < 0) {
        LOG_ERR("Could not initialize clock [%d]", ret);
        return;
    }
    if (LL_TIM_Init(TIM2, &init) != SUCCESS) {
        LOG_ERR("Timer Init failed");
    }

    IRQ_CONNECT(DT_IRQN(DT_NODELABEL(timers2)),
            DT_IRQ(DT_NODELABEL(timers2), priority),
            timer_isr, NULL, 0);

    LL_TIM_EnableIT_UPDATE(TIM2);
    LL_TIM_EnableARRPreload(TIM2);
    LL_TIM_EnableCounter(TIM2);

    LOG_INF("Init done");
    irq_enable(DT_IRQN(DT_NODELABEL(timers2)));
    k_timer_init(&timer, sys_timer_isr, NULL);
    k_timer_start(&timer, K_NO_WAIT, K_MSEC(1000));

    while(1) {
        k_msleep(10);
    }
}