zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.34k stars 6.33k forks source link

tests/kernel/timer/timer_api failing on several nRF5x SoCs #21082

Closed ioannisg closed 4 years ago

ioannisg commented 4 years ago

Describe the bug tests/kernel/timer/timer_api fails on several nRF SoCs and Dev Kits

To Reproduce Execute tests/kernel/timer/timer_api for the platforms listed above

Expected behavior The test should pass

Impact Concerns about the validity of the RTC timer driver implementation

Screenshots or console output

DEVICE: *** Booting Zephyr OS build v2.1.0-rc2-19-g105e0997f299  ***
DEVICE: Running test suite timer_api
DEVICE: ===================================================================
-- runners.nrfjprog: Board with serial number 960177424 flashed successfully.
DEVICE: starting test - test_time_conversions
DEVICE: PASS - test_time_conversions
DEVICE: ===================================================================
DEVICE: starting test - test_timer_duration_period
DEVICE: PASS - test_timer_duration_period
DEVICE: ===================================================================
DEVICE: starting test - test_timer_period_0
DEVICE: PASS - test_timer_period_0
DEVICE: ===================================================================
DEVICE: starting test - test_timer_expirefn_null
DEVICE: PASS - test_timer_expirefn_null
DEVICE: ===================================================================
DEVICE: starting test - test_timer_periodicity
DEVICE: PASS - test_timer_periodicity
DEVICE: ===================================================================
DEVICE: starting test - test_timer_status_get
DEVICE: PASS - test_timer_status_get
DEVICE: ===================================================================
DEVICE: starting test - test_timer_status_get_anytime
DEVICE: 
DEVICE:     Assertion failed at ZEPHYR_BASE/tests/kernel/timer/timer_api/src/main.c:335: test_timer_status_get_anytime: k_timer_status_get(&status_anytime_timer) == 4 is false
DEVICE: 
DEVICE: FAIL - test_timer_status_get_anytime
DEVICE: ===================================================================
DEVICE: starting test - test_timer_status_sync
DEVICE: PASS - test_timer_status_sync
DEVICE: ===================================================================
DEVICE: starting test - test_timer_k_define
DEVICE: PASS - test_timer_k_define
DEVICE: ===================================================================
DEVICE: starting test - test_timer_user_data
DEVICE: PASS - test_timer_user_data
DEVICE: ===================================================================
DEVICE: starting test - test_timer_remaining_get
DEVICE: PASS - test_timer_remaining_get
DEVICE: ===================================================================
DEVICE: Test suite timer_api failed.
DEVICE: ===================================================================
DEVICE: PROJECT EXECUTION FAILED
1/1 nrf5340_dk_nrf5340_cpuapp tests/kernel/timer/timer_api/kernel.timer          FAILED N/A (device 9.513s)

Environment (please complete the following information):

ioannisg commented 4 years ago

@kl-cruz @nika-nordic please, let me know if you can reproduce this

ioannisg commented 4 years ago

FYI @anangl @pabigot

pabigot commented 4 years ago

My test for accuracy of k_busy_wait shows that invocation on nrf5340_dk_nrf5340_cpuapp delays for twice the requested time.

req meas
1 3.25
2 5
5 11.25
10 21.25
15 31.25
20 41.5
50 101.8
50 100.2
50 101.8
75 152.2
75 152
75 152
100 202.5
100 202.5
100 202.7
1000000 2.014 (s)

I don't have the other boards to inspect. Check the HAL implementation of nrfx_coredep_delay_us on those platforms.

kl-cruz commented 4 years ago

@pabigot could you add executing of this function right before test? It will work only for nRF5340_cpuapp.

#include <hal/nrf_cache.h>
...
nrf_cache_enable(NRF_CACHE);

ref: https://github.com/NordicSemiconductor/nrfx/blob/master/hal/nrf_cache.h#L60

anangl commented 4 years ago

@ioannisg Regarding these two:

  • nrf52810_pca10040 (on the original 52810 HW)
  • nrf52811_pca10056 (on the original 52811 HW)

This is expected, as definitions of these boards are adjusted to give accurate delays on nRF52832 and nRF52840, respectively. See CMakeLists.txt for nrf52810_pca10040 and nrf52811_pca10056.

pabigot commented 4 years ago

nrf_cache_enable(NRF_CACHE);

@kl-cruz This doesn't seem to have any effect.

pabigot commented 4 years ago

Perhaps it's a wait state configuration issue?

ioannisg commented 4 years ago

@ioannisg Regarding these two:

  • nrf52810_pca10040 (on the original 52810 HW)
  • nrf52811_pca10056 (on the original 52811 HW)

This is expected, as definitions of these boards are adjusted to give accurate delays on nRF52832 and nRF52840, respectively. See CMakeLists.txt for nrf52810_pca10040 and nrf52811_pca10056.

@anangl agreed, you 're right on this. I'll update the bug report description

pabigot commented 4 years ago

Is anything happening with this? I'm changing this to medium, and it should probably be high. @carlescufi

u32_t t0 = k_cycle_get_32();
k_busy_wait(1000000);
u32_t dt = k_cycle_get_32() - t0;
printk("1M us is %u ticks = %u us\n", dt, k_cyc_to_us_floor32(dt));

produces 1M us is 66080 ticks = 2016601 us on nrf5340_dk_nrf5340_cpuapp.

carlescufi commented 4 years ago

@ioannisg, @anangl and @kl-cruz can you please follow-up on this one? I am raising the priority to high because as @pabigot mentioned, this is a critical issue for nRF ICs.

nika-nordic commented 4 years ago

I've just check this for nrf52833_pca10100 on master (SHA: 9128200e) and tests pass just fine.

*** Booting Zephyr OS build zephyr-v2.1.0-347-g9128200e6cf3  ***
Running test suite timer_api
===================================================================
starting test - test_time_conversions
PASS - test_time_conversions
===================================================================
starting test - test_timer_duration_period
PASS - test_timer_duration_period
===================================================================
starting test - test_timer_period_0
PASS - test_timer_period_0
===================================================================
starting test - test_timer_expirefn_null
PASS - test_timer_expirefn_null
===================================================================
starting test - test_timer_periodicity
PASS - test_timer_periodicity
===================================================================
starting test - test_timer_status_get
PASS - test_timer_status_get
===================================================================
starting test - test_timer_status_get_anytime
PASS - test_timer_status_get_anytime
===================================================================
starting test - test_timer_status_sync
PASS - test_timer_status_sync
===================================================================
starting test - test_timer_k_define
PASS - test_timer_k_define
===================================================================
starting test - test_timer_user_data
PASS - test_timer_user_data
===================================================================
starting test - test_timer_remaining_get
PASS - test_timer_remaining_get
===================================================================
Test suite timer_api succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

When it comes to nRF53, in fact one test fails and we are investigating it.

ioannisg commented 4 years ago

@nika-nordic : assuming that you have reproduced the failure on the original SHA For pca10100; can you identify the change that has made the test pass in current master? I wonder if this test is simply fragile and depending on code paths may be either passing or failing, or if some patch has really fixed it for nrf52833.

anangl commented 4 years ago

Regarding nrf5340_dk_nrf5340_cpuapp, as @nika-nordic found during debugging the issue, the problem is in incorrect initialization of the SystemCoreClock variable. The core starts with HFCLK128M divided by 2. And the call to SystemCoreClockUpdate that is done at the end of SystemInit sets this variable properly to 64000000. However, the static initialization of the variable, performed after SystemInit, sets this variable incorrectly to 128000000. A temporary workaround for this issue, until the MDK code is corrected, may be to call SystemCoreClockUpdate at a later stage of the system initialization, for instance in nordicsemi_nrf53_init.

nika-nordic commented 4 years ago

@ioannisg In fact I were not able to reproduce the issue on the original SHA 105e099 for nrf52833. I was using SHA: 64f3473 for hal_nordic (commit which introduces nrf52833_pca10100 to the Zephyr). This is what I get when I run the test locally on nrf52833_pca10100:

*** Booting Zephyr OS build v2.1.0-rc2-19-g105e0997f299  ***
Running test suite timer_api
===================================================================
starting test - test_time_conversions
PASS - test_time_conversions
===================================================================
starting test - test_timer_duration_period
PASS - test_timer_duration_period
===================================================================
starting test - test_timer_period_0
PASS - test_timer_period_0
===================================================================
starting test - test_timer_expirefn_null
PASS - test_timer_expirefn_null
===================================================================
starting test - test_timer_periodicity
PASS - test_timer_periodicity
===================================================================
starting test - test_timer_status_get
PASS - test_timer_status_get
===================================================================
starting test - test_timer_status_get_anytime
PASS - test_timer_status_get_anytime
===================================================================
starting test - test_timer_status_sync
PASS - test_timer_status_sync
===================================================================
starting test - test_timer_k_define
PASS - test_timer_k_define
===================================================================
starting test - test_timer_user_data
PASS - test_timer_user_data
===================================================================
starting test - test_timer_remaining_get
PASS - test_timer_remaining_get
===================================================================
Test suite timer_api succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL
nika-nordic commented 4 years ago

Thank you @anangl for detailed description. To show that this SystemCoreClockUpdate() really helps in case of nRF53, just add the call to this function inside main of tests/kernel/timer/timer_api/src/main.c:

void test_main(void)
{
        SystemCoreClockUpdate();

    timer_init(&duration_timer, duration_expire, duration_stop);
    timer_init(&period0_timer, period0_expire, NULL);
(...)