Open diggit opened 5 years ago
Oh, and Argon is running in tickless mode.
Also, Argon keeps wake-up times in ticks. When the SysTick period is clamped to a value that is not a multiple of the tick duration, more timing errors arise.
Hi @flit, again, I have a project which would benefit from such a light RTOS. So I dove into debugging Argon again and found several bugs in tickless mode timing (tick mode not tested). It's all about requests made to Argon between ticks (the aligned times at which scheduler ticks can occur). Are you interested in a discussion about fixes?
Definitely interested! 😄
Great! First, I have to say that SysTick really sucks for tickless mode.
Let's have this scenario: 10 ms scheduler time quantum.
Thread A:
Thread B:
`Thread B` is periodically resumed from some timer IRQ (it really does not matter which, it is just asynchronous to the tick).
Runtime:
- System starts, init is done (abs time = 0 ms, 0 ticks) and `Thread A` is started.
- `Thread A` finishes its job and requests a 10 ms sleep.
- 3 ms into that sleep, the asynchronous IRQ resumes `Thread B`.
- `Thread B` finishes its job and requests suspend.
- `ar_port_set_timer_delay` is called, the timer value is reset, and Argon completely loses the information about those 3 ms. The next wake-up is scheduled for absolute time 10 ms (1 tick).

In the extreme case, when F_async > 1/(kSchedulerQuanta_ms/1000) (more than one async event fits in the first kSchedulerQuanta_ms of sleep), ticks will never increment and `Thread A` never wakes.
The solution is to take the timer's current value into account when configuring the timer. In the scenario above, this would mean configuring the timer to 7 ms. SysTick is a kinda stupid timer and its value cannot be preset. The only option is to modify the reload value. This involves several issues:
The same time loss happens when a thread runs for several ms and then requests sleep at the end. The wake-up time is referenced to the moment of this call, which screws up timing for the other threads.
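A minimal sketch of the arithmetic behind that fix, assuming wake-ups are stored as absolute deadlines in microseconds and the port can read a free-running time value. `remaining_delay_us` is a hypothetical helper, not an Argon function.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Argon: time the timer must still run so
// that a wake-up lands on its absolute deadline, even when the timer is
// reconfigured in the middle of a sleep.
constexpr std::uint32_t remaining_delay_us(std::uint32_t wakeup_abs_us,
                                           std::uint32_t now_abs_us)
{
    // Modular subtraction keeps working across counter wrap-around as long
    // as the deadline is less than 2^31 us away. A "negative" difference
    // means the deadline already passed, so the result is clamped to 0.
    return ((wakeup_abs_us - now_abs_us) < 0x80000000u)
               ? (wakeup_abs_us - now_abs_us)
               : 0u;
}

// Scenario from this thread: Thread A sleeps until 10 ms and the timer is
// reconfigured at 3 ms, so it must be set to the remaining 7 ms.
static_assert(remaining_delay_us(10000u, 3000u) == 7000u, "7 ms remain");
static_assert(remaining_delay_us(10000u, 12000u) == 0u, "late: fire now");
```

Because the deadline is absolute, it does not matter how long a thread ran before requesting sleep or how often the timer is reconfigured in between.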
Same thread configuration as in the issue above.
Runtime:
- System runs for some time...
- `Thread A` finishes its job and requests a 40 ms sleep.
- The asynchronous IRQ resumes `Thread B`.
- `Thread B` finishes its job and requests suspend.

The issue is that the scheduler calls `ar_kernel_increment_tick_count(elapsed ticks)`, where elapsed ticks is 2, and this increment is done more than once.
IMO the cleanest solution would be to move the handling of all ticking to `ar_port.cpp` together with `tickCount`, and probably get rid of `missedTickCount` (why do we need it instead of just incrementing ticks even when the kernel is locked?).
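One way the port could own the tick count so that elapsed ticks are never applied twice is sketched below. It assumes a free-running microsecond counter is available. `PortTicks` and its members are hypothetical names, not code from Argon or from the branch.

```cpp
#include <cstdint>

// Hypothetical port-side tick bookkeeping (illustrative names only).
class PortTicks {
public:
    explicit PortTicks(std::uint32_t us_per_tick) : us_per_tick_(us_per_tick) {}

    // Called with the current value of a free-running microsecond counter.
    // Returns the whole ticks that elapsed since the previous call and
    // accounts for them exactly once.
    std::uint32_t consume_elapsed_ticks(std::uint32_t now_us)
    {
        const std::uint32_t elapsed_us = now_us - accounted_us_;
        const std::uint32_t ticks = elapsed_us / us_per_tick_;
        accounted_us_ += ticks * us_per_tick_; // the sub-tick remainder is kept
        tick_count_ += ticks;
        return ticks;
    }

    std::uint32_t tick_count() const { return tick_count_; }

private:
    std::uint32_t us_per_tick_;
    std::uint32_t accounted_us_ = 0; // time already converted into ticks
    std::uint32_t tick_count_ = 0;
};
```

With a 10 ms tick, two calls at 23 ms return 2 and then 0, so a repeated scheduler pass cannot increment the count again.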
Note: The issues may not be clear at first, but I've spent several days debugging Argon, drew numerous timing diagrams, and did some poor man's tracing.
I have a working tickless mode with precise timing on STM32F303 with 2 chained timers/counters. The first one counts microseconds between kernel ticks (16-bit) and the second one counts overflows = ticks (32-bit). The wake-up interrupt is triggered by a compare match on the tick counter. Values and reloads are never altered. Counters are never stopped. When Argon wants the time, it always gets the value from the counters. Not deeply tested, but at first glance it works.
LED = LED_1, ASYNC = LED_2
Some broken timing can be seen in #10.
I'll try to implement timing with a single timer, without the need to chain 2 timers. I have to clean up my code and will link the relevant branch here.
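The chained-counter arrangement described above can be modeled in software roughly as follows. This is only an illustration of the idea, not the STM32 port code: `ChainedCounters` and `advance()` are hypothetical, and `advance()` stands in for the hardware clocking the counters.

```cpp
#include <cstdint>

// Software model of two hardware-chained counters. Illustrative only: a real
// port would configure timer peripherals instead of calling advance().
class ChainedCounters {
public:
    explicit ChainedCounters(std::uint16_t us_per_tick)
        : us_per_tick_(us_per_tick), sub_tick_us_(0), ticks_(0), compare_(0) {}

    // Kernel side: program a wake-up. Only the compare value changes; the
    // counters themselves are never written or stopped.
    void set_wakeup_tick(std::uint32_t tick) { compare_ = tick; }

    // Kernel side: absolute time always comes from the counters.
    std::uint64_t now_us() const
    {
        return static_cast<std::uint64_t>(ticks_) * us_per_tick_ + sub_tick_us_;
    }

    // Stand-in for the hardware clocking the counters by `us` microseconds.
    // Returns true if the tick counter reached the compare value on the way,
    // which is where the wake-up interrupt would fire.
    bool advance(std::uint32_t us)
    {
        const std::uint64_t total = static_cast<std::uint64_t>(sub_tick_us_) + us;
        const std::uint32_t overflows =
            static_cast<std::uint32_t>(total / us_per_tick_);
        // Match if compare_ lies in (ticks_, ticks_ + overflows].
        const bool fired = (compare_ - ticks_ - 1u) < overflows;
        sub_tick_us_ = static_cast<std::uint16_t>(total % us_per_tick_);
        ticks_ += overflows;
        return fired;
    }

private:
    std::uint16_t us_per_tick_;
    std::uint16_t sub_tick_us_; // first counter: microseconds within a tick
    std::uint32_t ticks_;       // second counter: overflows of the first
    std::uint32_t compare_;     // tick value that triggers the wake-up
};
```

Reprogramming the wake-up at 3 ms does not disturb the running counters in this model, so the compare match still happens at an absolute 10 ms.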
WIP here: https://github.com/diggit/argon-rtos/tree/ticklessFixes
Compare with master: https://github.com/flit/argon-rtos/compare/master...diggit:ticklessFixes
The default `ar_port.cpp` still needs conversion. Tick mode has not been tested yet.
I am also curious which version of C++ you target. Even C++11 has `constexpr` for constants, so you don't have to abuse enums for this purpose (in C++ code, obviously).
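For comparison, a small sketch of the two styles side by side. The value 10 is taken from the 10 ms quantum in the scenario above. The `_enum` suffix is added here only so both declarations can coexist and is not an Argon name.

```cpp
#include <cstdint>

// C++03 style: an anonymous enum used only to name an integer constant.
enum { kSchedulerQuanta_ms_enum = 10 };

// C++11 style: a typed compile-time constant.
constexpr std::uint32_t kSchedulerQuanta_ms = 10;

// Both can be used wherever a constant expression is required.
static_assert(kSchedulerQuanta_ms ==
                  static_cast<std::uint32_t>(kSchedulerQuanta_ms_enum),
              "same value");
static_assert(sizeof(char[kSchedulerQuanta_ms]) == 10,
              "usable as an array bound");
```

The main practical difference is that the `constexpr` variable has an explicit type, while the enumerator's type is chosen by the compiler.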
(Sorry I'm being slow to reply… this weekend I've been totally focused on implementing CMSIS-Pack debug sequence support for pyOCD.)
Timing
Thanks a ton for investigating these problems! I've known there are issues with timing in Argon, especially for tickless. The main cause of the timing alignment issue (that affects tick mode too) is that the timing granularity is only whole ticks, so as you've seen, you lose any fraction of a tick if rescheduling asynchronously.
These are where I'd like to take Argon in regards to timing:
(You can see some of these, and a lot more, in `doc/argon_todo.txt`.)
I was thinking the timer management should be moved to a separate `ar_port_timer` file. This would have to be chip-specific, or even application-specific, while the rest of `ar_port.c` works for any Cortex-M device.
I'll take a look at your ticklessFixes branch over the next few days/week. Again, I really appreciate your working on this! 😄
Language stuff
Argon nominally targets C99 and C++03. At the time, I would have liked to use C++11, but IAR didn't support it. Now, all 3 main Arm compilers support C11 and C++14, so it should be ok to switch. But I'd rather do it as a coherent change instead of piecemeal.
Btw, using enums for integer constants is a style widely used by Apple. Years ago, I used to primarily be a Mac developer, so it just looks natural to me. For related constants, I like how it syntactically groups them together.
Missed ticks
Regarding `missedTicks`, you're right it should be removed. Just need to refactor `ar_kernel_increment_tick_count()` to only check sleeping threads (rename it `ar_kernel_check_sleeping_threads()`), and move the update of `tickCount`.
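A rough sketch of what such a check-only pass could look like, assuming the tick count is maintained elsewhere and passed in. `SleepingThread` and `check_sleeping_threads` are hypothetical stand-ins and do not reflect Argon's actual data structures.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a kernel thread record.
struct SleepingThread {
    std::uint32_t wakeup_tick; // absolute tick at which the thread should wake
    bool sleeping;
};

// Sketch of a "check sleeping threads" pass: it does not modify the tick
// count, it only compares deadlines against the value it is given.
// Returns the number of threads woken.
inline std::size_t check_sleeping_threads(SleepingThread *threads,
                                          std::size_t count,
                                          std::uint32_t tick_count)
{
    std::size_t woken = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Wrap-safe "tick_count >= wakeup_tick" for a 32-bit tick counter.
        const bool due = (tick_count - threads[i].wakeup_tick) < 0x80000000u;
        if (threads[i].sleeping && due) {
            threads[i].sleeping = false;
            ++woken;
        }
    }
    return woken;
}
```

Since the function has no side effect on the tick count, calling it repeatedly for the same tick value wakes nothing further.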
No problem. I've never tried pyOCD and have always stuck to OpenOCD. Maybe it's time to try something new.
Yeah, tick<->ms<->us conversion is done in several places and it would make sense to use just one unit. Decoupling the scheduler tick from the timing resolution would also make sense. This could benefit from some C++ templating instead of introducing a lot of ifdefs.
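As one possible illustration of templated unit handling, the standard `std::chrono::duration` template (C++11) can carry the tick period in the type. This is a sketch under the assumption of a 10 ms quantum. The `Ticks` alias is hypothetical, and whether `<chrono>` is acceptable for Argon's targets is a separate question.

```cpp
#include <chrono>
#include <cstdint>
#include <ratio>

// Assumed 10 ms scheduler quantum, i.e. 100 ticks per second.
using Ticks = std::chrono::duration<std::uint32_t, std::ratio<1, 100>>;

// Conversions are checked by the type system and folded at compile time.
constexpr Ticks sleep_ticks =
    std::chrono::duration_cast<Ticks>(std::chrono::milliseconds(40));
static_assert(sleep_ticks.count() == 4u, "40 ms is 4 ticks");

// The lossless direction needs no cast: 3 ticks are exactly 30000 us.
constexpr std::chrono::microseconds three_ticks_us = Ticks(3);
static_assert(three_ticks_us.count() == 30000, "3 ticks are 30 ms");

// The lossy direction must be spelled out, so truncation is visible.
static_assert(
    std::chrono::duration_cast<Ticks>(std::chrono::milliseconds(25)).count() == 2u,
    "25 ms truncates to 2 ticks");
```

Changing the quantum then means changing one `std::ratio`, and every conversion that would silently lose a fraction of a tick requires an explicit `duration_cast`.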
Have a look at `ar_port_f303_chained.cpp`. I've used two timers that are chained in HW (STM32-specific). The first one counts microseconds between ticks and, when it overflows, increments the second counter (ticks). Configuring the wake-up delay is just a matter of changing the compare value of the second one. This requires quite large counters (the second one is 32 bits wide). `ar_port_f303_single.cpp` uses only one timer. The timer value is never adjusted, only the reload value. Some parts would be better done atomically, but that requires disabling IRQs for at least a short moment.
Definitely. Well dropping tick mode support is probably not so difficult.
What about moving the todos to an issue, which can be easily edited/updated even without committing changes?
Well, I am using LLVM and GCC; both support C++17, but even C++14 would be nice. The current mix of C++ and C is a bit dangerous. E.g. a C++ class extending a C struct may not be compatible: the C and C++ compilers may have different opinions about padding inside, etc. IMO the cleanest solution would be going full C++ internally and exposing C wrappers. (I have projects using C++20 on embedded, but that's a different story.)
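Some of these layout mismatches can be caught at compile time on the C++ side. The sketch below uses hypothetical types (`c_thread`, `Thread`) rather than Argon's own, and it only guards the C++ half: it cannot prove that a separate C compiler lays out the struct identically.

```cpp
#include <type_traits>

// Hypothetical C-visible struct (illustrative; Argon's real types differ).
extern "C" {
struct c_thread {
    void *stack_pointer;
    unsigned priority;
};
}

// C++ class extending the C struct with behavior only.
class Thread : public c_thread {
public:
    void set_priority(unsigned p) { priority = p; }
    // No virtual functions and no extra data members: adding either could
    // change the layout that C code expects.
};

// Fails to compile if someone later adds a virtual function or a member
// that breaks layout compatibility with the C struct.
static_assert(std::is_standard_layout<Thread>::value,
              "Thread must stay layout-compatible with c_thread");
static_assert(sizeof(Thread) == sizeof(c_thread),
              "the C++ wrapper must not add storage");
```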
Already ditched them in my branch. The function was renamed to `ar_kernel_tick_process()` in my branch, but that is kind of temporary. All ticking was moved to the port, including their storage.
Unfortunately IAR only supports C++14 (not even C++17!). Although I'm not using IAR much anymore myself, I'd still like to keep compatibility with the 3 major compilers.
Long ago (like mid-2000s), in a pre-Argon RTOS for ARM7, I did have a full C++ implementation with C wrappers. Unfortunately, this configuration makes it very difficult to manage statically-allocated kernel objects from the C wrappers. It also is the least efficient as far as code-size (you can't place calls to the C++ objects in inline C functions, since you can't import the C++ headers at the C API level).
> ...very difficult to manage statically-allocated kernel objects from the C wrappers

Can you elaborate?
In the following code, `int test_double_me_and_add(int num)` can be called from C code.
    class Test {
        int member {99};

    public:
        int double_me_and_add(int num) { return num * 2 + member; }
    };

    Test g_t;

    // C-callable wrapper around the C++ member function.
    extern "C" int test_double_me_and_add(int num) {
        return g_t.double_me_and_add(num);
    }
> It also is the least efficient as far as code-size (you can't place calls to the C++ objects in inline C functions, since you can't import the C++ headers at the C API level).

Well, as long as you compile sources individually and link them afterwards, almost every call within a single compilation unit can be inlined. Calls between compilation units can be inlined when the function body is in a header that the other compilation units include. If you enable `-flto`, inlining can happen between compilation units even when the function body is not in a header. Modern compilers are really amazing at optimization. You can even mark functions as `constexpr` and calculate some data at compile time.
Is IAR Embedded something different from IAR? Here they say that all features of C++17 are supported.
Even C++14 would be nice. Please consider migration of Argon to C++, there are so many benefits.
When a scheduled sleep is longer than the maximum SysTick period, sleep times become imprecise. Sleep times are then multiples of the SysTick maximum period, even the last one, which would probably be shorter than that. The problem is this line: https://github.com/flit/argon-rtos/blob/master/src/ar_kernel.cpp#L279. SysTick is not reconfigured and stays at the maximum period until it overshoots or matches the nextWakeup time. The sleep is then longer than required.
I'll try to fix it and open a PR.
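The intended behavior can be sketched as follows: on every timer expiry the period is recomputed from what is still left, so only the last run is shorter than the maximum. This is an illustration, not the actual fix. `kMaxTimerCounts`, `next_timer_period` and `simulate_sleep` are hypothetical names, and the limit assumes SysTick's 24-bit reload register.

```cpp
#include <cstdint>

// Assumed maximum period of the hardware timer, in timer counts. SysTick has
// a 24-bit reload register; the exact usable maximum is port-specific.
constexpr std::uint32_t kMaxTimerCounts = 0x00FFFFFFu;

// Hypothetical helper: period to program for the next timer run.
constexpr std::uint32_t next_timer_period(std::uint64_t remaining_counts)
{
    return (remaining_counts < kMaxTimerCounts)
               ? static_cast<std::uint32_t>(remaining_counts)
               : kMaxTimerCounts;
}

// Simulates a long sleep as a sequence of timer runs and returns the total
// time slept. Requires C++14 (loop inside a constexpr function).
constexpr std::uint64_t simulate_sleep(std::uint64_t requested_counts)
{
    std::uint64_t slept = 0;
    while (slept < requested_counts) {
        // Reconfigure on every expiry, so the last run covers only the rest.
        slept += next_timer_period(requested_counts - slept);
    }
    return slept;
}

// 40,000,000 counts need three runs: two full periods plus a short one.
static_assert(simulate_sleep(40000000u) == 40000000u, "no overshoot");
// Staying at the maximum period for all three runs would overshoot.
static_assert(3ull * kMaxTimerCounts > 40000000u, "max-period runs overshoot");
```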