Open JanStaschulat opened 3 years ago
I think that the sporadic scheduler is overly complex, and I had always planned to redesign it. If someone is interested in that redesign, I would be happy to share my thoughts.
* When I configure the application with one sporadic thread and one FIFO thread (with lower priority and 100% CPU utilization), budget enforcement works well.
I think in this case, to observe the full range of behaviors, you would need three threads. The two that you are using now:
And also:
Without this high priority thread, it might be the case that the sporadic scheduler is just broken and the failure has nothing to do with having two sporadic threads.
When I tested this many years ago, I added GPIO outputs from the OS task scheduler hooks. Then I could see the task behavior on a logic analyzer. I don't think anyone has used the sporadic scheduler since then, so it could suffer from bit rot or may never have been properly verified.
There are two known issues with the sporadic scheduler in the top-level TODO list:
You don't mention what the failure behavior is. You say "budget enforcement does not work". It might be helpful to know what you mean by that.
Thanks for your quick feedback. I ran a couple of experiments with this test program
Some links to the source code:
The results show that NuttX does not schedule two sporadic threads according to the specified budgets:
Experimental Results:
Setup:
config:
thread 1:
- SCHED_SPORADIC,
- prio high 180, prio low 20,
- budget 10ms, period 100ms
- max replenishments = 100
thread 2:
- FIFO-thread,
- prio: 120
thread 3:
- SCHED_SPORADIC,
- prio high 179, prio low 19,
- budget 30ms, period 100ms
- max replenishments = 100
Hardware:
- Olimex board (STM32),
- NuttX OS
Experiment:
- callback function in each thread has a busy_loop of 1ms and increments a counter
- experiment runs for 10 seconds
- at the end, the counter values of all threads are reported, i.e. the number of milliseconds
each thread could execute in the 10-second interval (10000 milliseconds in total)
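For readers who want to reproduce the setup, the thread parameters listed above could be applied roughly as follows. This is only a sketch: `struct sporadic_cfg`, `sporadic_cfg_valid` and `sporadic_attr_init` are hypothetical names, the `sched_ss_*` fields are the POSIX sporadic-server members of `struct sched_param`, and the code assumes that a build with sporadic scheduling enabled defines `CONFIG_SCHED_SPORADIC`. The test program linked in this thread may do it differently.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the per-thread parameters listed in the setup. */
struct sporadic_cfg
{
  int  prio_high;   /* priority while budget remains */
  int  prio_low;    /* priority once the budget is exhausted */
  long budget_ns;   /* execution budget per period (below 1 s here) */
  long period_ns;   /* replenishment period (below 1 s here) */
  int  max_repl;    /* maximum number of pending replenishments */
};

/* Basic sanity check; the exact limits enforced by the OS are not modeled. */
bool sporadic_cfg_valid(const struct sporadic_cfg *cfg)
{
  assert(cfg != NULL);
  return cfg->prio_low < cfg->prio_high &&
         cfg->budget_ns >= 0 &&
         cfg->period_ns > 0 && cfg->period_ns < 1000000000L &&
         cfg->budget_ns <= cfg->period_ns &&
         cfg->max_repl >= 1;
}

/* Returns 0 on success, a positive errno value from the pthread calls,
 * -1 for an invalid configuration, or -2 when the build has no sporadic
 * scheduling support (assumption: signalled by CONFIG_SCHED_SPORADIC). */
int sporadic_attr_init(pthread_attr_t *attr, const struct sporadic_cfg *cfg)
{
  assert(attr != NULL);
  if (!sporadic_cfg_valid(cfg))
    {
      return -1;
    }

#ifdef CONFIG_SCHED_SPORADIC
  struct sched_param param;
  int ret;

  ret = pthread_attr_init(attr);
  if (ret != 0)
    {
      return ret;
    }

  ret = pthread_attr_setschedpolicy(attr, SCHED_SPORADIC);
  if (ret != 0)
    {
      return ret;
    }

  param.sched_priority               = cfg->prio_high;
  param.sched_ss_low_priority        = cfg->prio_low;
  param.sched_ss_repl_period.tv_sec  = 0;
  param.sched_ss_repl_period.tv_nsec = cfg->period_ns;
  param.sched_ss_init_budget.tv_sec  = 0;
  param.sched_ss_init_budget.tv_nsec = cfg->budget_ns;
  param.sched_ss_max_repl            = cfg->max_repl;
  return pthread_attr_setschedparam(attr, &param);
#else
  (void)attr;
  return -2;
#endif
}
```

Thread 1 from the setup would then be described as `{180, 20, 10000000L, 100000000L, 100}` and the resulting attribute object passed to `pthread_create`.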
Exp 1: (one sporadic thread and FIFO thread)
configuration with
- thread 1: sporadic thread with budget = x ms and period=100ms
- thread 2: low-prio FIFO thread
config       result       result
sporadic 1   sporadic 1   fifo
budget(ms)   (ms)         (ms)
---------------------------------
0            96           9815
10           1074         8837
20           2043         7868
30           3014         6896
40           3985         5925
50           4956         4953
60           5920         3990
70           6899         3010
80           7870         2039
90           8804         1105
100          9910         0
Exp 2 (two sporadic threads and FIFO thread)
Keep sporadic thread 2 with 30/100ms budget/period, vary budget of thread 1 from 0 - 100ms
configuration with
- thread 1: sporadic thread with budget = x ms and period=100ms, prio see above
- thread 3: sporadic thread with budget = 30 ms and period=100ms, prio see above
- thread 2: low-prio FIFO thread, prio see above
config       result       result       result
sporadic 1   sporadic 1   sporadic 2   fifo
budget(ms)   (ms)         (ms)         (ms)
------------------------------------------
0            145          981          8784
10           1073         971          7864
20           2044         10           7854
30           3013         0            6895
40           9909         0            0
50           9909         0            0
60           9909         0            0
70           9909         0            0
80           9908         0            0
90           9908         0            0
100          9909         0            0
Exp 3 (two sporadic threads and FIFO thread)
Keep sporadic thread 1 with 30/100ms budget, vary budget of thread 2 from 0 - 100ms
configuration with
- thread 1: sporadic thread with budget = 30 ms and period=100ms, prio see above
- thread 3: sporadic thread with budget = x ms and period=100ms, prio see above
- thread 2: low-prio FIFO thread, prio see above
config       result       result       result
sporadic 2   sporadic 1   sporadic 2   fifo
budget(ms)   (ms)         (ms)         (ms)
----------------------------------------
0            5246         4661         0
10           7091         2816         0
20           9132         776          0
30           3015         0            6892
40           3016         9            6883
50           3015         49           6844
60           3016         2311         4581
70           3065         2484         4359
80           3015         48           6845
90           3015         91           6802
100          3053         4726         2128
Exp 4 (two sporadic threads and FIFO thread)
Same as Experiment 1, but both sporadic threads with the same priority settings
- sporadic 1: high prio 180, low prio 20
- sporadic 2: high prio 180, low prio 20
- fifo : prio 120
config       result       result       result
sporadic 1   sporadic 1   sporadic 2   fifo
budget(ms)   (ms)         (ms)         (ms)
------------------------------------------
0            144          981          8784
10           1073         971          7864
20           2044         10           7854
30           3015         0            6892
40           39           3044         6825
50           4957         4950         0
60           4427         1559         3923
70           6880         3028         0
80           7840         2066         0
90           8802         1105         0
100          9861         47           0
Example raw output:
Sporadic thread 1: prio high: 180, low: 20, budget: 10000000
pthread_create: budget 0 s 10000000 ns ticks: 10 , period 0 s 100000000 ns ticks 100
thread id 8
sporadic thread 2: at prio high 179 low: 19, budget: 30000000
pthread_create: budget 0 s 30000000 ns ticks: 30 , period 0 s 100000000 ns ticks 100
thread id 9
FIFO thread: prio 120
thread id 10
Result: sporadic 1 1074 ms sporadic 2 19 FIFO 8816 ms
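The raw output prints each budget and period both as seconds/nanoseconds and as ticks (10 ms maps to 10 ticks, 100 ms to 100 ticks), which is consistent with a 1 ms system tick. A minimal sketch of that conversion follows. The helper name `time_to_ticks`, the 1 ms tick length and the round-up behavior are assumptions made for illustration, not the conversion routine NuttX actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one system tick is 1 ms, as the log above suggests. */
#define ASSUMED_NSEC_PER_TICK 1000000ULL

/* Convert a seconds/nanoseconds pair to ticks, rounding up so that a
 * non-zero time never becomes zero ticks (rounding mode is an assumption). */
uint32_t time_to_ticks(uint32_t sec, uint32_t nsec)
{
  assert(nsec < 1000000000U);
  uint64_t total_ns = (uint64_t)sec * 1000000000ULL + nsec;
  return (uint32_t)((total_ns + ASSUMED_NSEC_PER_TICK - 1) /
                    ASSUMED_NSEC_PER_TICK);
}
```

With these assumptions, `time_to_ticks(0, 10000000)` gives 10 and `time_to_ticks(0, 100000000)` gives 100, matching the two `pthread_create` lines in the log.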
Discussion:
Will this bug be fixed?
Apache projects do not have the kind of project organization that can answer that question. The bug will be fixed if some individual in the community decides to work on it as a contribution. No one is offering that now.
As a starting point, I will clean up your test example and incorporate it into the OS test. It found an important bug so it is of value and should be a part of the test. I'll also create some sporadic configuration to exercise the test and replicate your bug.
I have incorporated a modified version of your test case into the OS test. This is #apache/incubator-nuttx/3097 and #apache/incubator-nuttx-apps/620
Here is some sample output (using your priorities):
user_main: Dual sporadic thread test
Sporadic 1: prio high 180, low 20, repl 100000000 ns
Sporadic 2: prio high 180, low 20, repl 100000000 ns
1 Sporadic 1 budget 000000000 ns 58438 ms
Sporadic 2 budget 030000000 ns 41757 ms
2 Sporadic 1 budget 010000000 ns 58449 ms (essentially the same as a budget of zero).
Sporadic 2 budget 030000000 ns 41747 ms
3 Sporadic 1 budget 020000000 ns 91854 ms
Sporadic 2 budget 030000000 ns 8352 ms
4 Sporadic 1 budget 030000000 ns 100208 ms
Sporadic 2 budget 030000000 ns 0 ms
5 Sporadic 1 budget 040000000 ns 8451 ms
Sporadic 2 budget 030000000 ns 91755 ms
6 Sporadic 1 budget 050000000 ns 58417 ms
Sporadic 2 budget 030000000 ns 41779 ms
NOTE:
These values are very consistent from run to run in my current setup but probably differ in other situations.
Budget values above 50 MS would exceed the maximum of half of the replenishment interval and would not be expected to work with any accuracy.
Although there are some failures, in general it looks better than the values that you reported above. The only functional difference (with the above test) is that I did remove the FIFO nuisance thread that you claimed was not necessary.
Each test case is 100,000 MS total. Expected results:
BUDGETS EXPECTED ACTUAL RESULT
1. sporadic 1 budget 0% : >= 0 MS 58438 OK
sporadic 2 budget 30% : >= 30,000 MS 41757 OK
2. sporadic 1 budget 10% : >= 10,000 MS 58449 OK
sporadic 2 budget 30% : >= 30,000 MS 41747 OK
3. sporadic 1 budget 20% : >= 20,000 MS 91854 OK
sporadic 2 budget 30% : >= 30,000 MS 8352 FAIL!!!
4. sporadic 1 budget 30% : >= 30,000 MS 100208 OK (but used ALL of the interval)
sporadic 2 budget 30% : >= 30,000 MS 0 FAIL!!!
5. sporadic 1 budget 40% : >= 40,000 MS 8451 FAIL!!!
sporadic 2 budget 30% : >= 30,000 MS 91755 OK
6. sporadic 1 budget 50% : >= 50,000 MS 58417 OK
sporadic 2 budget 30% : >= 30,000 MS 41779 OK
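The OK/FAIL column above follows a simple rule: over the 100,000 ms test case a thread should accumulate at least budget/period of the total time, and any extra time it gets at its low priority does not count against it. A sketch of that rule is shown below; `min_expected_ms` and `budget_met` are hypothetical helper names and are not part of the OS test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Lower bound on the run time (ms) a sporadic thread should reach over
 * total_ms, given its budget and replenishment period in nanoseconds. */
uint32_t min_expected_ms(uint32_t budget_ns, uint32_t period_ns,
                         uint32_t total_ms)
{
  assert(period_ns > 0 && budget_ns <= period_ns);
  return (uint32_t)(((uint64_t)total_ms * budget_ns) / period_ns);
}

/* True when the measured run time satisfies the ">= expected" criterion. */
bool budget_met(uint32_t budget_ns, uint32_t period_ns,
                uint32_t total_ms, uint32_t actual_ms)
{
  return actual_ms >= min_expected_ms(budget_ns, period_ns, total_ms);
}
```

For case 3, `budget_met(30000000, 100000000, 100000, 8352)` is false (sporadic 2 fails), while `budget_met(20000000, 100000000, 100000, 91854)` is true (sporadic 1 passes).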
I believe that this may be largely an artifact of the identical priorities for the two sporadic threads. Consider this priority change:
user_main: Dual sporadic thread test
Sporadic 1: prio high 180, low 20, repl 100000000 ns
Sporadic 2: prio high 170, low 30, repl 100000000 ns
1 Sporadic 1 budget 000000000 ns 8348 ms
Sporadic 2 budget 030000000 ns 91853 ms
2 Sporadic 1 budget 010000000 ns 16707 ms
Sporadic 2 budget 030000000 ns 83495 ms
3 Sporadic 1 budget 020000000 ns 25064 ms
Sporadic 2 budget 030000000 ns 75142 ms
4 Sporadic 1 budget 030000000 ns 33422 ms
Sporadic 2 budget 030000000 ns 66785 ms
5 Sporadic 1 budget 040000000 ns 41777 ms
Sporadic 2 budget 030000000 ns 58429 ms
6 Sporadic 1 budget 050000000 ns 50125 ms
Sporadic 2 budget 030000000 ns 50081 ms
Expected results:
BUDGETS EXPECTED ACTUAL RESULT
1. sporadic 1 budget 0% : >= 0 MS 8348 MS OK
sporadic 2 budget 30% : >= 30,000 MS 91853 MS OK
2. sporadic 1 budget 10% : >= 10,000 MS 16707 MS OK
sporadic 2 budget 30% : >= 30,000 MS 83495 MS OK
3. sporadic 1 budget 20% : >= 20,000 MS 25064 MS OK
sporadic 2 budget 30% : >= 30,000 MS 75142 MS OK
4. sporadic 1 budget 30% : >= 30,000 MS 33422 MS OK
sporadic 2 budget 30% : >= 30,000 MS 66785 MS OK
5. sporadic 1 budget 40% : >= 40,000 MS 41777 MS OK
sporadic 2 budget 30% : >= 30,000 MS 58429 MS OK
6. sporadic 1 budget 50% : >= 50,000 MS 50125 MS OK
sporadic 2 budget 30% : >= 30,000 MS 50081 MS OK
The fact that this priority change eliminates the problem still suggests to me that there is some issue, but that it is more subtle than it originally appeared. Some of this is misleading too: By raising thread 2's lower priority to 30, it always runs for most of the replenishment interval. It would be better to have a CPU hog FIFO thread at a priority of about 100. Then neither sporadic thread could run in its lower priority state and we should then see the counts only for the sporadic threads when they are in the higher priority state.
@patacongo thanks for including it in the os-tests.
Yes, I agree, there should be a third thread scheduled with FIFO (like in my test setup) that eats up the remaining cycles. Proposed setup:
user_main: Dual sporadic thread test
Sporadic 1: prio high 180, low 20, repl 100000000 ns
Sporadic 2: prio high 180, low 20, repl 100000000 ns
FIFO : prio 100, (busy loop, which does computation all the time)
Then, a sporadic thread with a budget of e.g. 30 % shall also result in about 30% processing time, and not any value above 30%. I think with this setup you can properly verify the correctness of the sporadic server scheduling algorithm.
I did this in a different way: I added two counts, one when the priority is high and one when the priority is low. The high priority count should be equal to the budget. Low priority counts will occur when the CPU is IDLE and has nothing else to do.
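The two-counter idea can be sketched as follows: each time the busy loop completes one millisecond of work, the thread checks which priority it is currently running at and increments the matching counter. The names `struct prio_counts` and `account_ms` are hypothetical, and how the current priority is obtained is left to the caller; this is an illustration, not the code from the modified OS test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread counters: milliseconds of work done while at the
 * high (budget) priority and while at the low (background) priority. */
struct prio_counts
{
  unsigned long hi_ms;
  unsigned long lo_ms;
};

/* Account one millisecond of completed work to the matching counter.
 * current_prio is the priority the thread is running at right now;
 * prio_high is the configured high priority of that sporadic thread. */
void account_ms(struct prio_counts *counts, int current_prio, int prio_high)
{
  assert(counts != NULL);
  if (current_prio >= prio_high)
    {
      counts->hi_ms++;
    }
  else
    {
      counts->lo_ms++;
    }
}
```

If budget enforcement works, `hi_ms` should track the configured budget share, and `lo_ms` should only grow when nothing else wants the CPU.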
Now, I can see the problem more clearly. I will edit this comment and report the results in a few minutes. ... Here are the results of the modified test:
user_main: Dual sporadic thread test
Sporadic 1: prio high 180, low 20, repl 100000000 ns
Sporadic 2: prio high 170, low 30, repl 100000000 ns
THREAD BUDGET HI MS LO MS
1 Sporadic 1 000000000 8344 0
Sporadic 2 030000000 41757 50092
2 Sporadic 1 010000000 16706 0
Sporadic 2 030000000 41750 41742
3 Sporadic 1 020000000 25063 0
Sporadic 2 030000000 8352 66786
4 Sporadic 1 030000000 33421 0
Sporadic 2 030000000 0 66782
5 Sporadic 1 040000000 41775 0
Sporadic 2 030000000 0 58426
6 Sporadic 1 050000000 50123 0
Sporadic 2 030000000 0 50079
Now you can see that the behavior is the same as in your original report: The higher priority budget interval does not occur once the thread 1 budget equals or exceeds the thread 2 budget.
The modified test is incubator-nuttx-apps PR 623
PR #3111 corrects some of the problems, but not all:
user_main: Dual sporadic thread test
Sporadic 1: prio high 180, low 20, repl 100000000 ns
Sporadic 2: prio high 170, low 30, repl 100000000 ns
THREAD BUDGET HI MS LO MS
1 Sporadic 1 000000000 8342 0
Sporadic 2 030000000 41749 50095
2 Sporadic 1 010000000 16699 0
Sporadic 2 030000000 41745 41742
3 Sporadic 1 020000000 25056 0
Sporadic 2 030000000 8351 66784
4 Sporadic 1 030000000 33413 0
Sporadic 2 030000000 0 66779
5 Sporadic 1 040000000 41766 0
Sporadic 2 030000000 41733 16687
6 Sporadic 1 050000000 50114 0
Sporadic 2 030000000 41725 8348
It certainly does narrow the problem down to the case where both threads' budget times complete at approximately the same time.
I believe that I understand the problem. It is complex to explain.
That is consistent with the condition we see that causes the failure (i.e., with both budget intervals the same) and with the counting that we see in collected data (no high priority counts). But without any data, it is just a fantasy.
A solution would require additional state information to detect the case that thread 2 was not initially running. There is already a sporadic->suspended that is set to true when the thread is started. However, it is reset to false when thread 2 resumes (actually runs for the first time) so that information is lost.
Here is an improved description of the failure scenario:
I am not quite sure how to fix this.
Today, I planned to add some instrumentation in the form of debug output to a RAM log to analyze this problem. The RAM log is very fast so I did not expect any issues. However, I found that generating a lot of debug output would eliminate the problem. Even generating a small amount of debug output caused only some losses in budget.
This is bad in that it means there is no simple way to debug the issue. It is good, however, in that it supports the idea that it is a race condition that causes the problem. The primary effect of using the RAM log is very small timing delays.
there should be a third thread scheduled with FIFO (like in my test setup) that eats up the remaining cycles.
I have a hunch that this would eliminate the problem seen in the case where both budgets are 30 MS because I think it would eliminate the condition that leads to the race condition. However, that problem is a real issue so it is good for the time being that this test reveals the problem.
We published a paper using the sporadic scheduler of NuttX in the context of micro-ROS: https://arxiv.org/abs/2105.05590
Oh, I have a question about this, too.
If a sporadic thread blocks during its high-priority budget and then wakes up during its low-priority phase, it will just execute at low priority, even though its budget was never actually consumed during that replenishment interval.
But I think we should let the sporadic thread continue to run at high priority if its budget has not really been consumed and the replenishment time has not yet arrived.
I think the problem might be the watchdog. sporadic_budget_start starts a watchdog, and sporadic_budget_expire then sets the thread to low priority. Then sporadic_interval_start is called, and later sporadic_interval_expire is called. This means that once the thread is blocked during high priority, the high-priority budget watchdog still consumes the budget.
I came up with an idea that might be useful: Let's set only the replenishment watchdog. When the replenishment time comes, the thread's budget is replenished no matter how much of it is left. While the sporadic thread is running, let tcb->timeslice indicate the remaining budget. For example, replenishment = 5, budget = 2. tcb->timeslice is set to 2 initially. Once the budget of 2 is consumed, tcb->timeslice is set to 0. In this case, if the thread is blocked during high priority, it can still be rescheduled at high priority until its budget is consumed. This would need some modification to the scheduler.
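The difference between the two accounting schemes can be illustrated with a small tick-level toy model. It is a sketch under strong assumptions: a single thread, replenishment at fixed period boundaries, one-tick granularity, and a hypothetical function name `high_prio_ticks`. It does not model the NuttX implementation. With `wallclock_budget` set, the budget drains whenever it is non-zero, as with a budget watchdog that keeps running while the thread is blocked; without it, the budget drains only in ticks where the thread actually runs, as in the tcb->timeslice proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the ticks in which the thread runs at high priority.
 * ready[t] tells whether the thread wants the CPU in tick t. */
unsigned high_prio_ticks(unsigned period, unsigned budget,
                         const bool *ready, size_t nticks,
                         bool wallclock_budget)
{
  assert(period > 0 && budget <= period && ready != NULL);

  unsigned remaining = 0;
  unsigned high = 0;

  for (size_t t = 0; t < nticks; t++)
    {
      if (t % period == 0)
        {
          remaining = budget;   /* replenish in full, whatever is left */
        }

      bool ran_high = ready[t] && remaining > 0;
      if (ran_high)
        {
          high++;
        }

      /* Drain the budget: always (watchdog-like) or only when running. */
      if (remaining > 0 && (ran_high || wallclock_budget))
        {
          remaining--;
        }
    }

  return high;
}
```

Using the numbers from the example (period 5, budget 2) and a thread that runs in tick 0, blocks in ticks 1 and 2, and is ready again in ticks 3 and 4, the model yields 2 high-priority ticks with run-time accounting but only 1 with wall-clock accounting.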
I'm not sure whether the modified scheduling can still be called "sporadic scheduling". I've also read some other papers, and there are some differences. Here is the list:
1. Scheduling Aperiodic Tasks in Dynamic Priority Systems. This paper describes the "Dynamic Sporadic Server", which is a bit different from NuttX.
2. Aperiodic servers in a deadline scheduling environment.
3. QNX doc. This doc also describes sporadic scheduling.
4. Aperiodic Task Scheduling for Real-Time Systems. This paper describes sporadic scheduling under RM, I suppose.
@patacongo Thanks.
Hi,
I am using NuttX for micro-ROS on an STM32 microcontroller on an Olimex board. Link to example, which is a very simple extension of the NuttX example for sporadic scheduling in testing/ostest
Test setup:
Observation:
Problem description: I came to the conclusion that budget enforcement of the NuttX sporadic scheduling only works for one sporadic thread. For real applications, I would like to use multiple threads with sporadic scheduling.
Could you please check the implementation and give support?