lf-lang / reactor-c

A reactor runtime written in C
Other
10 stars 23 forks source link

GEDF scheduler violates EDF policy #426

Open edwardalee opened 1 month ago

edwardalee commented 1 month ago

The GEDF scheduler is supposed to execute all reactions with smaller (inferred) deadline regardless of level before executing any reactions with a larger deadline or with no deadline. Unfortunately, it does not do that, as shown by the following example (due to @soyerefsane). For some reason, instead of using a single reaction queue, the GEDF scheduler uses a separate reaction queue for each level (anybody know why?). This means that it will only prioritize a reaction with a deadline over another reaction if both reactions are at the same level.

In the following program, at each tag, detetctor2 should always execute before task1. In the current realization, it never executes before task1.

image
target C {
  timeout: 200ms,
  workers: 1
};

reactor Periodic(period:time = 50ms, name:string = "Unnamed") {
  timer trigger(0, period)

  output t: time

  reaction(trigger) -> t {=
    instant_t start_time = lf_time_physical_elapsed();
    lf_set(t, start_time);
    lf_print("%s started at physical time: " PRINTF_TIME, self->name, start_time);
  =} 
}

reactor Probe(dl:time = 2ms, name:string = "Unnamed") {
  input i: time
  output t: time

  reaction (i) -> t {=
    lf_set(t, lf_time_physical_elapsed());
    lf_print("%s started at physical time: " PRINTF_TIME, self->name, lf_time_physical_elapsed());
  =} deadline (dl) {=
    lf_set(t, lf_time_physical_elapsed());
    lf_print("%s VIOLATED DEADLINE at physical time: " PRINTF_TIME, self->name, lf_time_physical_elapsed());
  =}
}

main reactor {
  task1 = new Periodic(period = 50ms, name = "task1")
  detector1 = new Probe(dl = 50ms, name = "detector1")

  task1.t -> detector1.i  
  task2 = new Periodic(period = 50ms, name = "task2")
  detector2 = new Probe(dl = 25ms, name = "detector2")

  task2.t -> detector2.i

  reaction(task1.t, detector2.t) {=
    if (task1.t->is_present && detector2.t->is_present && task1.t->value < detector2.t->value) {
      lf_print_error_and_exit("EDF policy violated. detector2 should execute before task1 when both are triggered.");
    }
  =}
}

Sample output:

bin/PeriodicTask
---- System clock resolution: 1000 nsec
---- Start execution at time Sun May 12 18:30:03 2024
---- plus 75281000 nanoseconds
Environment 0: ---- Intializing start tag
Environment 0: ---- Spawning 1 workers.
task2 started at physical time: 1219000
task1 started at physical time: 1225000
detector2 started at physical time: 1228000
detector1 started at physical time: 1230000
FATAL ERROR: EDF policy violated. detector2 should execute before task1 when both are triggered.
erlingrj commented 1 month ago

The GEDF only prioritizes among reactions on the same level. I think this is intended. If not it would be hard to ensure the correct semantics.

There is this chain optimization which I believe should have enabled the execution of detector2 right after task2 since it is the only reaction enabled by it. So that would be a place to start looking. This is in the function schedule_output_reactions

edwardalee commented 1 month ago

The GEDF only prioritizes among reactions on the same level. I think this is intended. If not it would be hard to ensure the correct semantics.

I do suspect that this was intended, but I don't think it's the right choice. The pattern here illustrates why. Recall that a deadline specifies when a reaction should start. It is meant to be put on quick actuator reactions. It implicitly specifies completion deadlines for upstream reactions, and in fact those will have an "inferred" deadline that they automatically inherit. But notice that in this example, the upstream reaction completes on time, but the actuator fails to actuate on time because a lower priority reaction blocks it because it has a lower level. It makes the inferred deadline almost useless.

The original design of the reaction queue, which GEDF disables, does ensure correct semantics and will correctly execute this program, I think. In that design, the priority is calculated first using the (inferred) deadline. Any reaction with a sooner (inferred) deadline will be ahead in the queue of any reaction with a later deadline. Then it sorts by level. This preserves semantics because all reactions upstream of a reaction with a deadline inherit its deadline (or possibly an earlier deadline if they are upstream of multiple reactions with deadlines). Hence, whole chains have the same (or earlier) deadlines, and precedences are respected.

The GEDF scheduler disables this by using a distinct priority queue for each level. It also uses a distinct mutex for each of these priority queues. I can't see any advantage to this. It certainly doesn't reduce mutex contention because the levels are run in sequence. And it breaks the original intended design.

There is this chain optimization which I believe should have enabled the execution of detector2 right after task2 since it is the only reaction enabled by it. So that would be a place to start looking. This is in the function schedule_output_reactions

Actually, I'm not sure why this optimization in schedule_output_reactions doesn't make this program run correctly. The GEDF scheduler must be somehow also disabling this optimization.

But anyway, this optimization will not solve the problem in general. If there are two downstream reactions, the optimization is not used. The original design, however, can handle this scenario.

I think the right solution is to get rid of the multiple priority queues and multiple mutexes in GEDF and use one only (or more precisely, one per environment).

edwardalee commented 1 month ago

I just realized that deadlines need to be inferred across federates for this to work with federated execution. I suspect that this is not currently done. But probably this should be done anyway...