petervdonovan opened 1 year ago
This is a very good point. I would support switching to a last writer wins semantics. There are some subtleties, though:
Does "last writer wins" mean that both tags have to match? Or that only $g_2$ has to match?
I would want to require that only $g_2$ matches, but maybe that is just because I am coming in with a preconceived notion based on discussion #1307 about how things "should" work in order for the way I think about programs to work. Furthermore, if both tags have to match, then that does not solve the problem that "The logical time of the scheduled event now depends not only on the current time, the min delay, and the additional delay, but the state of the event queue as well."
With physical actions, it is not clear what this would mean because there is no well-defined $g_1$ when they are asynchronously scheduled. So we will likely be introducing an asymmetry between physical and logical actions.
It sounds like you are assuming that we would care about whether $g_1$ matches? In any case, there are ways to ensure that physical actions do not occur at the same time and cannot override each other, e.g. by requiring that readings of the physical clock be strictly increasing.
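To make the strictly-increasing-clock idea concrete, here is a minimal C sketch. The names are illustrative, not the actual runtime API: the point is only that bumping any non-increasing reading guarantees that two asynchronously scheduled physical actions can never be assigned the same tag.

```c
#include <stdint.h>

// Illustrative sketch: force physical-clock readings to be strictly
// increasing so that asynchronously scheduled physical actions never
// collide at the same tag. (Not thread-safe; a real runtime would need
// atomics or a lock around last_reading.)
static int64_t last_reading = 0;

int64_t strictly_increasing_read(int64_t raw) {
    if (raw <= last_reading) {
        raw = last_reading + 1;  // bump past the previous reading
    }
    last_reading = raw;
    return raw;
}
```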
Sounds rather like the "replace" policy described here: https://www.lf-lang.org/docs/handbook/actions?target=c#action-declaration Perhaps the desired behavior could be obtained with
logical action a(0, 0, "replace")
I would like to add to the discussion that the semantics described by @petervdonovan are exactly the ones implemented in the C++ target. There is an old issue stating that the policy in C++ should be updated to the one used in C: #236. Also note the discussions in this issue (they diverge in a different direction though).
The issue is not fixed yet for two reasons. First, I personally was never really convinced that the current C semantics is a sane default and, second, I did not yet encounter a use-case where it actually mattered.
I would be all for using the replace policy described here as the default. It seems more intuitive and can be easily predicted. By this I mean that if I call schedule at tag $(t, n)$ with delay $d$, then I know precisely that the new event is scheduled at tag $(t+d, 0)$. With the current C policy, it could be any tag $(t+d, m)$, and I cannot make any prediction about the value of $m$ because I don't know which other reactions might already have scheduled the action at $t+d$.
While writing this I also thought of an actual use case that is not possible to implement using the current C semantics. Imagine that we want to "unschedule" an action. I think there is an issue somewhere saying that it would be a great feature to be able to actually delete events from the event queue. Here, however, I mean overwriting an event's value (e.g., with nullptr) and thus marking it as invalid to the reactions triggered by it.
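A minimal C sketch of this "unschedule by overwriting" idea, under a last-writer-wins policy. The struct and function names here are invented for illustration; they are not part of the C runtime.

```c
#include <stddef.h>
#include <stdbool.h>

// Hypothetical sketch: under a replace policy, scheduling a NULL payload
// at the same intended tag overwrites the pending event's value, and
// downstream reactions treat a NULL value as "cancelled".
typedef struct {
    bool pending;
    void* value;
} pending_event_t;

void schedule_replace(pending_event_t* e, void* value) {
    e->pending = true;  // replace policy: the last writer wins
    e->value = value;   // a NULL value marks the event as invalid
}

bool is_cancelled(const pending_event_t* e) {
    return e->pending && e->value == NULL;
}
```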
I'll try to summarize what I understood from what Edward has said and from my conversation with @lhstrh.
It seems like we have three options on the table for how to assign microsteps to events/messages scheduled into the future: elementwise addition of logical times (also discussed in the issue that Christian linked #203 #236), the addition described here, and the queueing that is currently implemented for the C target.
I think I've underappreciated the usefulness of queueing since of the three, it seems the most connected to the established work (which I should spend more time reading). Here are some observations that might affect how the three options might be evaluated:
elementwise addition | weird addition | queueing
---|---|---
abelian group structure; Z-module | not even a group, but there is a comprehensible algebraic structure | no algebraic structure*
times when objects can be present are unions of parallel affine subspaces*** | times when objects can be present are predictable, but an elegant geometric characterization of them is not currently known | times when objects are present are predictable in their first element, but probably not in the second element (the microstep), except in special cases
most techniques discussed in #1307 apply | most techniques discussed in #1307 do not apply | 
analyses used in dataflow might be relatively difficult | analyses used in dataflow might be applicable to a useful subset of LF | 
messages/events get dropped sometimes | events get dropped all the time | events only get dropped when multiple reactions write to the same port
scheduling of logical actions is a generalization of message dropping that occurs when multiple reactions write to the same port | scheduling of logical actions is a departure from what happens when multiple reactions write to the same port** | 
events scheduled at the same time with the same delay will be simultaneous | events scheduled at the same time with the same delay might or might not be simultaneous | 
microsteps diverge in non-pathological programs that use microsteps | microsteps diverge if there is a Zeno condition | 
uncertainty about logical times is nondecreasing as one follows execution paths | uncertainty about the first element of the logical times is nondecreasing as one follows an execution path, but uncertainty about microsteps can increase or decrease | 
* "No algebraic structure" in this context means that the time at which an event is scheduled does not just depend on the current logical time and the delay with which the event is scheduled. It also depends on the state of the event queue. This dependency on global state makes it complex to analyze in the same ways as the other two approaches.
** We could smooth over this inconsistency, I suppose, by adopting a queueing policy when reactions write to the same port.
*** EDIT: I guess I should have written "submodules." Of course we know that events will not be present before `startup`, for example, but when I write "can" or "might" be present, I am talking about an overapproximation.
I'm having trouble figuring out what the three options that you mention actually are. Can you clarify? Issue #203 does not seem relevant.
> Issue https://github.com/lf-lang/lingua-franca/issues/203 does not seem relevant.
Oops -- fixed.
> Can you clarify?
Suppose a reaction executes at tag $g = (t, m)$ and schedules an event $(5 \text{ms}, 0)$ into the future. (This is how I interpret after delays -- the $0$ microstep is implicit). Suppose also that the event is already scheduled at time $(t + 5 \text{ms}, 0)$.
If a reaction instead schedules an event $(0, 1)$ into the future (the current `after 0` semantics), then the "elementwise addition" option behaves the same as before, the "weird addition" option behaves like "elementwise addition" because the first element of $(0, 1)$ is zero, and the "queueing" option behaves the same as before, incrementing the microstep as needed to avoid dropping or replacing an event.
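The first two options are pure tag arithmetic, so they can be pinned down in a few lines of C. This is a sketch under my own naming assumptions (`tag_t` here is just an illustrative struct); queueing is not shown because its result depends on the state of the event queue, which is exactly the objection raised above.

```c
#include <stdint.h>

// A logical tag (t, m): logical time plus microstep.
typedef struct {
    int64_t time;
    uint32_t microstep;
} tag_t;

// Elementwise addition: (t, m) + (d, k) = (t + d, m + k).
tag_t add_elementwise(tag_t g, int64_t d, uint32_t k) {
    return (tag_t){ g.time + d, g.microstep + k };
}

// "Weird" addition: a positive time delay resets the microstep to the
// delay's own microstep; a zero time delay adds microsteps.
tag_t add_weird(tag_t g, int64_t d, uint32_t k) {
    if (d > 0) return (tag_t){ g.time + d, k };
    return (tag_t){ g.time, g.microstep + k };
}
```

For a reaction at $(t, m)$ with $m > 0$, scheduling $(5\,\text{ms}, 0)$ lands at $(t + 5\,\text{ms}, m)$ under elementwise addition but at $(t + 5\,\text{ms}, 0)$ under weird addition; for $(0, 1)$ the two agree.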
I am convinced we should do element-wise addition. Volunteer to fix it in the C target? I'm not so sure whether we should do queueing or replacement when there is a collision. I guess the policies we have for actions could specify this.
Okay, that sounds reasonable. If we do elementwise addition then that would mean going "all in" on super dense time, in which case I would argue that queueing in the microstep dimension would defeat the purpose (of microstep predictability). Just to play devil's advocate, elementwise addition (even with the "drop" or "replace" policy which I like) does have downsides that the weird addition (and maybe queueing) do not have:
Ok, I think the proposal on the table is this:
If at tag (t, n) we schedule an event with delay d > 0, the intended tag of the event is (t+d, n). If d = 0, the intended tag is (t, n+1).
If an event has previously been scheduled at the intended tag, then it will be replaced.
I like the simplicity of this. It means we could remove the replacement policy from the LF syntax. If we really need it (I don't have use cases), then we could provide a runtime API function get_event(action, time, microstep) that returns NULL if there is no event on the event queue with the specified tag and the event otherwise.
Note that with physical actions, this simpler policy does not incur a risk of missing events because the physical clock is strictly increasing (at least in the C target).
Should we go ahead with this?
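For concreteness, here is a C sketch of the proposal as I read it. The data structures are illustrative (the queue is just an array), not the actual runtime's; the point is the two-case tag rule plus replace-on-collision.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct { int64_t time; uint32_t microstep; } tag_t;
typedef struct { tag_t tag; int payload; } event_t;

// Proposed rule: d > 0 -> (t + d, n); d == 0 -> (t, n + 1).
tag_t intended_tag(tag_t now, int64_t d) {
    if (d > 0) return (tag_t){ now.time + d, now.microstep };
    return (tag_t){ now.time, now.microstep + 1 };
}

// If an event already exists at the intended tag, replace its payload.
// Returns true if an existing event was replaced.
bool schedule_or_replace(event_t* queue, size_t n, tag_t now, int64_t d,
                         int payload) {
    tag_t g = intended_tag(now, d);
    for (size_t i = 0; i < n; i++) {
        if (queue[i].tag.time == g.time &&
            queue[i].tag.microstep == g.microstep) {
            queue[i].payload = payload;  // last writer wins
            return true;
        }
    }
    return false;  // appending a new event is omitted in this sketch
}
```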
I'm confused about the very first post in this thread. In the example:
reaction(startup) -> a, b {=
// Based on this reaction body, it looks like a and b will be logically simultaneous,
// but in fact a will be processed a microstep later. The user must know the state
// of the event queue in order to be aware of this misalignment.
lf_schedule_int(a, 0, 1);
lf_schedule_int(b, 0, 1);
=}
... it is stated that "it looks like a and b will be logically simultaneous, but in fact a will be processed a microstep later". Why is that? I would think that since `a` and `b` are different actions, their scheduling behaviour is totally independent of each other. [Related discussion on Zulip]
@petervdonovan, @edwardalee For me it is difficult to understand why you prefer the "elementwise addition" over the "weird addition". I have argued against the "queueing" in the past and I am fully on board with dropping it. However, to me the "elementwise addition" seems weird and I still find the "weird addition" (as you call it), most intuitive. To me it seems really strange to require that microsteps are monotonic. I will try to explain below why.
In summary, I think there is a lot that we would lose by using the "elementwise addition", but I could not extract from the discussion above what we would actually gain (other than that it is mathematically nicer to express). Could you maybe explain the reasoning behind this and the practical impact of using elementwise addition?
Yes, Christian and I seem to be on the same page about items 3 and 4, and items 2 and 5 both seem like examples where you sort of diverge along different subspaces (span of (1, 4) vs span of (1, 5), and span of (1, 0) vs. something else, respectively), which I agree might not be good.
I think a benefit of elementwise addition that is not merely aesthetic is that fewer events get dropped, since if you schedule something with the same delay into the future from two different microsteps, then they will not collide with each other. But I am not so sure that this is so beneficial, because it will still be possible to drop events in other cases, and because in many cases it is desirable to drop events (in a predictable way) to avoid overutilization.
To Christian's item 1, I agree, it is possible to imagine that it would be most intuitive for microsteps to start at zero. In the benchmarks, and perhaps even in non-pathological LF programs, it might be useful to support iterative programs. To do that you must iterate in the microstep dimension; in that case, the microstep is like the loop variable in a `for` loop, and it would be odd for it to start at a different place each time. The timely dataflow paper that was discussed on Zulip the other day places a lot of emphasis on this perspective -- although for it to really work, they need many dimensions in their microstep (so that a dimension can be used specifically for the current loop). I think this makes it possible to enforce strong consistency in the presence of iteration, which to my knowledge we cannot do, and which is related to the zero-delay federate cycles that we talked about before.
Is the potential dropping of events a real problem? I think it is safe to say that we have tested the "weird addition" for 2 years now, and so far I have not encountered a practical example where the dropping of events happened unintentionally. As described elsewhere, I actually consider the dropping a feature that allows one to modify and/or "unschedule" events (by overwriting them with an invalid value).
It seems to me like the proposed solution solves a rather academic problem at a relatively high cost.
Regarding item 1, the use of microsteps in languages like VHDL, I see this as much more like our "levels" than like our microsteps. And indeed, levels do reset to zero at every stage of time advance.
The Newton's cradle example, however, is a good one and indeed it could be the killer example that justifies the "weird addition." I would be much more comfortable seeing what happens when you actually build an LF model of this. Is there a way to avoid the divergence of microsteps even without the weird addition? I doubt it.
Any volunteer to reduce this example to a concrete LF test case?
Here's an attempt at summarizing what we talked about during our meeting.
- Use `replace` as the default policy (instead of `defer` or `drop`). It is the same kind of "last write wins" policy that applies to ports.
- Letting `replace` be the default policy for scheduling actions suggests that the implementation of `after` delays should also conform to it.

Existing syntax: `<origin> action <name>(<offset>, <spacing>, <policy>)` (e.g., `logical action a(0, 1ms, "defer")`).

Proposed syntax: `<modifier> <origin> <name>(<offset>, <spacing>)` (e.g., `deferrable logical action a(0, 1ms)`).

The idea is that when an action is "deferrable", either this directly implies use of the "defer" policy (equivalent to the old syntax), or it means that the programmer may choose to override the default "replace" behavior and defer on a case-by-case basis for each call to `schedule`. The latter would offer a bit more flexibility, but it also gives rise to an error condition that needs to be checked at runtime (i.e., what happens when the programmer attempts to use the "defer" mechanism on an action that is not marked as deferrable).
Regarding the question of how to expose the different policies (defer/drop/replace): I still think they might best be implemented in standard library reactors, because I think this approach provides
Here is an example:
target C { build-type: Debug }
/**
* Funnel messages from many channels into a single channel using the microstep dimension.
*/
reactor MessageFunnel(fan_in: int(2), buffer_size: int(20)) {
preamble {=
// FIXME: Must be kept in sync with buffer_size
#define BUFFER_SIZE 20
typedef int buffer[BUFFER_SIZE];
=}
input[fan_in] in: int
output out: int
state pending: buffer // Hardcoded buffer size :(
state queue_start: int(0)
state size: int(0)
logical action try_again
initial mode receiving {
reaction(in) -> out, reset(emptying_buffer) {=
int i = 0;
while (i < self->fan_in) {
if (in[i]->is_present) {
lf_set(out, in[i++]->value);
break;
}
i++;
}
if (enqueue_inputs(in, i, self->fan_in)) lf_set_mode(emptying_buffer);
=}
}
mode emptying_buffer {
logical action t
reaction(reset, t) -> t {= lf_schedule(t, 0); =}
reaction(in) {=
enqueue_inputs(in, 0, self->fan_in);
=}
reaction(reset, t) -> out, reset(receiving) {=
lf_set(out, self->pending[self->queue_start++]);
self->queue_start %= self->buffer_size;
self->size--;
if (!self->size) lf_set_mode(receiving);
=}
}
method enqueue_inputs(inputs: messagefunnel_in_t**, start: int, end: int): bool {=
bool enqueued = false;
for (int i = start; i < end; i++) {
if (inputs[i]->is_present) {
enqueued = true;
enqueue(inputs[i]->value);
}
}
return enqueued;
=}
method enqueue(value: int) {=
if (self->size == self->buffer_size) {
lf_print_error_and_exit("Buffer overflow in MessageFunnel.");
}
self->pending[(self->queue_start + self->size++) % self->buffer_size] = value;
=}
}
reactor Stdout {
input in: int
reaction (in) {=
lf_print("%d", in->value);
=}
}
reactor Count(bank_index: int(0), stop: int(3), step: int(1)) {
output out: int
initial mode active {
logical action a
state count: int(bank_index)
reaction(startup, a) -> a {= lf_schedule(a, 0); =}
reaction(a) -> out, reset(dead) {=
lf_print("Sending %d", self->count);
lf_set(out, self->count);
self->count += self->step;
if (self->count >= self->stop) lf_set_mode(dead);
=}
}
mode dead { /* GC ME! */ }
}
main reactor {
counts = new[2] Count(stop(10), step(2))
funnel = new MessageFunnel(fan_in(2))
stdout = new Stdout()
counts.out -> funnel.in
funnel.out -> stdout.in
}
Here is the output:
---- Start execution at time Tue Dec 20 14:52:37 2022
---- plus 73280136 nanoseconds.
---- Using 2 workers.
Sending 1
Sending 0
0
Sending 2
Sending 3
1
Sending 5
Sending 4
2
Sending 7
Sending 6
3
Sending 9
Sending 8
4
5
6
7
8
9
---- Elapsed logical time (in nsec): 0
---- Elapsed physical time (in nsec): 731,658
Nice! This could be generalized using tokens and the vector object you created so that the same code could be used for any data type. However, we have no support yet for polymorphic types in the C target. I've been thinking that we could have a `token` datatype and that any type would be supported, with primitive types getting automatically wrapped in a token.
Another approach would be to use macros, like
#define T int
// Code-generate the type of the self struct, the reaction functions, etc.
#undef T
Doing it this way increases code size because for every type that the generic reactor is instantiated with, you must redefine the generic type `T` and include the code corresponding to the reactor definition. However, this macro-style approach can also be more efficient and more compatible with compile-time type checking. I am also suspicious of unnecessary reliance on token types because I am under the impression that they abstract away memory management.
I believe that under the hood, C++ uses a macro-style approach like this one while Java's "boxing" is more like this "wrap in a token" idea, and I have heard that C++ and Java faced the considerations that I described here.
I should elaborate on why the claim of efficiency and type checking is relevant here since they only seem to matter if one actually does operations on values that have the generic type. Such operations could be provided as macros or as function pointers. Besides, if you are storing values of a generic type in an array without necessarily doing any operations on them, it might help to know how many bits they have so that you don't have to treat everything as a pointer.
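For illustration, here is one common variant of the macro approach: instead of `#define T` / `#undef T` around an `#include`, token pasting stamps out a fresh, statically typed copy of the code per type. This is a sketch of the general C technique, not generated LF code.

```c
// Macro-style monomorphization: each instantiation generates a distinct,
// compile-time-type-checked struct and push function. The cost, as noted
// above, is duplicated code for every instantiated type.
#define DEFINE_QUEUE(T)                                     \
    typedef struct { T items[8]; int size; } queue_##T;     \
    static void queue_##T##_push(queue_##T* q, T v) {       \
        q->items[q->size++] = v;                            \
    }

DEFINE_QUEUE(int)     /* generates queue_int and queue_int_push */
DEFINE_QUEUE(double)  /* generates queue_double and queue_double_push */
```

Note that operations on `T` inside the macro body (here just assignment) are fully type-checked at each instantiation, which is the efficiency/type-checking advantage described above.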
We now do have generics in the C target. Maybe we should revisit @petervdonovan's component-based suggestion vis-a-vis the "policy" argument of actions that we currently have. I agree that building a standard library of reactors sounds attractive. We are inching closer to a first release of `lingo` and could next start to focus on its package management functionality... Suppose we discontinue the "policy" argument, would we then still want to support "minimal spacing"? And should it then just use "replace"?
This discussion has diverged from its original purpose somewhat, and I have not thought about this for a while. However, I think I agree that replacing these special syntax features with library reactors seems like a reasonable thing to put on the roadmap for the long term.
But in the very near term I am not sure I am ready for the bikeshedding that any syntax removals would likely entail.
Reading this thread, however, it does seems that everybody agrees that "replace" should be the default (and not "defer").
I would like to ask for some clarification on this issue. The original issue, to my understanding, was about scheduling an action at the same time but a different microstep and how that should be handled. But it appears that this has morphed somewhat in the discussion into the default policy for an action as a whole, unless I missed/misread something, rather than the handling of scheduling at the same logical time. Is the current proposal that replace would become the default for the same tag? Or for the action as a whole, so that the last scheduled event is the one to run next?

IMO, if replace applies as a whole and not just at the same logical time, I feel like devs may be a bit blindsided by dropped data for queued events. I believe that dropping data should be something the developer has to explicitly opt into. It is easier to see that something is running slow than to figure out why certain actions do not trigger all the time, depending on how the drop happens.
Another thought: if the default policy could change while still keeping the syntax the same, would it be appropriate to add a warning for when a dev does not explicitly define which policy is being used, stating the current policy and suggesting that they set an explicit one?
To be clear, in this thread, "replace" is (mostly) used to talk about what happens when two calls to `lf_schedule` are requesting a future event to occur at exactly the same tag. This is not really the same thing as the policy used to handle `min_spacing`, which can be either "defer" (the default), "drop", or "replace". These policies kick in when `lf_schedule` wants to schedule an event with a tag that is too close (in time) to a previously scheduled event for the same action.
I think the consensus above is about what to do when the future tags are identical, rather than what to do when they are too closely spaced. @lhstrh suggests that the consensus in this case is to replace, as we do when a reactor writes to an output port twice at the same tag. I think this answer is defensible, but it is not what the C target currently does. It is not a high priority for me to change the current behavior since it is fairly easy for a programmer to avoid this situation.
When a `min_spacing` is specified, the situation is much more complicated. First, if `min_spacing` means that no two events can be closer (in time, ignoring microsteps) than some positive number, then this is really quite expensive and complicated to implement. In principle, it would require searching the entire event queue for events for the same action and determining whether any of them is too close to the one being proposed. If one is too close, then it's not clear what "replace" should mean. Should the time of the remaining event be that of the previously scheduled one or that of the newly scheduled one? Also, which one should be replaced if there are two that are too close? If the replacement is at the newly proposed time, then it will still be too close to one of them, so the replacement would have to be at the time of the previously scheduled events. This is really ugly.
What I've implemented is much simpler and will prove useful if `lf_schedule` is used to schedule monotonically increasing events (in time). Specifically, when `lf_schedule` is called on an action with intended time $t_2$, and the most recent previous call to `lf_schedule` scheduled an event at time $t_1$, and $t_2 - t_1 <$ `min_spacing`, then if the policy is "replace", the payload of the event at time $t_1$ will be replaced with the new payload. Its time will remain at $t_1$.
An alternative might be for the event at time $t_1$ to be deleted and the new event scheduled at $t_2$. However, this can lead to a situation where no event ever gets processed. Suppose the next call to `lf_schedule` occurs at $t_3$ and $t_3$ is too close to $t_2$. Then the previous event will be replaced by a new one at $t_3$. But that one might be replaced by one at $t_4$, and such replacements could go on indefinitely without any event ever getting processed. I seriously doubt any application designer would want this.
My impression is that we might want to give users better feedback and control over the policy. I think @SheaFixstars has a good point about blindsiding devs by just dropping data. Maybe it would be a good idea to make the default policy "drop" and have `lf_schedule` return a boolean value that indicates whether the event was scheduled or not. By also adding additional API methods for deleting, overwriting, or deferring events, we could leave it up to the user to decide how to handle conflicting events and to implement their own policy.
This is not so much a bug report as a question about whether the semantics as currently designed is what we really want.
We currently allow scheduled actions to pile up in the microstep dimension instead of having a "last write wins" policy like what we have for ports. The question is about whether this asymmetry between ports and actions is really necessary.
The purpose of microsteps might be
So, how useful is use case 1? Can we dispense with it?
Here is why I ask. The piling up of events in the microstep dimension can make it difficult to predict how different scheduled actions will align with each other. The logical time of the scheduled event now depends not only on the current time, the min delay, and the additional delay, but the state of the event queue as well. Furthermore, an approach to determining maximum rates at which logical actions are present that is based on the set of times when they can be present becomes more complicated: If you do not know which/how many microsteps there are when an action can be present, the number of times the action can be present in a time interval is unbounded. I will try to follow up later with more details on the types of conclusions that you cannot draw because of this, but it is a work in progress...
Example program:
Actual output:
Alternative possible output: