petervdonovan opened 1 year ago
This is a very good point. I would support switching to a last writer wins semantics. There are some subtleties, though:
Does "last writer wins" mean that both tags have to match? Or that only $g_2$ has to match?
I would want to require that only $g_2$ matches, but maybe that is just because I am coming in with a preconceived notion based on discussion #1307 about how things "should" work in order for the way I think about programs to work. Furthermore, if both tags have to match, then that does not solve the problem that "The logical time of the scheduled event now depends not only on the current time, the min delay, and the additional delay, but the state of the event queue as well."
With physical actions, it is not clear what this would mean because there is no well-defined $g_1$ when they are asynchronously scheduled. So we will likely be introducing an asymmetry between physical and logical actions.
It sounds like you are assuming that we would care about whether $g_1$ matches? In any case, there are ways to ensure that physical actions do not occur at the same time and cannot override each other, e.g. by requiring that readings of the physical clock be strictly increasing.
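To make the strictly-increasing-clock idea concrete, here is a minimal C sketch. The names are illustrative, not the actual runtime API: the point is only that bumping any non-increasing reading guarantees that two asynchronously scheduled physical actions can never be assigned the same tag.

```c
#include <stdint.h>

// Illustrative sketch: force physical-clock readings to be strictly
// increasing so that asynchronously scheduled physical actions never
// collide at the same tag. (Not thread-safe; a real runtime would need
// atomics or a lock around last_reading.)
static int64_t last_reading = 0;

int64_t strictly_increasing_read(int64_t raw) {
    if (raw <= last_reading) {
        raw = last_reading + 1;  // bump past the previous reading
    }
    last_reading = raw;
    return raw;
}
```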
Sounds rather like the "replace" policy described here: https://www.lf-lang.org/docs/handbook/actions?target=c#action-declaration Perhaps the desired behavior could be obtained with
logical action a(0, 0, "replace")
I would like to add to the discussion that the semantics described by @petervdonovan are exactly the ones implemented in the C++ target. There is an old issue stating that the policy in C++ should be updated to the one used in C: #236. Also note the discussions in this issue (they diverge in a different direction though).
The issue is not fixed yet for two reasons. First, I personally was never really convinced that the current C semantics is a sane default and, second, I did not yet encounter a use-case where it actually mattered.
I would be all for using the replace policy described here as the default. It seems more intuitive and can be easily predicted. By this I mean that if I call schedule at tag $(t, n)$ with delay $d$, then I know precisely that the new event is scheduled at tag $(t+d, 0)$. With the current C policy, it could be any tag $(t+d, m)$, and I cannot make any prediction about the value of $m$ because I don't know which other reactions might already have scheduled the action at $t+d$.
While writing this I also thought of an actual use case that is not possible to implement using the current C semantics. Imagine that we want to "unschedule" an action. I think there is an issue somewhere saying that it would be a great feature to be able to actually delete events from the event queue. Here, however, I mean overwriting an event's value (e.g., with nullptr) and thus marking it as invalid to the reactions triggered by it.
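A minimal C sketch of this "unschedule by overwriting" idea, under a last-writer-wins policy. The struct and function names here are invented for illustration; they are not part of the C runtime.

```c
#include <stddef.h>
#include <stdbool.h>

// Hypothetical sketch: under a replace policy, scheduling a NULL payload
// at the same intended tag overwrites the pending event's value, and
// downstream reactions treat a NULL value as "cancelled".
typedef struct {
    bool pending;
    void* value;
} pending_event_t;

void schedule_replace(pending_event_t* e, void* value) {
    e->pending = true;  // replace policy: the last writer wins
    e->value = value;   // a NULL value marks the event as invalid
}

bool is_cancelled(const pending_event_t* e) {
    return e->pending && e->value == NULL;
}
```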
I'll try to summarize what I understood from what Edward has said and from my conversation with @lhstrh.
It seems like we have three options on the table for how to assign microsteps to events/messages scheduled into the future: elementwise addition of logical times (also discussed in the issue that Christian linked #203 #236), the addition described here, and the queueing that is currently implemented for the C target.
I think I've underappreciated the usefulness of queueing since of the three, it seems the most connected to the established work (which I should spend more time reading). Here are some observations that might affect how the three options might be evaluated:
elementwise addition | weird addition | queueing
---|---|---
abelian group structure; Z-module | not even a group, but there is a comprehensible algebraic structure | no algebraic structure*
times when objects can be present are unions of parallel affine subspaces*** | times when objects can be present are predictable, but an elegant geometric characterization of them is not currently known | times when objects are present are predictable in their first element, but probably not in the second element (the microstep), except in special cases
most techniques discussed in #1307 apply | most techniques discussed in #1307 do not apply | 
analyses used in dataflow might be relatively difficult | analyses used in dataflow might be applicable to a useful subset of LF | 
messages/events get dropped sometimes | events get dropped all the time | events only get dropped when multiple reactions write to the same port
scheduling of logical actions is a generalization of message dropping that occurs when multiple reactions write to the same port | scheduling of logical actions is a departure from what happens when multiple reactions write to the same port** | 
events scheduled at the same time with the same delay will be simultaneous | events scheduled at the same time with the same delay might or might not be simultaneous | 
microsteps diverge in non-pathological programs that use microsteps | microsteps diverge if there is a Zeno condition | 
uncertainty about logical times is nondecreasing as one follows execution paths | uncertainty about the first element of the logical times is nondecreasing as one follows an execution path, but uncertainty about microsteps can increase or decrease | 
* "No algebraic structure" in this context means that the time at which an event is scheduled does not just depend on the current logical time and the delay with which the event is scheduled. It also depends on the state of the event queue. This dependency on global state makes it complex to analyze in the same ways as the other two approaches.
** We could smooth over this inconsistency, I suppose, by adopting a queueing policy when reactions write to the same port.
*** EDIT: I guess I should have written "submodules." Of course we know that events will not be present before `startup`, for example, but when I write "can" or "might" be present, I am talking about an overapproximation.
I'm having trouble figuring out what the three options that you mention actually are. Can you clarify? Issue #203 does not seem relevant.
> Issue https://github.com/lf-lang/lingua-franca/issues/203 does not seem relevant.
Oops -- fixed.
> Can you clarify?
Suppose a reaction executes at tag $g = (t, m)$ and schedules an event $(5 \text{ms}, 0)$ into the future. (This is how I interpret after delays -- the $0$ microstep is implicit). Suppose also that the event is already scheduled at time $(t + 5 \text{ms}, 0)$.
If a reaction instead schedules an event $(0, 1)$ into the future (the current `after 0` semantics), then the "elementwise addition" option behaves the same as before, the "weird addition" option behaves like "elementwise addition" because the first element of $(0, 1)$ is zero, and the "queueing" option behaves the same as before, incrementing the microstep as needed to avoid dropping or replacing an event.
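The first two options are pure tag arithmetic, so they can be pinned down in a few lines of C. This is a sketch under my own naming assumptions (`tag_t` here is just an illustrative struct); queueing is not shown because its result depends on the state of the event queue, which is exactly the objection raised above.

```c
#include <stdint.h>

// A logical tag (t, m): logical time plus microstep.
typedef struct {
    int64_t time;
    uint32_t microstep;
} tag_t;

// Elementwise addition: (t, m) + (d, k) = (t + d, m + k).
tag_t add_elementwise(tag_t g, int64_t d, uint32_t k) {
    return (tag_t){ g.time + d, g.microstep + k };
}

// "Weird" addition: a positive time delay resets the microstep to the
// delay's own microstep; a zero time delay adds microsteps.
tag_t add_weird(tag_t g, int64_t d, uint32_t k) {
    if (d > 0) return (tag_t){ g.time + d, k };
    return (tag_t){ g.time, g.microstep + k };
}
```

For a reaction at $(t, m)$ with $m > 0$, scheduling $(5\,\text{ms}, 0)$ lands at $(t + 5\,\text{ms}, m)$ under elementwise addition but at $(t + 5\,\text{ms}, 0)$ under weird addition; for $(0, 1)$ the two agree.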
I am convinced we should do element-wise addition. Volunteer to fix it in the C target? I'm not so sure whether we should do queueing or replacement when there is a collision. I guess the policies we have for actions could specify this.
Okay, that sounds reasonable. If we do elementwise addition then that would mean going "all in" on super dense time, in which case I would argue that queueing in the microstep dimension would defeat the purpose (of microstep predictability). Just to play devil's advocate, elementwise addition (even with the "drop" or "replace" policy which I like) does have downsides that the weird addition (and maybe queueing) do not have:
Ok, I think the proposal on the table is this:
If at tag (t, n) we schedule an event with delay d > 0, the intended tag of the event is (t+d, n). If d = 0, the intended tag is (t, n+1).
If an event has previously been scheduled at the intended tag, then it will be replaced.
I like the simplicity of this. It means we could remove the replacement policy from the LF syntax. If we really need it (I don't have use cases), then we could provide a runtime API function get_event(action, time, microstep) that returns NULL if there is no event on the event queue with the specified tag and the event otherwise.
Note that with physical actions, this simpler policy does not incur a risk of missing events because the physical clock is strictly increasing (at least in the C target).
Should we go ahead with this?
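For concreteness, here is a C sketch of the proposal as I read it. The data structures are illustrative (the queue is just an array), not the actual runtime's; the point is the two-case tag rule plus replace-on-collision.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct { int64_t time; uint32_t microstep; } tag_t;
typedef struct { tag_t tag; int payload; } event_t;

// Proposed rule: d > 0 -> (t + d, n); d == 0 -> (t, n + 1).
tag_t intended_tag(tag_t now, int64_t d) {
    if (d > 0) return (tag_t){ now.time + d, now.microstep };
    return (tag_t){ now.time, now.microstep + 1 };
}

// If an event already exists at the intended tag, replace its payload.
// Returns true if an existing event was replaced.
bool schedule_or_replace(event_t* queue, size_t n, tag_t now, int64_t d,
                         int payload) {
    tag_t g = intended_tag(now, d);
    for (size_t i = 0; i < n; i++) {
        if (queue[i].tag.time == g.time &&
            queue[i].tag.microstep == g.microstep) {
            queue[i].payload = payload;  // last writer wins
            return true;
        }
    }
    return false;  // appending a new event is omitted in this sketch
}
```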
I'm confused about the very first post in this thread. In the example:
reaction(startup) -> a, b {=
// Based on this reaction body, it looks like a and b will be logically simultaneous,
// but in fact a will be processed a microstep later. The user must know the state
// of the event queue in order to be aware of this misalignment.
lf_schedule_int(a, 0, 1);
lf_schedule_int(b, 0, 1);
=}
... it is stated that "it looks like a and b will be logically simultaneous, but in fact a will be processed a microstep later". Why is that? I would think that since `a` and `b` are different actions, their scheduling behaviour is totally independent of each other. [Related discussion on Zulip]
@petervdonovan, @edwardalee For me it is difficult to understand why you prefer the "elementwise addition" over the "weird addition". I have argued against the "queueing" in the past and I am fully on board with dropping it. However, to me the "elementwise addition" seems weird and I still find the "weird addition" (as you call it), most intuitive. To me it seems really strange to require that microsteps are monotonic. I will try to explain below why.
In summary, I think there is a lot that we would lose by using the "elementwise addition", but I could not extract from the discussion above what we would actually gain (other than that it is mathematically nicer to express). Could you maybe explain the reasoning behind this and the practical impact of using elementwise addition?
Yes, Christian and I seem to be on the same page about items 3 and 4, and items 2 and 5 both seem like examples where you sort of diverge along different subspaces (span of (1, 4) vs span of (1, 5), and span of (1, 0) vs. something else, respectively), which I agree might not be good.
I think a benefit of elementwise addition that is not merely aesthetic is that fewer events get dropped, since if you schedule something with the same delay into the future from two different microsteps, then they will not collide with each other. But I am not so sure that this is so beneficial, because it will still be possible to drop events in other cases, and because in many cases it is desirable to drop events (in a predictable way) to avoid overutilization.
To Christian's item 1, I agree, it is possible to imagine that it would be most intuitive for microsteps to start at zero. In the benchmarks, and perhaps even in non-pathological LF programs, it might be useful to support iterative programs. To do that you must iterate in the microstep dimension; in that case, the microstep is like the loop variable in a `for` loop, and it would be odd for it to start at a different place each time. The timely dataflow paper that was discussed on Zulip the other day places a lot of emphasis on this perspective -- although for it to really work, they need many dimensions in their microstep (so that a dimension can be used specifically for the current loop). I think this makes it possible to enforce strong consistency in the presence of iteration, which to my knowledge we cannot do, and which is related to the zero-delay federate cycles that we talked about before.
Is the potential dropping of events a real problem? I think it is safe to say that we have tested the "weird addition" for 2 years now, and so far I have not encountered a practical example where the dropping of events happened unintentionally. As described elsewhere, I actually consider the dropping a feature that allows one to modify and/or "unschedule" events (by overwriting them with an invalid value).
It seems to me like the proposed solution solves a rather academic problem at a relatively high cost.
Regarding item 1, the use of microsteps in languages like VHDL, I see this as much more like our "levels" than like our microsteps. And indeed, levels do reset to zero at every stage of time advance.
The Newton's cradle example, however, is a good one and indeed it could be the killer example that justifies the "weird addition." I would be much more comfortable seeing what happens when you actually build an LF model of this. Is there a way to avoid the divergence of microsteps even without the weird addition? I doubt it.
Any volunteer to reduce this example to a concrete LF test case?
Here's an attempt at summarizing what we talked about during our meeting.
- Use `replace` as the default policy (instead of `defer` or `drop`). It is the same kind of "last write wins" policy that applies to ports.
- Letting `replace` be the default policy for scheduling actions suggests that the implementation of `after` delays should also conform to it.

Existing syntax: `<origin> action <name>(<offset>, <spacing>, <policy>)` (e.g., `logical action a(0, 1ms, "defer")`).

Proposed syntax: `<modifier> <origin> <name>(<offset>, <spacing>)` (e.g., `deferrable logical action a(0, 1ms)`).

The idea is that when an action is "deferrable", either this directly implies use of the "defer" policy (equivalent to the old syntax), or it means that the programmer may choose to override the default "replace" behavior and defer on a case-by-case basis for each call to `schedule`. The latter would offer a bit more flexibility, but it also gives rise to an error condition that needs to be checked at runtime (i.e., what happens when the programmer attempts to use the "defer" mechanism on an action that is not marked as deferrable).
Regarding the question of how to expose the different policies (defer/drop/replace): I still think they might best be implemented in standard library reactors, because I think this approach provides
Here is an example:
target C { build-type: Debug }
/**
* Funnel messages from many channels into a single channel using the microstep dimension.
*/
reactor MessageFunnel(fan_in: int(2), buffer_size: int(20)) {
preamble {=
// FIXME: Must be kept in sync with buffer_size
#define BUFFER_SIZE 20
typedef int buffer[BUFFER_SIZE];
=}
input[fan_in] in: int
output out: int
state pending: buffer // Hardcoded buffer size :(
state queue_start: int(0)
state size: int(0)
logical action try_again
initial mode receiving {
reaction(in) -> out, reset(emptying_buffer) {=
int i = 0;
while (i < self->fan_in) {
if (in[i]->is_present) {
lf_set(out, in[i++]->value);
break;
}
i++;
}
if (enqueue_inputs(in, i, self->fan_in)) lf_set_mode(emptying_buffer);
=}
}
mode emptying_buffer {
logical action t
reaction(reset, t) -> t {= lf_schedule(t, 0); =}
reaction(in) {=
enqueue_inputs(in, 0, self->fan_in);
=}
reaction(reset, t) -> out, reset(receiving) {=
lf_set(out, self->pending[self->queue_start++]);
self->queue_start %= self->buffer_size;
self->size--;
if (!self->size) lf_set_mode(receiving);
=}
}
method enqueue_inputs(inputs: messagefunnel_in_t**, start: int, end: int): bool {=
bool enqueued = false;
for (int i = start; i < end; i++) {
if (inputs[i]->is_present) {
enqueued = true;
enqueue(inputs[i]->value);
}
}
return enqueued;
=}
method enqueue(value: int) {=
if (self->size == self->buffer_size) {
lf_print_error_and_exit("Buffer overflow in MessageFunnel.");
}
self->pending[(self->queue_start + self->size++) % self->buffer_size] = value;
=}
}
reactor Stdout {
input in: int
reaction (in) {=
lf_print("%d", in->value);
=}
}
reactor Count(bank_index: int(0), stop: int(3), step: int(1)) {
output out: int
initial mode active {
logical action a
state count: int(bank_index)
reaction(startup, a) -> a {= lf_schedule(a, 0); =}
reaction(a) -> out, reset(dead) {=
lf_print("Sending %d", self->count);
lf_set(out, self->count);
self->count += self->step;
if (self->count >= self->stop) lf_set_mode(dead);
=}
}
mode dead { /* GC ME! */ }
}
main reactor {
counts = new[2] Count(stop(10), step(2))
funnel = new MessageFunnel(fan_in(2))
stdout = new Stdout()
counts.out -> funnel.in
funnel.out -> stdout.in
}
Here is the output:
---- Start execution at time Tue Dec 20 14:52:37 2022
---- plus 73280136 nanoseconds.
---- Using 2 workers.
Sending 1
Sending 0
0
Sending 2
Sending 3
1
Sending 5
Sending 4
2
Sending 7
Sending 6
3
Sending 9
Sending 8
4
5
6
7
8
9
---- Elapsed logical time (in nsec): 0
---- Elapsed physical time (in nsec): 731,658
Nice! This could be generalized using tokens and the vector object you created so that the same code could be used for any data type. However, we have no support yet for polymorphic types in the C target. I've been thinking that we could have a `token` datatype and that any type would be supported, with primitive types getting automatically wrapped in a token.
Another approach would be to use macros, like
#define T int
// Code-generate the type of the self struct, the reaction functions, etc.
#undef T
Doing it this way increases code size because for every type that the generic reactor is instantiated with, you must redefine the generic type `T` and include the code corresponding to the reactor definition. However, this macro-style approach can also be more efficient and more compatible with compile-time type checking. I am also suspicious of unnecessary reliance on token types because I am under the impression that they abstract away memory management.
I believe that under the hood, C++ uses a macro-style approach like this one while Java's "boxing" is more like this "wrap in a token" idea, and I have heard that C++ and Java faced the considerations that I described here.
I should elaborate on why the claim of efficiency and type checking is relevant here since they only seem to matter if one actually does operations on values that have the generic type. Such operations could be provided as macros or as function pointers. Besides, if you are storing values of a generic type in an array without necessarily doing any operations on them, it might help to know how many bits they have so that you don't have to treat everything as a pointer.
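For illustration, here is one common variant of the macro approach: instead of `#define T` / `#undef T` around an `#include`, token pasting stamps out a fresh, statically typed copy of the code per type. This is a sketch of the general C technique, not generated LF code.

```c
// Macro-style monomorphization: each instantiation generates a distinct,
// compile-time-type-checked struct and push function. The cost, as noted
// above, is duplicated code for every instantiated type.
#define DEFINE_QUEUE(T)                                     \
    typedef struct { T items[8]; int size; } queue_##T;     \
    static void queue_##T##_push(queue_##T* q, T v) {       \
        q->items[q->size++] = v;                            \
    }

DEFINE_QUEUE(int)     /* generates queue_int and queue_int_push */
DEFINE_QUEUE(double)  /* generates queue_double and queue_double_push */
```

Note that operations on `T` inside the macro body (here just assignment) are fully type-checked at each instantiation, which is the efficiency/type-checking advantage described above.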
We now do have generics in the C target. Maybe we should revisit @petervdonovan's component-based suggestion vis-a-vis the "policy" argument of actions that we currently have. I agree that building a standard library of reactors sounds attractive. We are inching closer to a first release of `lingo` and could next start to focus on its package management functionality... Suppose we discontinue the "policy" argument, would we then still want to support "minimal spacing"? And should it then just use "replace"?
This discussion has diverged from its original purpose somewhat, and I have not thought about this for a while. However, I think I agree that replacing these special syntax features with library reactors seems like a reasonable thing to put on the roadmap for the long term.
But in the very near term I am not sure I am ready for the bikeshedding that any syntax removals would likely entail.
Reading this thread, however, it does seems that everybody agrees that "replace" should be the default (and not "defer").
I would like to ask for some clarification on this issue. The original issue, to my understanding, was about scheduling an action at the same time but a different microstep and how that should be handled. But it appears that this has morphed somewhat in the discussion into the default policy for an action as a whole, unless I missed/misread something, rather than the handling of scheduling at the same logical time. Is the current proposal that replace would become the default for the same tag? Or for the action as a whole, so that the last scheduled event is the one to run next?

IMO, if replace applies as a whole and not just at the same logical time, I feel like devs may be a bit blindsided by dropped data for queued events. I believe that dropping data should be something the developer has to explicitly opt into. It is easier to see that something is running slow than to figure out why certain actions do not trigger all the time, depending on how the drop happens.
Another thought: if the default policy could change while still keeping the syntax the same, would it be appropriate to add a warning for when a dev does not explicitly define which policy is being used, stating the current policy and suggesting that they set an explicit one?
To be clear, in this thread, "replace" is (mostly) used to talk about what happens when two calls to `lf_schedule` are requesting a future event to occur at exactly the same tag. This is not really the same thing as the policy used to handle `min_spacing`, which can be either "defer" (the default), "drop", or "replace". These policies kick in when `lf_schedule` wants to schedule an event with a tag that is too close (in time) to a previously scheduled event for the same action.
I think the consensus above is about what to do when the future tags are identical, rather than what to do when they are too closely spaced. @lhstrh suggests that the consensus in this case is to replace, as we do when a reactor writes to an output port twice at the same tag. I think this answer is defensible, but it is not what the C target currently does. It is not a high priority for me to change the current behavior since it is fairly easy for a programmer to avoid this situation.
When a `min_spacing` is specified, the situation is much more complicated. First, if `min_spacing` means that no two events can be closer (in time, ignoring microsteps) than some positive number, then this is really quite expensive and complicated to implement. In principle, it would require searching the entire event queue for events for the same action and determining whether any of them is too close to the one being proposed. If one is too close, then it's not clear what "replace" should mean. Should the time of the remaining event be that of the previously scheduled one or that of the newly scheduled one? Also, which one should be replaced if there are two that are too close? If the replacement is at the newly proposed time, then it will still be too close to one of them, so the replacement would have to be at the time of the previously scheduled events. This is really ugly.
What I've implemented is much simpler and will prove useful if `lf_schedule` is used to schedule monotonically increasing events (in time). Specifically, when `lf_schedule` is called on an action with intended time $t_2$, and the most recent previous call to `lf_schedule` scheduled an event at time $t_1$, and $t_2 - t_1 <$ `min_spacing`, then if the policy is "replace", the payload of the event at time $t_1$ will be replaced with the new payload. Its time will remain at $t_1$.
An alternative might be for the event at time $t_1$ to be deleted and the new event scheduled at $t_2$. However, this can lead to a situation where no event ever gets processed. Suppose the next call to `lf_schedule` occurs at $t_3$ and $t_3$ is too close to $t_2$. Then the previous event will be replaced by a new one at $t_3$. But that one might be replaced by one at $t_4$, and such replacements could go on indefinitely without any event ever getting processed. I seriously doubt any application designer would want this.
My impression is that we might want to give users better feedback and control over the policy. I think @SheaFixstars has a good point about blindsiding devs by just dropping data. Maybe it would be a good idea to make the default policy "drop" and have `lf_schedule` return a boolean value that indicates whether the event was scheduled or not. By also adding additional API methods for deleting, overwriting, or deferring events, we could leave it up to the user to decide how to handle conflicting events and to implement their own policy.
This is not so much a bug report as a question about whether the semantics as currently designed is what we really want.
We currently allow scheduled actions to pile up in the microstep dimension instead of having a "last write wins" policy like what we have for ports. The question is about whether this asymmetry between ports and actions is really necessary.
The purpose of microsteps might be
So, how useful is use case 1? Can we dispense with it?
Here is why I ask. The piling up of events in the microstep dimension can make it difficult to predict how different scheduled actions will align with each other. The logical time of the scheduled event now depends not only on the current time, the min delay, and the additional delay, but the state of the event queue as well. Furthermore, an approach to determining maximum rates at which logical actions are present that is based on the set of times when they can be present becomes more complicated: If you do not know which/how many microsteps there are when an action can be present, the number of times the action can be present in a time interval is unbounded. I will try to follow up later with more details on the types of conclusions that you cannot draw because of this, but it is a work in progress...
Example program:
Actual output:
Alternative possible output: