Physical actions and delays

cmnrd commented 4 years ago

With #129 the meaning of a delay (or offset) on a physical action changed. Let T denote the current physical time and t the current logical time when the action is scheduled. Let d denote the offset passed to schedule. The previous behavior was to insert a new event with tag T + d (I am omitting the microsteps here). The new behavior implemented in #129 changes this to max(t + d, T).

It occurred to me, that this change has some consequences that at least are unexpected and were originally not carefully considered. See the following examples:

Consider for example, a simple system consisting of a button, a physical action that is scheduled when the button is pressed and a reaction to that action. Pressing the button should not trigger an immediate reaction, but the reaction should be triggered with a 2 second delay (for instance to switch on an LED). How can we build this? The obvious solution would be to simply schedule the physical action with a 2 second delay. But this has not the expected outcome. For all we know, T - t > 2 secs could be true. Then the physical action would be scheduled at max(t + 2 secs, T) = T. However, the 'intend' was to schedule at T + 2 secs. This way of building the system clearly does not work. To model the system correctly, we would need to use a physical action to handle the button press and a logical action to implement the 2 second delay. Is this really what we want?
Consider a distributed system with decentralized control such as the aircraft door example. Imagine, the two buttons in the cockpit are each part of separate federates and both schedule a physical action with a delay of 2 seconds. Let's assume button 1 is pressed first at T1 and since T1 - t1 < 2 secs the action is scheduled with tag t1 + 2 secs. Later, button 2 is pressed at physical time T2 (T1 < T2) and the logical action is scheduled with tag t2 + 2 secs. Can we establish the correct order between the two events? I don't think so. For all we know t2 could be less than t1.
There are such anomalies even for non-federated executions. Consider, for instance, the single button example with a 2 second delay. Apart from the physical action, there are no other actions or timers in the system. This means that logical time only advances after a button is pressed. Let's assume the button is pressed once and a new event with tag t1 + 2 secs is inserted. Shortly after (< 2 secs), the button is pressed again. This will schedule a new event with tag t2 + 2 secs. However, there is no guarantee that logical time advances between the two button presses. In fact, in this example t1 is equal to t2 as the scheduler will wait for physical time to reach t1 + 2 secs before it advances logical time. So the first button press would be masked by the second, and the corresponding reaction would only be triggered once. Only when the second button is pressed with a delay of more than 2 seconds, the reaction will be triggered a second time. We could consider this a feature, but as soon as we add a periodic timer or other sources of events to the system we lose this property. We don't know when exactly logical time advances and thus, it becomes random whether 2 subsequent button presses trigger a single reaction or two reactions.

The reasoning in #129 was, that there is no use-case for an offset on a physical action. None of our tests used this feature. The new behavior, however, allowed a simplified insertion of events received over the network in a distributed execution.

I think that specifying a delay on a physical action actually has a valid use-case. It is correct that none of our tests use it, but our tests also do not model real systems that interact with the environment. I think it is reasonable to require that some reaction should be triggered with a delay after some physical event occurred. For instance, we could require, that an LED is switched on 2 seconds after a button was pressed.

How can we resolve this? I see 3 possible solutions:

Stick to what we have, but make it clear in the documentation, that offsets should not be used on physical actions by normal users (we could even forbid it in the language).
As pointed out by @edwardalee we could reimplement the federated communication to use some other mechanism than physical actions to insert events received over the network. Then we could either:
1. revert to the previous behavior
2. forbid offsets on physical actions altogether

Soroosh129 commented 4 years ago

For instance, we could require, that an LED is switched on 2 seconds after a button was pressed.

This is currently not doable unless you have an infinitely fast processor. Assigning T + 2 secs to an event does not mean you are imposing a 2 seconds delay after T, or after the press of the button. The outcome depends on the value of t (current logical time) at the time of the press of the button. It will take, say α seconds (α >= T - t) for the LF program to reach T and another β seconds (β >= (T + 2) - T) for the LF program to reach T + 2. Notice that α + β >= (T + 2) - t. The idea of t+d can be summarized as changing this equation to α + β >= ((t+d) - t) = d with a big enough d. Notice that contrary to your original issue, the lower bound on the actual time it takes to process an event becomes independent of t or T, and that can be interpreted as predictability.

lsk567 commented 4 years ago

The issue in example 3 can be addressed by minimum spacing (or minimum inter-arrival time, MIT), first mentioned in #98 . To prevent one event from being replaced by another, the user can specify a reasonable minimum spacing with the defer policy. All events will then be preserved.

I want to quickly recap the definitions of logical and physical actions. The distinction between logical and physical actions is that logical actions are synchronous (scheduled by reactions at a logical instant) and physical actions are asynchronous (scheduled at an arbitrary point in the physical time). #129 makes physical actions behave more deterministically when t + d > T.

If I can summarize the first two problems you mentioned above, they are:

There is a use case for adding delays to the physical time.
In Ptides, due to the independent advancement of logical timelines, there is uncertainty in determining the correct order of two events in different federates.

I think these problems can be addressed by allowing the user to choose how they want to timestamp physical actions, either t+d or T+d, instead of offering one interpretation max(t+d, T). It seems to me that the use cases are application-specific and the problem originates from merging two ways of timestamping physical actions by taking the max.

Then back to example 1 and 2. The problems can be resolved by setting the "delay policy" to be T + d. Example 1 then has a way to realize this use case. Example 2 will have a deterministic order (assuming clocks in federates are synchronized).

To do this in the LF syntax. We can extend the physical action declaration from

physical action name(min_delay, min_spacing, policy):type;

to

physical action name(min_delay, delay_policy, min_spacing, spacing_policy):type;

By setting the delay_policy, the user can specify whether they want t + d or T + d.

cmnrd commented 4 years ago

This is currently not doable unless you have an infinitely fast processor.

Sure, but this is not my point. I am aware that executing a reaction precisely at T + 2 secs is not possible. However, we do have the deadline to specify the maximum lag between logical and physical time.

Assigning T + 2 secs to an event does not mean you are imposing a 2 seconds delay after T, or after the press of the button. The outcome depends on the value of t (current logical time) at the time of the press of the button. It will take, say α seconds (α >= T - t) for the LF program to reach T and another β seconds (β >= (T + 2) - T) for the LF program to reach T + 2. Notice that α + β >= (T + 2) - t. The idea of t+d can be summarized as changing this equation to α + β >= ((t+d) - t) = d with a big enough d. Notice that contrary to your original issue, the lower bound on the actual time it takes to process an event becomes independent of t or T, and that can be interpreted as predictability.

This lower bound is a bound in logical time. It could be that it makes the distribution of events on the logical timeline more predictable, but my point is that it makes the physical behavior of the system unpredictable. Let me make a simple example. Imagine the button is pressed when t=100 msecs and T=2000 msecs. This means that a new event with tag t + d = 2100 msecs will be inserted, and this event could already be executed at T = 2100 msecs. The physical delay between pressing the button and switching the LED could be as low as 100 msecs. In fact, the precise physical delay is completely random, as we cannot make predictions on the precise value of t at the moment of pressing the button. All we know is t < T. So we could end up with any delay between 0 and the deadline (if given).

cmnrd commented 4 years ago

Yes, I am assuming d = 2 secs.

Yes I understand that. But this is also true with assigning T + d. The physical delay can be much smaller than d (it could be 0 if t=T and there are no events on the event queue).

Ok, so let's imagine t = T = 0 when the button is pressed. When we use T + d the new event would be scheduled with tag T + d = 2 secs. Since t < T (I am assuming realtime execution here), the scheduler has to wait at least 2 seconds before it can advance logical time and process the event.

Soroosh129 commented 4 years ago

I wrote a comment originally that was incorrect (which I have removed).

I see your point about imposing a lower bound delay on physical time, which cannot be achieved using t+d. However, your desired behavior can be replicated by not assigning a delay d to the physical action. Instead, on the triggering reaction, reschedule a logical action with a delay of 2 secs.

cmnrd commented 4 years ago

I see your point about imposing a lower bound delay on physical time, which cannot be achieved using t+d. However, your desired behavior can be replicated by not assigning a delay d to the physical action. Instead, on the triggering reaction, reschedule a logical action with a delay of 2 secs.

Right. I mentioned this also in my initial message above. However, I find this unintuitive, and I am not sure if this really is what we want.

I am sure that there are ways to address all of these problems in some way by working around them. But I would rather address the initial problem. @lsk567's suggestions also make sens, but I am afraid that more annotations complicate things instead of making it simpler. Let me try to rephrase the problem:

We have logical actions for scheduling events from a synchronous source (i.e. from within a reaction). This can be done with or without a delay without any issues. We also have physical actions for scheduling events from an asynchronous source (i.e. outside any reaction). I am not sure whether we really need delays for such asynchronous events, but I consider them handy. And, finally, we have a third use case that is actually both a logical and a physical action: Receiving synchronous events over the network in a decentralized federation. The hybrid nature in this scenario stems from the fact that a synchronous event (it already has a tag) is supposed to be integrated into the logical timeline after it was received asynchronously over the network. This can actually be achieved by scheduling a physical action when the message was received, process the message in a reaction to the physical action and then schedule a logical action with the correct tag to actually insert the received event. (This is the precise process I used in the AUTOSAR Paper). Now I understand that it is desirable to be able to directly insert the event received over the network with the correct tag without using an additional reaction. However, I don't think this mechanism should be a physical action, because what we are really trying to do is to schedule synchronous events. It is this dual use of physical actions that causes the weird behavior in my examples and that causes me a strong headache. I think we should go with what @edwardalee suggested and actually use a third mechanism for inserting synchronous events received over the network.

Soroosh129 commented 4 years ago

I think to expand on @lsk567's suggestion, we could add an origin field to physical actions. If a physical event that happens originates within the LF program, the origin can be internal where the initial t is relevant. If the events are external and thus truly random, then always T is assigned. Do you think that is something that can be intuitively explained?

edwardalee commented 4 years ago

I would like to plea for simplicity. Think about what the user-level documentation will become. I think the right solution is to revert physical actions to assign T+d and reimplement federated communication to not use physical actions.

Soroosh129 commented 4 years ago

I would like to plea for simplicity. Think about what the user-level documentation will become. I think the right solution is to revert physical actions to assign T+d and reimplement federated communication to not use physical actions.

I agree. But in addition to physical connections, current physical actions can also be really handy for GPU calls. Moreover, for a more abstract use-case, current physical actions are also useful for scenarios where a CPS such as an autonomous vehicle would like to measure the feedback of a certain action (such as a steering angle) on the environment (e.g., sensor readings). All these can be done with an alternative method, but I think it is important to consider all the useful use-cases before we change something.

cmnrd commented 4 years ago

Yes, I think we should also carefully consider other use cases that might require the current way of how physical actions work. If we need this functionality, we should think about how to properly integrate it in LF. However, I don't think that mechanism should be a physical action but rather something new. Maybe we need a dedicated syntax for describing asynchronous processes such as offloading some workload to an accelerator and reinserting the result in the logical time line (I was thinking about such a mechanism also to offload more complex computations to external threads in order to avoid stalling the execution of other reactions for too long).

edwardalee commented 4 years ago

Perhaps those external physical processes should simply be treated as federates.

cmnrd commented 4 years ago

In order to give us a better basis to work with and argue about, I implemented a few examples in C and C++ (where C uses the max(t+d, T) method and C++ uses T+d). You can find them in example/Physical Actions in the physical-action branch.

1. Example

The first example is the Button -> LED example I used above. The C++ implementation (ButtonWithDelay.lf) works as expected and always toggles the LED with a delay of about 2 seconds: Screenshot_20201029_150719

The naive C version is using an offset when scheduling the physical action (ButtonWithDelayNaive.lf). It toggles the LED with a nondeterministic delay: Screenshot_20201029_151124 Screenshot_20201029_151145

This can be avoided, however, by using a physical action without delay and a logical action to implement the delay (ButtonWithDelay.lf).

To my surprise, the example did not show the behavior I described under 3 in my first post. This is because the default behavior of the C runtime is to schedule the event at the next possible tag. Therefore, the reaction is always triggered twice when the button was pressed two times. Note, however, that the information about the physical delay between the two button presses is lost.

2. Example

The second example uses 2 buttons and determines which button was pressed first. Thereby, a delay of 2 seconds is used on both physical actions. This works reliably for C++ (TwoButtonsWithDelay.lf). The naive C version (TwoButtonsWithDelayNaive.lf), however, fails to determine the correct order: Screenshot_20201029_152110

Also here, a version with additional logical actions (TwoButtonsWithDelay.lf) works correctly.

3. Example

The third example mimics the behavior of asynchronous callbacks. Such a scenario was mentioned by @Soroosh129 as a viable use case for the current C semantics. The goal here is to perform a computation in an asynchronous process and then insert the result at a well-defined logical time. In this example the asynchronous computation takes about 100 msecs of physical time. The logical delay which should be used for reinserting the event is 200 msecs. This means that if the asynchronous computation is tarted at logical time t1, the result of this computation should be handled at logical time t2=t1 + 200 msecs.

The C++ implementation (AsyncCallback.lf) uses a physical action and a logical action to implement this behavior. The physical action is scheduled without a delay when the asynchronous process terminates. The reaction to the physical action determines the correct logical delay and schedules the logical action. This works reliably in C++: Screenshot_20201029_154354

The C implementation (AsyncCallback.lf) avoids the detour of the logical action and directly schedules the physical action with a delay of 200 msecs. This works similar to the C++ version, as long as there are no other scheduled events. However, as soon as we add another timer

 // a dummy timer and reaction just to keep logical time moving forward
timer t2(0, 33 msec);
reaction(t2) {==}

the behavior becomes unpredictable: Screenshot_20201029_153923

This clearly points at the issue that was already mentioned by @lhstrh, the logical time may change while the asynchronous process runs. However, I am not sure if I implemented the C version in the way it was intended by @Soroosh129. Could you have a look at it?

Soroosh129 commented 4 years ago

The correct way of calling schedule_int(result, MSECS(200), 42); for this example would be to do so in reaction(t) line 31, which is of course not asynchronous.

cmnrd commented 4 years ago

But how would you know the result of the asynchronous computation (42) in reaction(t)? This needs to be retrieved somehow from the asynchronous process via a physical action.

Soroosh129 commented 4 years ago

Yes I agree that it is not really what physical actions are meant for.

lhstrh commented 4 years ago

It’s the physical-action, not the physical-connection branch.

On Fri, Oct 30, 2020 at 8:14 PM Edward A. Lee notifications@github.com wrote:

I don't see example/Physical Action in this branch:

EALMAC:example eal$ pwd

/Users/eal/git/lingua-franca/example

EALMAC:example eal$ ls

Autoware HTTPServerReactor README.md

Billiard Intersection ReflexGame

Distributed MQTT ScatterGather

DistributedTS PowerTrain TrainDoor

HTTPSRequestReactor ProtocolBuffersC

EALMAC:example eal$ git branch

issue184

master

multiports

new_import

physical-connection

python

tracing_in_c

EALMAC:example eal$ git status

On branch physical-connection

Your branch is up to date with 'origin/physical-connection'.

Untracked files:

(use "git add ..." to include in what will be committed)

...

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/icyphy/lingua-franca/issues/227#issuecomment-719875426, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYD47DLPDL25OF4PICFPDDSNN6K7ANCNFSM4TAREVWA .

--

-- Marten Lohstroh, MSc. | Ph.D. Candidate University of California | 545Q Cory Hall Berkeley, CA 94720 | +1 510 282 9135

edwardalee commented 4 years ago

These examples are pretty compelling. So let me summarize what I think we want to do (Edited to reflect the comments below):

Revert physical actions so that the assigned tag is always (T+d,0), where d is the min delay + extra delay and T is the physical time at the receiver. If T+d == t, the logical time, then the tag will be (T+d,1) instead. FIXME: minimum spacing and could be (T+d,n).
For physical connections, whether the connection is within a federate or across federates, the assigned tag is (T+a,0), where a is the after value and T is the physical time at the receiver. If T+a == t, the logical time, then the tag will be (T+a,1) instead. This implies that a physical connection between federates need not carry a timestamp.
Ensure that if a physical action or physical connection exists in a federate (or in the program, for the unfederated case), we always have t <= T. FIXME: realtime reactors. That is, logical time cannot exceed physical time. This implies: a. "fast" is ignored. b. In a federated execution with decentralized control, for any federate with physical actions, incoming physical connections, or local physical connections, STP >= 0.
For federated execution with decentralized coordination, if an incoming message on a logical connection is tardy and there is no tardy handler on any reaction triggered by this input, then we have a critical failure and the federate should exit. If there is a tardy handler, then it should be invoked instead of the regular reaction. The validator should issue a warning if there is no tardy handler. The tardiness should be zero or greater, where zero indicates a tardiness of one or more microsteps.

This requires changing quite a lot of things. So let's be sure we got it right before proceeding.

Feel free to edit the above.

Soroosh129 commented 4 years ago

If I understand correctly, you want to let t on the receiver go past T to a certain point depending on the calculated STP. However, this might go against the intent of the delay in physical actions and physical connections. Imagine that schedule is called on a physical action at T with a delay d. The intent as @cmnrd put it is to impose a minimum delay of d on processing the event. If at the time of calling schedule, t = T + d, then the response will be immediate.

edwardalee commented 4 years ago

Ah, good point. Perhaps we should change it so if there is any physical action, then STP >= 0. What about physical connections? Should they also prohibit ahead-of-time computation?

Soroosh129 commented 4 years ago

I believe if we allow ahead-of-time computation for physical connections, the delay in after will only have a niche usage to impose a poor-man's fast mode. Maybe it will become as confusing as it was for physical actions? We can already impose the poor-man's fast mode by adjusting the STP offset to a negative value. I have a demo for this that I have not pushed yet.

edwardalee commented 4 years ago

Yes, I think that we should err on the side of simplicity and treat physical actions and physical connections as closely as we possibly can. In this case, the presence of either implies STP >= 0.

edwardalee commented 4 years ago

I have edited comment https://github.com/icyphy/lingua-franca/issues/227#issuecomment-719878156 to adopt the above suggestions.

lf-lang / lingua-franca