RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu
Other
3.34k stars 1.27k forks source link

Question about event handling callback that mutates state. #5913

Closed siyuanfeng-tri closed 6 years ago

siyuanfeng-tri commented 7 years ago

@EricCousineau-TRI and I had a conversation yesterday about event handling in System / Simulator. We think per tick / step, the state should be only mutated at most once, so that invoking callbacks in different orders would not have an effect for the resulting state. However, we have CalcUnrestrictedUpdate (changes abstract, discrete, and continuous state) and CalcDiscreteVariableUpdates(discrete). Shouldn't we have either only CalcUnrestrictedUpdate or CalcDiscreteVariableUpdates + CalcContinuousVariableUpdates + CalcAbstractVariableUpdates?

@jwnimmer-tri @david-german-tri @sherm1 @edrumwri

david-german-tri commented 7 years ago

I've always thought so too. @sherm1 pushed for the current, oddly asymmetric setup, so perhaps he can explain why.

siyuanfeng-tri commented 7 years ago

another related question is shouldn't there be guards on the number of mutable callbacks of the same type per tick? (you shouldn't schedule 2 unrestricted updates per tick). At least we should prevent this in CalcNextUpdateTime. Resolving a collision from CalcNextUpdateTime and GetPerStepEvents seems harder.

jwnimmer-tri commented 7 years ago

I'm all for simplicity, but I think its important to point out that we can have deterministic order of evaluation and operations even if multiple events of different types are declared; we'd just need a total order on how a Simulator uses the System's contract (as expressed by preconditions on the System base classes APIs).

sherm1 commented 7 years ago

In general an arbitrary and unpredictable number of events can occur simultaneously, and they are detected and handled by a variety of uncoordinated subsystems. Handling events can entail arbitrary state changes. I had originally proposed that all event handlers would be like unrestricted update -- basically a free-for-all modification of the Context. David had the idea of creating limited handler APIs that could only modify pre-declared portions of the Context, which is why we have a discrete-only handler as well as the unrestricted one and we could have more of those if needed. We could of course do everything with just the unrestricted update, though I liked David's idea and still do. (It has another benefit for rigorously defining how we treat simultaneous events.)

Regarding the OP, I'm not sure I understand the proposal. Discrete events must be handled at start of step -- those can involve changes to all variables, including continuous and discrete states. The continuous update that follows is done by integration and necessarily produces changes to only continuous variables. So discrete & abstract variables get changes at step start, continuous variables get changed (discretely) at step start and then again during integration. That is how it should be!

We can discuss more f2f if you want.

siyuanfeng-tri commented 7 years ago

I had originally proposed that all event handlers would be like unrestricted update -- basically a free-for-all modification of the Context. David had the idea of creating limited handler APIs that could only modify pre-declared portions of the Context, which is why we have a discrete-only handler as well as the unrestricted one and we could have more of those if needed.

That's what made us wondering. I am perfectly happy with either approach. I am cautious about the combined one. This is my complaint number 1.

In general an arbitrary and unpredictable number of events can occur simultaneously, and they are detected and handled by a variety of uncoordinated subsystems.

I agree. Sorry I wasn't clear, let me try again. Here is my complaint number 2. I think there are three types of events, time based, state based, and per step. Time based are scheduled with CalcNextUpdateTime, per step are through GetPerStepEvents, the state triggered ones are basically witness functions. I think setting a fixed sequence for handling simultaneous events of different types definitely makes sense. i.e. when a per step event and a periodic event occur at the same time, we have to impose some arbitrary order in handling. However, it doesn't really make sense to schedule two timed events at the exact same time to mutate the same part of the state. i.e. some system scheduled two timed kDiscreteUpdateAction events both at T. On the other hand, touching the AbstractValues or DiscreteValues or ContinuousState at the same time is ok. The user shouldn't be able to touch the same part multiple times. Currently, there is no mechanism that checks whether multiple events of the same DiscreteEvent::ActionType are being returned in CalcNextUpdateTime.

So my proposal is to add kAbstractUpdateAction, kContinuousUpdateAction to DiscreteEvent::ActionType, and provide the appropriate API in system. Then add checks in CalcNextUpdateTime and GetPerStepEvents to catch returning multiple mutation events that touches the same part of the state.

sherm1 commented 7 years ago

@siyuanfeng-tri, each subsystem in a diagram only gets to modify its own subcontext. It is not like they can all stomp on every abstract state variable in the diagram multiple times. Is that what you're worried about?

siyuanfeng-tri commented 7 years ago

i am worried about some user accidentally decided to modify it's context multiple times.

sherm1 commented 7 years ago

Per f2f (with @siyuanfeng-tri @edrumwri @EricCousineau-TRI @sherm1): We propose to modify the leaf system event handler interfaces (unrestricted update, discrete update, publish) to pass in the list of simultaneous event triggers that caused the invocation, in case the leaf system wants to impose an ordering or otherwise provide special handling for simultaneous triggers.

Likely implementation: the Simulator will make a single per-step call to the System's handlers (in the order: unrestricted-discrete-publish) with all simultaneous triggers present. If that System is a Diagram, the Diagram will group the event triggers by subsystem to ensure that each leaf subsystem gets to handle all its simultaneous events at once.

siyuanfeng-tri commented 7 years ago

thanks @sherm1 for the summary.

in the order: unrestricted-discrete-publish

how about just unrestricted-publish ? to make it evener cleaner.

sherm1 commented 7 years ago

A further consequence of this is that all leaf subsystem events must go through one of the three designated handler methods (unrestricted/discrete/publish). There is a provision for individual callbacks in the event action infrastructure, but it currently is used only by Diagram to dispatch to leaf subsystem's handlers (which is fine!). We'll need to institutionalize that policy.

sherm1 commented 7 years ago

how about just unrestricted-publish ?

We want the discrete update for handling difference equations because it is the exact discrete analog of the continuous update; this makes the math beautiful. Unrestricted update muddies the waters. However, you are totally free not to use the discrete update method at all since unrestricted update is ... unrestricted!

siyuanfeng-tri commented 7 years ago

usage clarified. thanks

edrumwri commented 7 years ago

@siyuanfeng-tri would you mind doing the PR on this, since you have recently worked on similar aspects for Diagram? Alternatively, we could do them together, with me doing the Leaf system and you doing the Diagram.

@sherm1 and I just discussed- we would like to use the approach the four of us discussed this morning to remove the function pointers from UpdateActions (so Diagram's DoUnrestrictedUpdate could route to the appropriate systems using the to-be-implemented list of triggered events).

siyuanfeng-tri commented 7 years ago

@edrumwri i am happy to do it :)

edrumwri commented 7 years ago

Great! I've been looking at this stuff quite a bit lately, so I think I can be a good resource as you work through it.

Thanks!

edrumwri commented 6 years ago

This issue is very stale; @siyuanfeng-tri worked on this in a series of PRs and I believe all relevant issues have been addressed. Please reopen if not the case.