Closed sherm1 closed 2 days ago
After much discussion, Alejandro and I came to the conclusion that the current behavior with caching ON is correct, but with caching OFF the behavior is wrong (though in this case it seemed better). There are two independent problems here:
t[n]
and input ports u[n]
, rather than continuous ones t, u(t)
. The hack mentioned above uses the caching system to achieve that effect. With caching turned off, the MBPlant results now depend on u(t)
and thus picked up the later non-NaN values rather than the initial values which it should have sampled. To say this another way, when running MBPlant in discrete mode we want its outputs y to be discrete with governing equations
y[n+1] = y(t[n], x[n], u[n])
which is what we get currently with caching on. (Note that in Siyuan's example u[0]=NaN, hence the unwanted behavior.) But with caching off, we are getting
y[n+1] = y(t, x[n], u(t))
which is an undesirable hybrid (mixed discrete/continuous) output. (In the example u(t) had been fixed up to be non-NaN so the behavior appeared to be better. )
We are debating how to address this and welcome any ideas. The most obvious fix is that MBPlant in discrete mode should sample its inputs into internal discrete variables, then have the TAMSI calculation depend on those variables rather than directly on the input ports. That would have the same effect as adding a zero order hold in front of the inputs. That would produce the desirable effect that the system would give the same results with caching on or off, which seems like an important invariant.
Thoughts?
@siyuanfeng-tri here is the fix I suggest:
MotionPrimitive::MotionPrimitive(const multibody::MultibodyPlant<double>* plant)
: plant_(*plant) {
// These are being directly copied to output ports.
x_command_index_ = this->DeclareDiscreteState(
systems::BasicVector<double>(plant->num_multibody_states()));
motion_summary_index_ = this->DeclareAbstractState(
AbstractValue::Make<MotionSummary>(MotionSummary()));
// ... declare ports ...
// Ensure above state variables are valid at the start of the
// first step (i.e., after Simulator::Initialize()).
this->DeclareInitializationUnrestrictedUpdateEvent(
&MotionPrimitive::UpdateEverything);
// Then update these variables periodically.
this->DeclarePeriodicUnrestrictedUpdateEvent(1e-3, 0,
&MotionPrimitive::UpdateEverything);
}
systems::EventStatus MotionPrimitive::UpdateEverything(
const systems::Context<double>& context,
systems::State<double>* state) const {
UpdateLogic(context, state);
UpdateCommand(context, state);
UpdateMotionSummary(context, state);
drake::log()->info("MotionPrimitive::UpdateEverything");
return systems::EventStatus::Succeeded();
}
Note: in addition to the initialization event (which was the fix), I switched from the Old School "override the dispatcher" style to what all the Cool Kids do now -- declare their own event handlers. Here UpdateEverything()
is just a private method whereas the DoUnrestrictedUpdate()
dispatcher was an overridden virtual.
I edited the title and description to reflect the investigation. This example revealed several independent issues that need to be addressed:
I'll follow up on these.
Thanks Sherm! I should have remembered the initialization events.. I probably wrote those, or at least pushed really hard for having those in the early days.... I thought MBP's mode (discrete / continuous) is independent of what else in in the diagram, is this not true?? Is the discrete mode you are talking about in the second post coming from TAMSI hack? I am a bit confused here.
You are right @siyuanfeng-tri, whether MBP is discrete/continuous is given at construction, independent of anything else in the parent diagram.
As @sherm1 mentions above, what we'd like to model is y[n+1] = y(t[n], x[n], u[n])
. Notice the very important subtlety here, the discrete input port (not continuous!) u[n]
. We didn't understand this very well before, and when I implemented the contact results output port in discrete mode I "emulated" the discrete behavior of u[n]
entirely motivated on a run-time performance fix. See more details in the in source comment here. My "hack" was to make the contact results port (which in turn yes, it depends on the TAMSI cache entry) dependent on the discrete state xd
. Therefore this discrete input u[n]
"emulation" only works with cache turned on.
Let me see if I can say it another way (sry, this is difficult to explain), the TAMSI cache entry with cache turned on, emulates a new discrete state (as if instead of having a ZOH for u[n]
, which would entail a new discrete state for u, we have a new state for the TAMSI computation). When caching is off, this "emulated" TAMSI state is gone and now instead you get the new (hybrid) system y[n+1] = y(t, x[n], u(t))
(notice t, u(t)
instead of t[n], u[n]
!).
Hopefully that helps?
I realize that the comment in the "hack" should instead say S = S(t[n], x[n], u[n])
. We do have x[n]
, our MBP discrete state. We'd need a new state for u[n]
. I believe that should solve the problem on the MBP side.
makes sense.
Fixed in #21623, as I understand it.
@siyuanfeng-tri ran into a problem in a complicated Diagram where the behavior differed with caching enabled or disabled. I was able to trace this to the TAMSI results cache -- if I disabled caching for just that cache entry, leaving all other caching enabled, everything worked fine. In fact, I could re-enable even that cache entry as long as I did it after an AdvanceTo(0), which serves only to process the time-0 unrestricted and discrete updates.
There is a known-hack for this particular cache entry, well-documented in the code here. TL;DR: the cache entry declares dependency only on discrete state xd but in fact depends on much more.
Tracing in the debugger and discussing with @amcastro-tri , this is what is happening:
This violates the caching Prime Directive -- the answers should be unchanged regardless of whether caching is enabled.