RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

[framework] Speed up feedthrough calculation during diagram building #21632

Open jwnimmer-tri opened 1 week ago

jwnimmer-tri commented 1 week ago

When possible, we now avoid calling AllocateContext when answering direct feedthrough queries.


The fix at #21630 got me thinking about what happens during DiagramBuilder::Build(). That initial fix is of primary importance -- relying on the default all_sources_ticket() for an output port declaration is all kinds of pessimistic, and inappropriate for any LeafSystem that's provided as part of Drake, never mind one of such prominence.

Looking deeper, though, even after we fix the SceneGraph<Expression> problem, any system with input and output ports still always allocates a Context during the feedthrough search. Sometimes that's pretty cheap, but sometimes it is most definitely not!

This PR changes the feedthrough scan so that if a System only mentions boring tickets for its output port prerequisites (i.e., doesn't use any cache entries), then we don't need to make a Context for it during diagram building.
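
To make the idea concrete, here is a minimal toy sketch of that short circuit. It is not Drake's implementation: `TicketKind`, `Ticket`, and `FeedthroughWithoutContext` are hypothetical names invented for illustration, and the split into "boring" tickets versus cache entries is a simplifying assumption about what the scan needs to distinguish.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <vector>

// Toy stand-ins for dependency tickets. These are NOT Drake's types.
enum class TicketKind {
  kTime,           // "boring": can never name an input port
  kState,          // "boring": can never name an input port
  kInputPort,      // one specific input port, see Ticket::input_index
  kAllInputPorts,  // every input port of the system
  kCacheEntry,     // opaque without a Context; forces the slow path
};

struct Ticket {
  TicketKind kind;
  int input_index = -1;  // only meaningful for kInputPort
};

// Returns the input ports that feed through to one output port, judged from
// the declared prerequisites alone. Returns std::nullopt when any prerequisite
// is a cache entry, in which case the caller would have to fall back to a
// slower, Context-based search.
std::optional<std::set<int>> FeedthroughWithoutContext(
    const std::vector<Ticket>& prerequisites, int num_inputs) {
  assert(num_inputs >= 0);
  std::set<int> result;
  for (const Ticket& ticket : prerequisites) {
    switch (ticket.kind) {
      case TicketKind::kTime:
      case TicketKind::kState:
        break;  // Contributes no input ports.
      case TicketKind::kInputPort:
        result.insert(ticket.input_index);
        break;
      case TicketKind::kAllInputPorts:
        for (int i = 0; i < num_inputs; ++i) result.insert(i);
        break;
      case TicketKind::kCacheEntry:
        return std::nullopt;  // Cannot answer from tickets alone.
    }
  }
  return result;
}
```

In this toy model, an output that lists only `kAllInputPorts` is answered immediately with every input index, while a single `kCacheEntry` prerequisite makes the function give up and defer to the expensive path.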

I am a bit more sensitive to things like this now that I'm spinning up hundreds of parallel simulations locally on my workstation; start-up lag is actually apparent, in the aggregate.



jwnimmer-tri commented 6 days ago

Is the Context allocation a significant hit in practice ...

For instance, SceneGraph does a crazy amount of stuff when setting up a Context. However, I just realized that most of that work seems to be for setting up a default context, instead of merely allocating a context, so maybe this doesn't change much?

Basically I was curious to get your thoughts about the most likely hot spots or penalties, to see if we should try to dig a bit deeper and look at some stats.

... and do these short circuits avoid it?

For SceneGraph at least, I believe so (but I haven't tested it) -- all of the SG output ports declare all_input_ports_ticket() so this function is really fast and boring now.

I wanted to get a second opinion before I dug in too deep.

jwnimmer-tri commented 4 days ago

Benchmarks from bazel run //systems/benchmarking:framework_experiment -- --output_dir=try1 --sleep=1 -- --benchmark_filter=DiagramBuild:

On master:

----------------------------------------------------------------------
Benchmark                  Time             CPU    Allocs   Iterations
----------------------------------------------------------------------
DiagramBuild/3/0       0.196 ms        0.197 ms      2.7k         3546
DiagramBuild/30/0       9.66 ms         9.66 ms    166.4k           73
DiagramBuild/3/1       0.455 ms        0.454 ms     13.9k         1553
DiagramBuild/3/2        1.06 ms         1.06 ms     54.2k          672

On this PR:

DiagramBuild/3/0       0.228 ms        0.228 ms      2.7k         3006
DiagramBuild/30/0       11.2 ms         11.2 ms    166.4k           62
DiagramBuild/3/1       0.523 ms        0.523 ms     13.8k         1338
DiagramBuild/3/2        1.23 ms         1.23 ms     53.5k          588

On master after changing Adder to declare output port dependencies:

DiagramBuild/3/0       0.063 ms        0.063 ms      1.5k        10993
DiagramBuild/30/0       1.71 ms         1.71 ms     31.6k          406
DiagramBuild/3/1       0.123 ms        0.123 ms      6.8k         5722
DiagramBuild/3/2       0.287 ms        0.287 ms     25.9k         2411

On this PR after changing Adder to declare output port dependencies:

DiagramBuild/3/0       0.019 ms        0.019 ms   801.125        37326
DiagramBuild/30/0      0.928 ms        0.928 ms     14.1k          756
DiagramBuild/3/1       0.036 ms        0.036 ms      3.1k        19451
DiagramBuild/3/2       0.083 ms        0.082 ms     10.4k         8504
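
For a quick read of the tables, the following sketch computes the wall-time ratio between plain master and this PR combined with the Adder change. The times are copied from the rows above; the `Row` struct and the `Speedup` and `PrintSpeedups` helpers are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdio>

// One benchmark case, with wall times in milliseconds copied from the tables.
struct Row {
  const char* name;
  double master_ms;         // "On master"
  double pr_with_adder_ms;  // "On this PR after changing Adder ..."
};

const Row kRows[] = {
    {"DiagramBuild/3/0", 0.196, 0.019},
    {"DiagramBuild/30/0", 9.66, 0.928},
    {"DiagramBuild/3/1", 0.455, 0.036},
    {"DiagramBuild/3/2", 1.06, 0.083},
};

// Ratio of the baseline time to the improved time; larger means faster.
double Speedup(const Row& row) {
  assert(row.pr_with_adder_ms > 0.0);
  return row.master_ms / row.pr_with_adder_ms;
}

// Prints one line per benchmark case with its speedup factor.
void PrintSpeedups() {
  for (const Row& row : kRows) {
    std::printf("%-18s %.1fx\n", row.name, Speedup(row));
  }
}
```

With these numbers the ratios land at roughly 10x to 13x. The second table shows that this PR on its own, without the Adder change, is slightly slower than master on this benchmark.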

Adder patch:

--- a/systems/primitives/adder.cc
+++ b/systems/primitives/adder.cc
@@ -13,7 +13,7 @@ Adder<T>::Adder(int num_inputs, int size)
     this->DeclareInputPort(kUseDefaultName, kVectorValued, size);
   }

-  this->DeclareVectorOutputPort("sum", size, &Adder<T>::CalcSum);
+  this->DeclareVectorOutputPort("sum", size, &Adder<T>::CalcSum, {this->all_input_ports_ticket()});
 }

 template <typename T>

sherm1 commented 4 days ago

Looks great except for the "801.125" allocations in this row:

DiagramBuild/3/0 0.019 ms 0.019 ms 801.125 37326

Is that spurious?

jwnimmer-tri commented 4 days ago

Other rows have k for "thousand", so it's a move in a good direction.