Open jwnimmer-tri opened 1 week ago
Is the Context allocation a significant hit in practice ...
For instance, SceneGraph does a crazy amount of stuff when setting up a Context. However, I just realized that most of that work seems to be for setting up a default context, instead of merely allocating a context, so maybe this doesn't change much?
Basically I was curious to get your thoughts about the most likely hot spots or penalties, to see if we should try to dig a bit deeper and look at some stats.
... and do these short circuits avoid it?
For SceneGraph at least, I believe so (but I haven't tested it) -- all of the SG output ports declare `all_input_ports_ticket()` so this function is really fast and boring now.
I wanted to get a second opinion before I dug in too deep.
Benchmarks from `bazel run //systems/benchmarking:framework_experiment -- --output_dir=try1 --sleep=1 -- --benchmark_filter=DiagramBuild`:
On master:
----------------------------------------------------------------------
Benchmark Time CPU Allocs Iterations
----------------------------------------------------------------------
DiagramBuild/3/0 0.196 ms 0.197 ms 2.7k 3546
DiagramBuild/30/0 9.66 ms 9.66 ms 166.4k 73
DiagramBuild/3/1 0.455 ms 0.454 ms 13.9k 1553
DiagramBuild/3/2 1.06 ms 1.06 ms 54.2k 672
On this PR:
DiagramBuild/3/0 0.228 ms 0.228 ms 2.7k 3006
DiagramBuild/30/0 11.2 ms 11.2 ms 166.4k 62
DiagramBuild/3/1 0.523 ms 0.523 ms 13.8k 1338
DiagramBuild/3/2 1.23 ms 1.23 ms 53.5k 588
On master after changing Adder to declare output port dependencies:
DiagramBuild/3/0 0.063 ms 0.063 ms 1.5k 10993
DiagramBuild/30/0 1.71 ms 1.71 ms 31.6k 406
DiagramBuild/3/1 0.123 ms 0.123 ms 6.8k 5722
DiagramBuild/3/2 0.287 ms 0.287 ms 25.9k 2411
On this PR after changing Adder to declare output port dependencies:
DiagramBuild/3/0 0.019 ms 0.019 ms 801.125 37326
DiagramBuild/30/0 0.928 ms 0.928 ms 14.1k 756
DiagramBuild/3/1 0.036 ms 0.036 ms 3.1k 19451
DiagramBuild/3/2 0.083 ms 0.082 ms 10.4k 8504
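To compare the four runs side by side, the wall-clock ratio for one row can be computed as in the sketch below. This is only a convenience: `Timings`, `kDiagramBuild30`, and `Speedup` are hypothetical names, and the numbers are copied from the DiagramBuild/30/0 rows above.

```cpp
// Wall-clock times in milliseconds for DiagramBuild/30/0, copied from the
// benchmark output above. The struct and names are illustrative only.
struct Timings {
  double master;
  double pr;
  double master_with_adder_patch;
  double pr_with_adder_patch;
};

constexpr Timings kDiagramBuild30{9.66, 11.2, 1.71, 0.928};

// A ratio above 1 means `after_ms` is faster than `before_ms`;
// a ratio below 1 means it is slower.
constexpr double Speedup(double before_ms, double after_ms) {
  return before_ms / after_ms;
}
```

For this row, `Speedup(kDiagramBuild30.master_with_adder_patch, kDiagramBuild30.pr_with_adder_patch)` is about 1.84 and `Speedup(kDiagramBuild30.master, kDiagramBuild30.pr_with_adder_patch)` is about 10.4. Without the Adder patch the ratio is below 1 (9.66 / 11.2).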
Adder patch:
--- a/systems/primitives/adder.cc
+++ b/systems/primitives/adder.cc
@@ -13,7 +13,7 @@ Adder<T>::Adder(int num_inputs, int size)
this->DeclareInputPort(kUseDefaultName, kVectorValued, size);
}
- this->DeclareVectorOutputPort("sum", size, &Adder<T>::CalcSum);
+ this->DeclareVectorOutputPort("sum", size, &Adder<T>::CalcSum, {this->all_input_ports_ticket()});
}
template <typename T>
Looks great except for the "801.125" allocations in this row:
DiagramBuild/3/0 0.019 ms 0.019 ms 801.125 37326
Is that spurious?
Other rows have `k` for "thousand", so it's a move in a good direction.
When possible, we now avoid calling AllocateContext when answering direct feedthrough queries.
The fix at #21630 got me thinking about what happens during `DiagramBuilder::Build()`. That initial fix is of primary importance -- relying on the default `all_sources_ticket()` for an output port declaration is all kinds of pessimistic, and inappropriate for any `LeafSystem` that's provided as part of Drake, never mind one of such prominence.

Looking deeper, though, even after we fix the `SceneGraph<Expression>` problem, any system with input and output ports still always allocates a `Context` during the feedthrough search. Sometimes that's pretty cheap, but sometimes it is most definitely not!

This PR changes the feedthrough scan so that if a System only mentions boring tickets for its output port prerequisites (i.e., doesn't use any cache entries), then we don't need to make a Context for it during diagram building.
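As a rough illustration of the short circuit described above, here is a self-contained toy model. It is not Drake's implementation: `TicketKind`, `Ticket`, `ToySystem`, `OnlyBoringTickets`, and `FindFeedthrough` are hypothetical names, the counter `context_allocations` stands in for the cost of allocating a Context, and the cache-entry branch simply assumes the worst case instead of consulting a real dependency graph.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Drake types.
enum class TicketKind {
  kOneInputPort,    // a single, specific input port
  kAllInputPorts,   // every input port of the system
  kNonInputSource,  // time, state, parameters, ...: never an input
  kCacheEntry,      // a cache entry whose own prerequisites are unknown here
};

struct Ticket {
  TicketKind kind;
  int input_index;  // used only when kind == kOneInputPort
};

struct ToySystem {
  int num_inputs = 0;
  // output_prerequisites[j] lists the tickets declared for output port j.
  std::vector<std::vector<Ticket>> output_prerequisites;
  int context_allocations = 0;  // counts uses of the expensive fallback
};

using Feedthrough = std::set<std::pair<int, int>>;  // (input, output) pairs

// Returns true when no output port names a cache entry as a prerequisite.
bool OnlyBoringTickets(const ToySystem& system) {
  for (const auto& tickets : system.output_prerequisites) {
    for (const Ticket& ticket : tickets) {
      if (ticket.kind == TicketKind::kCacheEntry) return false;
    }
  }
  return true;
}

// Computes (input, output) feedthrough pairs. The "context" is only
// allocated (here: counted) when some output port uses a cache entry.
Feedthrough FindFeedthrough(ToySystem& system) {
  if (!OnlyBoringTickets(system)) {
    ++system.context_allocations;  // stand-in for the costly allocation
  }
  Feedthrough result;
  const int num_outputs =
      static_cast<int>(system.output_prerequisites.size());
  for (int j = 0; j < num_outputs; ++j) {
    for (const Ticket& ticket : system.output_prerequisites[j]) {
      switch (ticket.kind) {
        case TicketKind::kOneInputPort:
          result.insert({ticket.input_index, j});
          break;
        case TicketKind::kAllInputPorts:
        case TicketKind::kCacheEntry:  // toy simplification: worst case
          for (int i = 0; i < system.num_inputs; ++i) result.insert({i, j});
          break;
        case TicketKind::kNonInputSource:
          break;  // contributes no input dependency
      }
    }
  }
  return result;
}
```

In this model, a two-input system whose single output lists only `kAllInputPorts` reports both inputs as feedthrough and leaves `context_allocations` at 0, while a system whose output lists a `kCacheEntry` bumps the counter to 1.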
I am a bit more sensitive to things like this now that I'm spinning up hundreds of parallel simulations locally on my workstation; start-up lag is actually apparent, in the aggregate.