Open rakudrama opened 1 month ago
Was dart2js
benchmarked before and after?
//cc @alexmarkov
@rakudrama did you get a chance to measure the performance impact of this
@a-siva I did some timing locally.
I compiled dart2js with dart compile exe
and then used that aot executable to compile Flute.complex.
I aggregated the wall-clock time reported by dart2js over 100 runs on my cloudtop instance.
I ran the versions alternating A,B,A,B,..., 100 of each
Version | executable size | mean time (seconds) | std dev | min | max |
---|---|---|---|---|---|
A - inlined | 26322640 | 21.069 | 0.910241 | 20.0 | 24.6 |
B - not inlined | 25865344 | 21.09 | 0.729178 | 19.9 | 23.7 |
They are surprisingly close. Perhaps none of the ~2k calls to add
are hot enough to register in this kind of profiling.
The above runs were done when my computer was otherwise quiet.
And earlier run of 60 A,B pairs is noisier. I was using a browser at the same time.
Version | executable size | mean time (seconds) | std dev | min | max |
---|---|---|---|---|---|
A - inlined | 26322640 | 22.185 | 1.39174 | 19.9 | 24.7 |
B - not inlined | 25865344 | 22.3917 | 1.56549 | 20.0 | 27.7 |
Here all of the values (mean, min, max) are worse, but the variance is high enough that this is not clearly a difference.
Inlining of
_GrowableList.add
causes too much code size expansion.The extra code in the aot-snapshot for dart2js is 2.22%:
@pragma("vm:prefer-inline")
@pragma("vm:never-inline")
_GrowableList.add
does quite a lot - a covariance check, capacity check (with out-of-line reallocation of backing list), length update, bounds check, array write barrier.One argument for inlining is that after inlining, some of these operations can be eliminated. How effective is inlining at eliminating these operations? Grepping the disassembly allows us to estimate the operations.
as
Most conserved after inlining is the bounds check (that can be eliminated in all cases, see #56099). 9% of the write barriers can be eliminated. Over half of parametric covariance checks can be eliminated, and ~37% remain. So over a third of the calls do not benefit in any special way from inlining. For these calls, the benefit of inlining is limited to the dynamic instruction count, the ~20% of the instructions related to the call - prologue, stack check, epilogue generally vanish after inlining,
For many of the the 60%+ of calls where the parametric covariance check can be eliminated, it might be a better approach to have a specialization of
List.add
that does not do the covariance check, and call that instead of inlining.