Closed lifeart closed 11 months ago
Here is a comparison results with https://github.com/glimmerjs/glimmer-vm/pull/1515#issuecomment-1841497398
Last image - math.max + all changes from linked PR.
I can't see dramatic improvements in performance, but I see difference in results accuracy here:
Here is what I get on FireFox your bench
me changing the input values
thanks for running the benchmark and providing results / reasoning / research / etc! it helps a lot with understanding the why behind this change, rather than just the outcome
@lifeart @NullVoxPopuli please note that the benchmark https://jsben.ch/Qosu7 has some problems.
Looks like "Code block 3" has an error in implementation:
for (const tag of values) {
result = Math.max(values[tag], result);
// ^ maybe Math.max(tag, result)
}
This may have some penalty in speed (actually negligible in my measurements). Anyways, you should pay more attention to the benchmarked code.
The data under test is the series of increasing numbers from zero to 19 [0, 1, 2, ... 19, 0, 1, 2, ... 19, ...]
. It is very predictable and may affect the benchmark results.
The data creation is placed in the "boilerplate" part, which is part of the benchmark. Time for initialization of the array is added to every variant of code. Array initialization should be placed in the "setup" part, which isn't benchmarked.
The way of array creation may affect performance. I.e. var values = []; ... values.push(...);
(FixedArray with PACKED_SMI_ELEMENTS in V8) may behave differently from var values = new Array(5000); ... values[i] = ...;
(FixedArray with HOLEY_SMI_ELEMENTS in V8). Benchmark should use the same creation method as in the original code.
Here is the improved benchmark: https://jsben.ch/eNWlG and its results:
Chrome 120 on Apple M1:
Safari 17.2 on Apple M1:
Firefox 122 on Apple M1:
Results for Windows 10 64-bit on AMD Ryzen 7 5800H CPU:
Chrome 121
Firefox 122
seems like it's worth a PR, if you have the time! :tada: nice work!
I'd say that the classical for(;;)
loop with if()
is the best choice taking into account all the platforms it may run on.
seems like it's worth a PR, if you have the time! 🎉 nice work!
tsconfig has "noUncheckedIndexedAccess": true
and to pass type checking the loop should look like
for (let i = 0; i < subtag.length; i++) {
const value = subtag[i]![COMPUTE]();
revision = Math.max(value, revision);
}
or
for (let i = 0; i < subtag.length; i++) {
const tag = subtag[i];
if (tag !== undefined) {
const value = tag[COMPUTE]();
revision = Math.max(value, revision);
}
}
Which one do you prefer?
Fortunately, both variants show close results in benchmark https://jsben.ch/we4qL on current versions of Chrome, Firefox, and Safari.
It seems reduce is 5 times faster for our case https://jsben.ch/Qosu7
Here is V8
Math.Max
implementation: https://github.com/v8/v8/blob/cd81dd6d740ff82a1abbc68615e8769bd467f91e/src/js/math.js#L77-L102 (old) https://github.com/v8/v8/blob/04f51bc70a38fbea743588e41290bea40830a486/src/builtins/math.tq#L135 (new)It seems it uses more code paths than we actually need:
In our case both args already numbers and we could take 1 step instead of 5
If we assume every step is an opcode and there is no caching, for case where we have 10 tags + 10 subtags, new comparison should took 100 opcodes, and old (with
Math.max
) - 500. This numbers quite relative to numbers we see in micro-bench.Likely need to check macro-bench to verify