Closed ledbit closed 3 years ago
Update: --stress-inline has no effect on perf. Any thoughts on what other options to try? For the short term, since the performance hit is so great, we're calling utf8Write directly, but we'd rather not rely on a non-public API function.
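For anyone following along, the workaround amounts to something like the sketch below. It assumes buf.utf8Write(string, offset, length) is present on Buffer instances, which is undocumented and may change between Node.js versions, so it falls back to the public buf.write. The helper name writeUtf8 is made up for this example.

```javascript
// Hypothetical helper: prefer the undocumented utf8Write when it exists,
// otherwise fall back to the public Buffer#write API.
function writeUtf8(buf, str, offset = 0) {
  const length = buf.length - offset;
  if (typeof buf.utf8Write === 'function') {
    // Undocumented internal method; bypasses the JavaScript wrapper
    // that buf.write() goes through.
    return buf.utf8Write(str, offset, length);
  }
  return buf.write(str, offset, length, 'utf8');
}

const a = Buffer.alloc(8);
const b = Buffer.alloc(8);
const writtenA = writeUtf8(a, 'abcd', 2);    // helper under test
const writtenB = b.write('abcd', 2, 'utf8'); // public API for comparison
console.log(writtenA === writtenB, a.equals(b));
```

Both calls should report the same number of bytes written and leave identical buffer contents; the only intended difference is which code path does the work.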
@nodejs/v8 is there a way to improve the inlining for cases like this one? And could the try ... finally block be what prevents the code from being inlined?
@BridgeAR - I tried removing the try ... finally block; unfortunately, it had no effect on performance.
const hideStackFrames = function(fn) {
  return function hidden(...args) {
    return fn(...args);
  };
};
I wonder if hiding the validator frames is worth the performance cost (if I am reading the profiling results correctly)
I wrote this benchmark. Its baseline result for the latest master on my machine is:
$ ./node benchmark/misc/hidestackframes.js
misc/hidestackframes.js n=100000 type="no-error": 20,704,220.948820617
misc/hidestackframes.js n=100000 type="error": 45,215.81646655569
With the following implementation
function hideStackFrames(fn) {
  return fn;
}
it shows:
$ ./node benchmark/misc/hidestackframes.js
misc/hidestackframes.js n=100000 type="no-error": 48,594,383.16962696
misc/hidestackframes.js n=100000 type="error": 45,900.658629285084
So, there is a certain penalty introduced by hideStackFrames. I'm currently thinking of ways to move the penalty to the exception path instead of the success path, while keeping the same behavior.
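One possible shape for moving the work to the exception path is sketched below. This is only an illustration, not Node's actual implementation: it assumes V8's Error.captureStackTrace(err, fn) is available and re-captures the stack only after something has been thrown. The validator and caller are hypothetical. Note that a wrapper function and a try block still sit on the success path, so whether this helps at all would need measuring.

```javascript
// Sketch only: not Node's actual implementation.
function hideStackFrames(fn) {
  function hidden(...args) {
    try {
      return fn(...args);
    } catch (err) {
      // Re-capture the stack only when something was thrown, dropping
      // `hidden` and every frame above it (i.e. the validator itself).
      if (err instanceof Error &&
          typeof Error.captureStackTrace === 'function') {
        Error.captureStackTrace(err, hidden);
      }
      throw err;
    }
  }
  return hidden;
}

// Hypothetical validator used for illustration.
const validateNumber = hideStackFrames(function validateNumber(value, name) {
  if (typeof value !== 'number') {
    throw new TypeError(`"${name}" must be a number`);
  }
});

function caller() {
  validateNumber('oops', 'offset');
}

let captured;
try {
  caller();
} catch (err) {
  captured = err;
}
// Print the message line and the first remaining stack frame.
console.log(captured.stack.split('\n').slice(0, 2).join('\n'));
```

With this sketch the first frame in the printed stack should be caller, with neither hidden nor validateNumber appearing in the trace.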
Update: I've tried re-running the benchmark and can no longer see any difference, so please ignore this message. Disabling inlining with --max-inlined-bytecode-size=0 also doesn't make any difference, except for reducing the results for both hideStackFrames implementations.
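For anyone who wants to re-measure outside the Node.js source tree, a standalone approximation of such a comparison could look like the sketch below. It does not use Node's benchmark harness; the names wrapHidden, passThrough, validate and opsPerSecond are made up for this example, and numbers from a loop like this are noisy and should be read with caution.

```javascript
// Standalone sketch; Node's own benchmark harness is not used here.
function wrapHidden(fn) {
  return function hidden(...args) {
    return fn(...args);
  };
}

function passThrough(fn) {
  return fn;
}

// Hypothetical validator, cheap enough that wrapper overhead can show up.
function validate(value) {
  if (typeof value !== 'number') {
    throw new TypeError('value must be a number');
  }
}

// Returns operations per second for n calls of fn on the success path.
function opsPerSecond(fn, n) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++) fn(i);
  // Clamp to 1ns so the division below is always well defined.
  const elapsedNs = Math.max(Number(process.hrtime.bigint() - start), 1);
  return n / (elapsedNs / 1e9);
}

const n = 1e6;
const wrapped = wrapHidden(validate);
const direct = passThrough(validate);

// Warm up both variants so the optimizing compiler has a chance to run.
opsPerSecond(wrapped, n);
opsPerSecond(direct, n);

console.log('wrapped:', opsPerSecond(wrapped, n).toFixed(0));
console.log('direct: ', opsPerSecond(direct, n).toFixed(0));
```

Running the same file with and without --max-inlined-bytecode-size=0 gives a rough view of how much the result depends on inlining.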
In fact, this is significant enough only for the shortest and simplest library calls; @ledbit's example uses 8-byte buffers. But still, why not
Seeing this on another Buffer code-path
What steps will reproduce the bug?
I have not been able to create an isolated example outside of the application that reproduces the perf degradation - likely due to optimization.
When profiling our application (Cribl LogStream) we noticed that the top function call was a function called hidden. After some digging, it turns out that the call trace is something like this:
After modifying the application to call the undocumented Buffer.utf8Write instead of Buffer.write, we see about a 20% overall improvement, and the heavy (bottom-up) profile looks as follows. Note that in both cases the application was profiled for the same amount of time (30s).
I noticed the same performance improvement after updating hideStackFrames to look like this:
I have not been able to reproduce the perf degradation using a script that isolates just Buffer.write operations. I don't even see the hidden function calls at all during profiling. However, when I set a breakpoint in hideStackFrames and then start profiling, I do end up seeing hidden in the profile, which makes me think there's some optimization/compilation/inlining issue at play.
UPDATE 9/28: I was able to repro the perf degradation by disabling inlining.
Here's what buffer.js looks like:
Could this mean that the default V8 inline settings are too conservative for the server side?
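An isolated loop for trying this could look like the sketch below. It is not the original repro: the buffer size follows the 8-byte example discussed in this thread, the iteration count is arbitrary, and utf8Write is undocumented, so the script falls back to buf.write when it is missing. The assumption is that it would be run once normally and once with --max-inlined-bytecode-size=0 (the flag mentioned in the comments) to compare the two timings.

```javascript
// Sketch of an isolated loop; sizes and iteration counts are arbitrary.
function timeWrites(writeFn, iterations) {
  const buf = Buffer.alloc(8);
  const start = process.hrtime.bigint();
  let total = 0; // accumulate bytes written so the loop has a visible result
  for (let i = 0; i < iterations; i++) {
    total += writeFn(buf, 'abcdefgh');
  }
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { ms, total };
}

// Public API path.
const viaWrite = (buf, str) => buf.write(str, 0, 'utf8');

// utf8Write is undocumented; fall back to write() if it is missing.
const viaUtf8Write = (buf, str) =>
  typeof buf.utf8Write === 'function'
    ? buf.utf8Write(str, 0, buf.length)
    : buf.write(str, 0, 'utf8');

const iterations = 1e6;
console.log('Buffer.write    :',
            timeWrites(viaWrite, iterations).ms.toFixed(1), 'ms');
console.log('Buffer.utf8Write:',
            timeWrites(viaUtf8Write, iterations).ms.toFixed(1), 'ms');
```

As noted above, a script like this may not show the hidden wrapper at all under default settings, so the comparison between the two runs is the interesting part.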