H4ad opened 1 year ago
There are multiple performance bottlenecks at the same time.
{ __proto__: null } is slower than a normal object. In fact, I adopted the following as the standard in my projects:

const NullObject = function NullObject () { }
NullObject.prototype = Object.create(null)

as it is a magnitude faster than { __proto__: null }.
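For context, a minimal usage sketch of the NullObject helper above; only its prototype chain differs from a normal constructed object, so nothing from Object.prototype is reachable:

```js
// Minimal usage sketch of the NullObject helper defined above.
const dict = new NullObject();
dict.foo = 'bar';

console.log('toString' in dict); // false -- Object.prototype is not on the chain
console.log(Object.getPrototypeOf(Object.getPrototypeOf(dict))); // null -- the prototype itself has a null prototype
```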
It actually makes sense to use it, e.g. for the QuerystringResult object, where ObjectDefineProperties is used in the constructor. This means that on every instantiation of the object, you run this particularly slow function on the this reference. Additionally, you are forced to use the __proto__ attribute, which means two perf bottlenecks work together, as on every instantiation you need to generate the properties object again and again.
I also tested using Object.create() instead of ObjectDefineProperties in the constructor. Object.create only needs to run once, when the "class" is created, instead of on every instantiation of the object. The only problem is that if you want to make a key non-enumerable, assigning a value to that key creates an own property whose descriptor is enumerable again. If the behaviour were not like this, I could have kept the prototypical inheritance in my PR.
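A minimal sketch of that pitfall, with illustrative names: the non-enumerable descriptor lives on the prototype, but a plain assignment in the constructor creates a new own property on the instance, and own properties created by assignment are always enumerable.

```js
// Illustrative sketch: non-enumerable descriptor on the prototype, assignment in the constructor.
function Result() {
  this.value = 42; // plain assignment, not ObjectDefineProperties
}
Result.prototype = Object.create(null, {
  value: { value: undefined, writable: true, enumerable: false, configurable: true },
});

const r = new Result();
console.log(Object.keys(r)); // ['value'] -- the own property shadows the non-enumerable slot
console.log(Object.getOwnPropertyDescriptor(r, 'value').enumerable); // true
```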
To have non-enumerable keys in an instance based on a class, you have to add a getter and setter for the field and store the information in a private attribute.
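A small sketch of that class-based workaround (class and field names are illustrative): accessors declared in a class body end up non-enumerable on the prototype, and the data itself sits in a private field.

```js
// Illustrative sketch: non-enumerable key via an accessor pair and a private field.
class QueryResult {
  #value;
  get value() { return this.#value; }
  set value(v) { this.#value = v; }
}

const q = new QueryResult();
q.value = 42;
console.log(Object.keys(q)); // [] -- nothing enumerable on the instance
console.log(q.value);        // 42
```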
Afaik prototypical inheritance is about 1-3% faster in V8. Also, a class cannot be instantiated without the new operator.
In my PR the gain is significantly bigger than the 3% penalty for using a class over a prototype. In highly optimized prototype-based code, a plain rewrite to a class could result in performance regressions if the gained performance does not exceed this 1-3% threshold.
I actually tested __proto__: null yesterday in relative depth, and it does produce slower code; I found out why.
V8 uses a heuristic that assumes objects with __proto__: null are used as dictionaries, so they are more prone to end up in dictionary mode instead of getting hidden classes.
Here is a screenshot of an object with two properties, with the only difference being __proto__ set to null:
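Since the screenshot is not reproduced here, a rough way to observe the same thing locally is the %HasFastProperties intrinsic; this is only a sketch, and the exact result may differ across V8 versions:

```js
// Run with: node --allow-natives-syntax
const plain = { a: 1, b: 2 };
const nullProto = { __proto__: null, a: 1, b: 2 };

console.log(%HasFastProperties(plain));     // true  -- hidden class / fast properties
console.log(%HasFastProperties(nullProto)); // false on the V8 versions discussed here -- dictionary mode
```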
Also, closures aren't free, and we should really avoid them in hot paths.
Here's a small benchmark output showing the impact of __proto__. I just removed all __proto__: null from lib/events.js and ran the events benchmarks locally:
confidence improvement accuracy (*) (**) (***)
events/ee-add-remove.js n=100000 removeListener=0 newListener=0 *** 15.62 % ±1.95% ±2.60% ±3.40%
events/ee-add-remove.js n=100000 removeListener=0 newListener=1 *** 17.34 % ±1.94% ±2.60% ±3.43%
events/ee-add-remove.js n=100000 removeListener=1 newListener=0 *** 18.44 % ±1.25% ±1.67% ±2.18%
events/ee-add-remove.js n=100000 removeListener=1 newListener=1 ** 5.06 % ±2.98% ±3.97% ±5.16%
events/ee-emit.js listeners=1 argc=0 n=100000 *** 10.92 % ±2.35% ±3.13% ±4.09%
events/ee-emit.js listeners=1 argc=10 n=100000 *** 13.86 % ±2.66% ±3.54% ±4.61%
events/ee-emit.js listeners=1 argc=2 n=100000 *** 18.04 % ±7.29% ±9.76% ±12.84%
events/ee-emit.js listeners=1 argc=4 n=100000 *** 14.18 % ±2.99% ±3.98% ±5.18%
events/ee-emit.js listeners=10 argc=0 n=100000 *** 7.01 % ±2.53% ±3.39% ±4.46%
events/ee-emit.js listeners=10 argc=10 n=100000 *** 4.39 % ±1.56% ±2.07% ±2.71%
events/ee-emit.js listeners=10 argc=2 n=100000 *** 5.32 % ±2.79% ±3.72% ±4.86%
events/ee-emit.js listeners=10 argc=4 n=100000 *** 6.80 % ±1.42% ±1.89% ±2.46%
events/ee-emit.js listeners=5 argc=0 n=100000 *** 6.09 % ±1.92% ±2.55% ±3.32%
events/ee-emit.js listeners=5 argc=10 n=100000 *** 4.38 % ±1.97% ±2.64% ±3.46%
events/ee-emit.js listeners=5 argc=2 n=100000 *** 7.39 % ±2.95% ±3.93% ±5.14%
events/ee-emit.js listeners=5 argc=4 n=100000 *** 5.24 % ±1.69% ±2.25% ±2.93%
events/ee-listen-unique.js n=100000 events=1 *** 68.18 % ±1.25% ±1.66% ±2.18%
events/ee-listen-unique.js n=100000 events=10 -1.35 % ±1.57% ±2.10% ±2.77%
events/ee-listen-unique.js n=100000 events=2 *** -4.98 % ±1.05% ±1.39% ±1.81%
events/ee-listen-unique.js n=100000 events=20 ** -4.18 % ±2.46% ±3.28% ±4.26%
events/ee-listen-unique.js n=100000 events=3 ** -1.78 % ±1.09% ±1.45% ±1.90%
events/ee-listen-unique.js n=100000 events=5 *** -5.16 % ±0.92% ±1.22% ±1.60%
events/ee-listener-count-on-prototype.js n=100000 -1.20 % ±2.37% ±3.16% ±4.12%
events/ee-listeners.js raw='false' listeners=5 n=100000 * -3.61 % ±2.83% ±3.79% ±4.98%
events/ee-listeners.js raw='false' listeners=50 n=100000 -0.83 % ±1.80% ±2.40% ±3.13%
events/ee-listeners.js raw='true' listeners=5 n=100000 ** -4.07 % ±3.03% ±4.04% ±5.28%
events/ee-listeners.js raw='true' listeners=50 n=100000 * -2.88 % ±2.16% ±2.89% ±3.79%
events/ee-once.js argc=0 n=100000 *** 18.89 % ±2.28% ±3.03% ±3.95%
events/ee-once.js argc=1 n=100000 *** 19.67 % ±2.13% ±2.84% ±3.71%
events/ee-once.js argc=4 n=100000 *** 18.70 % ±1.84% ±2.45% ±3.20%
events/ee-once.js argc=5 n=100000 *** 18.36 % ±1.62% ±2.15% ±2.80%
events/eventtarget-add-remove.js nListener=10 n=100000 -0.05 % ±1.76% ±2.35% ±3.06%
events/eventtarget-add-remove.js nListener=5 n=100000 -0.34 % ±0.83% ±1.10% ±1.43%
events/eventtarget-creation.js n=100000 -0.23 % ±3.36% ±4.47% ±5.81%
events/eventtarget.js listeners=1 n=100000 -0.37 % ±3.73% ±4.97% ±6.47%
events/eventtarget.js listeners=10 n=100000 -1.24 % ±3.71% ±4.94% ±6.44%
events/eventtarget.js listeners=5 n=100000 -1.47 % ±3.27% ±4.36% ±5.67%
@anonrig Can you try the NullObject trick by Uzlopak, just to see if there is any improvement?
@H4ad I tried it in 2022 - https://github.com/nodejs/node/pull/44627
The change in EE would need a more thorough analysis over the larger codebase. The benchmarks that improve (https://github.com/nodejs/node/blob/3116c378d22b0afa420ea0f395049912792579da/benchmark/events/ee-add-remove.js#L26-L29) use the same event names, which makes them easier to optimize. If I recall correctly, we switched them to __proto__: null to have them as dictionaries from the get-go and avoid deopts later on. V8 could have become smarter, and now having them as normal objects might be better, but those benchmarks do not paint the full picture.
From what I can see, null-prototype objects are slow when created with Object.create(null) or { __proto__: null }, because those return objects that are not optimized by V8.
Taking a look at nodejs/node#39939, they propose the following method to overcome this:
function objectCreateNull() {
  const events = {};
  ObjectSetPrototypeOf(events, null);
  return events;
}
This returns an object with a null prototype that is optimized, but I wonder how it compares, performance-wise, with the following sample code:
function createNullPrototypeObject() {
  const obj = {};
  obj.__proto__ = null;
  return obj;
}
Both of these methods return objects that are optimized by V8, and I think using one of them instead of { __proto__: null } or kEmptyObject (Object.create(null)) could bring benefits to Node; however, I'm unaware of the cons/problems such changes would bring.
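A crude sketch of how such a local comparison could look; this is not part of Node's benchmark suite, Object.setPrototypeOf stands in for the ObjectSetPrototypeOf primordial, and the numbers are easily skewed by the optimizer:

```js
const { performance } = require('node:perf_hooks');

function viaSetPrototypeOf() {
  const events = {};
  Object.setPrototypeOf(events, null); // userland spelling of ObjectSetPrototypeOf
  return events;
}

function viaProtoAssignment() {
  const obj = {};
  obj.__proto__ = null;
  return obj;
}

function viaLiteral() {
  return { __proto__: null };
}

function bench(name, fn, n = 1e6) {
  const start = performance.now();
  let last;
  for (let i = 0; i < n; i++) last = fn();
  console.log(name, (performance.now() - start).toFixed(2), 'ms');
  return last; // keep a reference so the loop is not trivially dead code
}

bench('Object.setPrototypeOf', viaSetPrototypeOf);
bench('obj.__proto__ = null ', viaProtoAssignment);
bench('{ __proto__: null }  ', viaLiteral);
```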
It's not really a matter of optimized vs. unoptimized: dictionary mode is faster when you have many event types and _events is actually used as a dictionary.
One thing we can, and probably should, do is to pass a different kind of _events for event emitter types whose events we already know (like streams, for example).
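Purely as a hypothetical illustration of that idea (this is not how lib/events.js works today, and the event names are just examples): seed the backing object with the keys we already know and only detach it from Object.prototype afterwards, which per the snippet above keeps fast properties.

```js
// Hypothetical sketch: a pre-shaped _events object for emitters with known event names.
function createStreamEvents() {
  const events = {
    data: undefined,
    end: undefined,
    error: undefined,
    close: undefined,
  };
  // Detach after creation rather than using a `{ __proto__: null }` literal,
  // so the object reportedly keeps its hidden class (see the earlier comment).
  Object.setPrototypeOf(events, null);
  return events;
}

// e.g. (hypothetically): stream._events = createStreamEvents();
```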
FYI, moving from objects to classes uses more memory (the number of fields in the class is higher than in the object; verified in the system analyzer)
@rluvaton what do you mean?
With the following test:
class A {
  constructor(a, b) {
    this.a = a;
    this.b = b;
  }
}
function B(a, b) {
  this.a = a;
  this.b = b;
}
function objFactory(a, b) {
  return {
    a,
    b,
  };
}
function warmShapes() {
  const a = new A(1, 2);
  const b = new B(1, 2);
  const c = objFactory(1, 2);
  return [a, b, c];
}
for (let i = 0; i < 1e6; i++) {
  warmShapes();
}
// logs: true true true (run with --allow-natives-syntax for %HasFastProperties)
console.log(%HasFastProperties(new A(1, 2)), %HasFastProperties(new B(1, 2)), %HasFastProperties(objFactory(1, 2)));
On v20.7.0 I see all shapes have the same instance size (24), with similar stability and just a different prototype?
> FYI, moving from objects to classes uses more memory (the number of fields in the class is higher than in the object; verified in the system analyzer)
That's a bit subjective. IIRC classes share the same function space in memory, so if you create more than one object, classes won't use more memory than objects; in fact, objects may end up using more memory than classes. I conducted several experiments on this in the past. Although it may have changed, I highly doubt it.
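A sketch of one way such an experiment could look (not the original experiments mentioned above; it needs --expose-gc, and the per-instance numbers are noisy and include the retaining array):

```js
// Rough per-instance heap cost comparison; run with: node --expose-gc
function measure(name, factory, count = 1e6) {
  globalThis.gc();
  const before = process.memoryUsage().heapUsed;
  const keep = new Array(count);
  for (let i = 0; i < count; i++) keep[i] = factory(i, i + 1);
  globalThis.gc();
  const after = process.memoryUsage().heapUsed;
  console.log(name, ((after - before) / count).toFixed(1), 'bytes per instance');
  return keep; // keep the instances alive until after the measurement
}

class A { constructor(a, b) { this.a = a; this.b = b; } }
function objFactory(a, b) { return { a, b }; }

measure('class  ', (a, b) => new A(a, b));
measure('literal', objFactory);
```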
Oops, my bad, my test was bad 😅
I checked the IR and the code and it's identical. Funnily enough, looking over the code I also found the __proto__: null heuristic: https://source.chromium.org/chromium/chromium/src/+/main:v8/src/runtime/runtime-literals.cc;l=384-391;bpv=1;bpt=1?q=createobjectliteral
And because of the dictionary mode of __proto__: null, in this micro-benchmark, https://github.com/RafaelGSS/nodejs-bench-operations/blob/main/v20/v20_0_0/deleting-properties.md, it is almost 4x faster to delete properties from a __proto__: null object than from a plain object.
I tried to replace timerListMap in timers.js with SafeMap to avoid the delete keyword, and it was way slower than __proto__: null.
Maybe we can search for places where we use SafeMap and try to use __proto__: null instead.
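A rough sketch of how such a comparison could look (this is neither the linked benchmark nor the actual timers.js code; a plain Map stands in for SafeMap, which is essentially Node's primordial-protected Map):

```js
const { performance } = require('node:perf_hooks');

function run(name, createStore, setKey, deleteKey, n = 1e6) {
  const store = createStore();
  const start = performance.now();
  for (let i = 0; i < n; i++) {
    const key = 'k' + (i % 1000);
    setKey(store, key, i);
    deleteKey(store, key);
  }
  console.log(name, (performance.now() - start).toFixed(2), 'ms');
}

// Null-prototype object in dictionary mode.
run('{ __proto__: null }', () => ({ __proto__: null }),
    (s, k, v) => { s[k] = v; },
    (s, k) => { delete s[k]; });

// Plain object for comparison.
run('plain object       ', () => ({}),
    (s, k, v) => { s[k] = v; },
    (s, k) => { delete s[k]; });

// Map as a stand-in for SafeMap.
run('Map                ', () => new Map(),
    (s, k, v) => { s.set(k, v); },
    (s, k) => { s.delete(k); });
```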
Nice finding.
@mcollina Did you know that dictionary mode makes deleting attributes faster? I know that we try to avoid deleting keys because of the perf implications, but maybe this is something that should be spread in Adventures in Nodeland?
It's a little bit more complicated than that. If an object is in dictionary mode, accessing all of its keys is slower in optimized functions. Deleting keys from the dictionary does not cause further slowdowns.
I don't know if this is a pattern but:
All of these PRs improved performance by changing object creation to an instantiation of a class. This may not be the primary reason for the third PR, but I think it was the main reason for the first two.
Should we look for more patterns like this?