Potential small performance optimizations

dead-claudia commented 7 years ago

Edit: My benchmark memories are apparently out of date...

I haven't actually profiled this, but I thought I'd take a gander through a few things.

~~In this function, it's much faster to use const out = arr.slice(), since engines provide optimized code gen for it.~~
~~In this function and this function, see if starting with an empty array literal is faster. (For some reason, engines are often faster with that than with a pre-allocated array.)~~
Consider using k != null ? k.key : undefined, etc. instead of k && k.key. Engines struggle to optimize the latter, especially when it's not like if (x) ....
Be careful about polymorphism. That will come back to bite you very quickly performance-wise if you're not careful, and the library has a very large amount of it, largely by necessity.
If it's super simple like this, this, or this, you might as well just inline it. The function call overhead, especially when dealing with higher order functions, could screw up the engine's optimizer, giving you megamorphic perf hits earlier than what you'd like.
Avoid closures where you can. They get deadly in a hurry, and no JS engine is quite as smart and magical as LuaJIT is when it comes to optimizing higher order functions.

Also, have you considered using class Map { ... } for hamt.Map? It might smooth out your code a little by moving the class aliasing boilerplate out of the functional API stuff.

dead-claudia commented 7 years ago

To clarify 5, not everything needs inlined, and in particular, manual inlining only really helps in cases that are already highly polymorphous. Also, it only makes sense if the intent is just as clear after the inlining (like with constant).

wishfoundry commented 7 years ago

in my experience tuning down https://github.com/rrbit-org/lib-rrbit

Array#slice is almost never faster
the array literal hack is specific to safari
inlining should be avoided, large functions exceed caching mechanisms (but varies on engine)
avoiding an extra closure context by inlining is usually worth it

dead-claudia commented 7 years ago

@wishfoundry

Array#slice is almost never faster

the array literal hack is specific to safari

Now that I'm running a few benchmarks, I stand corrected. (It's still negligible in practice, and I do recall V8 used to be a little more even.)

inlining should be avoided, large functions exceed caching some mechanisms(but varies on engine)

avoiding an extra closure context by inlining is usually worth it

That's true with smaller functions, especially ones that just type-check (the engine normally inlines them), but in some megamorphic contexts, I've gotten some serious performance gains from inlining manually, because the engine didn't think to inline a megamorphic call, but it could get better type information after inlining.

For example, this library I managed to get up to about 50% of Lodash's _.matches in speed in Node.js, while doing about 3-4x the work. In particular, it checks numerous types and does out-of-order matching for maps and sets, which complicated things significantly. (I had to also do some other optimizations to avoid pathological runtime comparison of them, and I had to find clever ways to avoid type checks to speed up the common case of primitives.)

mattbierner commented 7 years ago

I tried a few of these but didn't see too much a difference in artificial benchmarks. Still I've gone ahead and removed constant (https://github.com/mattbierner/hamt/commit/027c8f852dfd1c9032d59bf2b471e0034e1c17da) since it is called for every set and modify and delete operation, and using nothing was sort of ugly anyways.

Please submit a PR if you think there are further micro-optimizations opportunities. I'm hesitant to inline much else however unless it results in a significant performance boost.

wishfoundry commented 7 years ago

@mattbierner what are you using to test performance?

mjbvz commented 7 years ago

I put together some simple benchmarks a while back: https://github.com/mattbierner/js-hashtrie-benchmark They only cover node and are completely artificial however

dead-claudia commented 7 years ago

@mjbvz When I looked at the posted results, I noticed they're pretty out of date. A few things of note:

Immutable.js wasn't out for that long in the last results post.
Mori is tied to ClojureScript's performance, and they're always trying to optimize things.
V8 has completely swapped their compiler pipeline since the last update, so most of the performance curves have dramatically changed. In particular, try/catch and most other notoriously bad things to use no longer deopt, and temporary objects are now sometimes elided, even as return values.

mattbierner commented 7 years ago

I've rerun the benchmarks with the most recent versions of the libs and posted the results. Some nice perf gains across the board on node 8.6, although I suspect upgrading the test machine from a 2009 laptop also helped matters

mattbierner / hamt

Potential small performance optimizations #27