ronomon / hash-table

Fast, reliable cuckoo hash table for Node.js.
MIT License
301 stars 12 forks source link

Add browser support? #1

Open josephrocca opened 5 years ago

josephrocca commented 5 years ago

I've recently been having quite a bit of trouble with the browser's built-in Map:

So I went hunting and found your project - it looks great! Is there any interest in making it browser-compatible by detecting if the code is running in a browser and switching out Buffer for ArrayBuffer? Or even just using feross's browser-based buffer polyfill? https://github.com/feross/buffer

For crypto.randomBytes, there's this isomorphic lib: https://www.npmjs.com/package/randombytes

Edit: Here's a working proof of concept: https://gist.github.com/josephrocca/019f091e5f83e410533caeb6f371200e

All I did was:

npm install buffer
npm install randombytes

Then, in index.js, swap require('crypto').randomBytes for require('randombytes'), and then just:

npm install -g browserify
browserify index.js -o HashTable.mjs

I haven't used browserify before so that linked script pulls HashTable and Buffer out of the module in a weird way... but it works! (see bottom of HashTable.mjs)

jorangreef commented 5 years ago

Thanks @josephrocca

Firstly, the benchmark in Chromium 9348 is measuring much more than just map.set(), it's also measuring Math.floor(), Math.random() and string allocations. That's probably the reason for the discrepancy between Chrome and Firefox, e.g. GC kicking in at different times because of all the allocations, a potentially different pseudo-random algorithm etc. I would recommend at least changing your benchmark to pre-allocate strings upfront, and then using a smaller batch of keys against multiple map instances. You might find Chrome's map.set() is actually even faster than Firefox with these things out the way.

Secondly, regarding getting @ronomon/hash-table in the browser:

You should be able to test for performance regressions (if any) by swapping in ArrayBuffer and running benchmark.js (and also running test.js for good measure), just in Node itself.

That said, it would be good to support @ronomon/hash-table in the browser (or at least make it one step easier), and to fall back to ArrayBuffer when Buffer is not present, but none of this should add dependencies.

josephrocca commented 5 years ago

Thanks for the tip about the 9348 bug! Seems like you're right about it having to do with strings/GC - if I take away the toString(16) then the performance becomes fairly reasonable (relative to Rust) - and about twice as fast as Firefox. Very strange that V8 is taking 29 seconds where firefox takes 2 - seems like that's bug-level performance degradation, unless it's a hard trade-off they've had to make. From your comments you're evidently far more informed than me about ArrayBuffer stuff so I'm sorry that I probably won't be too much help with this! Will let you know if, in playing around with this, I come up with anything useful or interesting - currently trying to rewrite parts of a browser project (with max-old-space-size set to system memory amount) to include that prototype and see if I can get V8 to not crash when it gets up to dozens of GB.

jorangreef commented 5 years ago

Anytime! It would be interesting to remove the actual map.set() from the 9348 benchmark to see how the browsers compare on the extra overhead.

Will let you know if, in playing around with this, I come up with anything useful or interesting

Please do, that will be appreciated.

see if I can get V8 to not crash when it gets up to dozens of GB.

Are you testing multiple large ArrayBuffer instances? That might help to avoid crashes. For example, I think a V8 string used to take a minimum of 24 bytes, regardless of what's stored in it.

jorangreef commented 5 years ago

Very strange that V8 is taking 29 seconds where firefox takes 2 - seems like that's bug-level performance degradation, unless it's a hard trade-off they've had to make.

It could also just be a policy difference. But it would definitely be good to figure out exactly what it is.

josephrocca commented 5 years ago

If I remove map.set() firefox takes about 600ms per million, while chrome takes about 1000ms per million. Doesn't seem to change regardless of how many iterations, which makes sense I guess since the engines know they can clean up the string at the end of each iteration of the while loop.

Are you testing multiple large ArrayBuffer instances?

No, I've very naively just tried to create millions of Map objects (each with thousands to hundreds of thousands of keys) and hoped that V8 would sort everything out for me. I'm a mere web developer šŸ˜… Starting to learn more about lower-level memory stuff and learning Rust, so I'll probably submit less-noobish bug reports to V8 crbug over time.

jorangreef commented 5 years ago

learning Rust, so I'll probably submit less-noobish bug reports to V8 crbug over time.

Rust is cool, https://ziglang.org is also worth checking out. Zig's error handling sets are awesome. It strikes an almost perfect balance. It's closer to C than Rust, which is closer to C++.

I'm also a mere web developer and I don't think my own noobish bug reports ever get better over time. šŸ˜… Always worth submitting a crbug!

josephrocca commented 5 years ago

Thanks! I will definitely check out Zig :)