brianloveswords / buffer-crc32

A pure javascript CRC32 algorithm that plays nice with binary data
MIT License

Add benchmark comparing performance of this package with node:zlib.crc32 #36

Open thejoshwolfe opened 1 month ago

thejoshwolfe commented 1 month ago

Mentioned in https://github.com/brianloveswords/buffer-crc32/issues/34#issuecomment-2424754251

Here's my manually formatted table of data:

 few iterations, small buffer, native x   187,828 ops/sec @   5μs/op ± 155.19% (min: 472ns, max:  40μs)
 few iterations, small buffer, js     x    77,053 ops/sec @  12μs/op ± 120.51% (min: 891ns, max:  63μs)
-------------------------------------------------------------------------------------------------------
 few iterations, large buffer, native x    46,038 ops/sec @  21μs/op ±  17.43% (min:  19μs, max:  37μs)
 few iterations, large buffer, js     x     3,024 ops/sec @ 330μs/op ±  92.03% (min: 179μs, max:   1ms)
-------------------------------------------------------------------------------------------------------
many iterations, small buffer, native x 7,246,376 ops/sec @ 138ns/op ±   4.79% (min:  89ns, max:  48μs)
many iterations, small buffer, js     x 9,803,921 ops/sec @ 102ns/op ±  18.42% (min:  57ns, max: 182μs)
-------------------------------------------------------------------------------------------------------
many iterations, large buffer, native x    54,297 ops/sec @  18μs/op
many iterations, large buffer, js     x     5,673 ops/sec @ 176μs/op

Here's what the terminal colors look like:

[image: screenshot of the benchmark output with terminal colors]

So it looks like the JS<->C++ switching cost is most noticeable with many small inputs (many iterations, small buffer), where the JS implementation comes out ahead once the JIT has had time to optimize the code. I'm not totally sure the JIT didn't notice that the input data was the same every time and decide to return a constant; that's pretty difficult to tell. But I think the more important result is that the native code outperforms the JS code in most cases. In the many iterations, large buffer case, which I would argue is the situation where performance is most critical, the improvement is about an order of magnitude.
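To make the comparison easier to check, here is a small sketch that computes the native-to-JS ratio for each case. The ops/sec figures are copied from the table above; the `results` and `speedups` names are just illustrative. A ratio above 1 means native is faster, and below 1 means the JS implementation is faster.

```javascript
// ops/sec figures copied from the table above, as [native, js] pairs.
const results = {
  'few iterations, small buffer': [187828, 77053],
  'few iterations, large buffer': [46038, 3024],
  'many iterations, small buffer': [7246376, 9803921],
  'many iterations, large buffer': [54297, 5673],
};

// Ratio > 1: native is faster. Ratio < 1: the JS implementation is faster.
function speedups(table) {
  const out = {};
  for (const [name, [native, js]] of Object.entries(table)) {
    out[name] = native / js;
  }
  return out;
}

const ratios = speedups(results);
for (const [name, ratio] of Object.entries(ratios)) {
  console.log(`${name}: native/js = ${ratio.toFixed(2)}x`);
}
```

With these numbers, the many iterations, large buffer ratio comes out a little under 10, and the many iterations, small buffer case is the only one where the ratio drops below 1.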

This uses micro-bmark (https://github.com/paulmillr/micro-bmark), which seemed suitable for this use case. I'm using Node.js v20.15.1.
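For anyone who wants to poke at the constant-input concern without installing anything, here is a rough dependency-free sketch. It is not the benchmark in this PR and does not use micro-bmark. The names `crc32js` and `timeIt` are hypothetical, `crc32js` is a generic table-driven CRC-32 standing in for the JS side (not buffer-crc32's actual source), and the buffer sizes and iteration counts are arbitrary assumptions. It changes one byte of the input on every call and accumulates the results, which should make it harder for the JIT to treat the call as a constant.

```javascript
const zlib = require('node:zlib');

// Table-driven CRC-32 (reflected polynomial 0xEDB88320). Hypothetical stand-in
// for the JS side of the comparison, not buffer-crc32's actual source.
const TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) c = (c & 1) ? (0xedb88320 ^ (c >>> 1)) : (c >>> 1);
  TABLE[n] = c >>> 0;
}

function crc32js(buf) {
  let crc = 0xffffffff;
  for (let i = 0; i < buf.length; i++) {
    crc = TABLE[(crc ^ buf[i]) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Time `fn` over `iterations` calls, changing one byte per call so the input
// is never identical between consecutive iterations.
// Returns the average nanoseconds per call.
function timeIt(fn, buf, iterations) {
  let sink = 0; // accumulate results so the calls are not dead code
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    buf[i % buf.length] = i & 0xff;
    sink ^= fn(buf);
  }
  const elapsed = process.hrtime.bigint() - start;
  return { nsPerOp: Number(elapsed) / iterations, sink: sink >>> 0 };
}

// Buffer sizes and iteration counts are illustrative assumptions.
const cases = [
  ['small', Buffer.alloc(16), 100000],
  ['large', Buffer.alloc(64 * 1024), 500],
];

const candidates = { js: crc32js };
// zlib.crc32 only exists on newer Node.js releases, so guard for it.
if (typeof zlib.crc32 === 'function') candidates.native = zlib.crc32;

for (const [label, buf, n] of cases) {
  for (const [name, fn] of Object.entries(candidates)) {
    const { nsPerOp } = timeIt(fn, Buffer.from(buf), n);
    console.log(`${label} buffer, ${name}: ${nsPerOp.toFixed(0)} ns/op`);
  }
}
```

The per-call byte write adds a small fixed overhead to both candidates, so the absolute numbers are not comparable with the table above; only the relative ordering is meaningful.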