**Open** · ronag opened this issue 1 month ago
I would assume that `Buffer.set` and `Buffer.asciiWrite` should perform roughly the same (they are both essentially a `memcpy`). However, this is not the case.
Given that `asciiWrite`'s speed stays roughly constant even as the string size grows, I would assume that most of its time goes to call overhead.
```
cpu: Apple M2 Pro
runtime: node v21.6.0 (arm64-darwin)

benchmark      time (avg)             (min … max)       p75       p99      p999
------------------------------------------------- -----------------------------
• 8
------------------------------------------------- -----------------------------
asciiWrite  34.29 ns/iter    (31.17 ns … 493 ns)  34.77 ns  40.71 ns  70.13 ns
loopWrite   10.27 ns/iter   (9.28 ns … 1'906 ns)  10.11 ns  14.97 ns  21.99 ns
bufWrite    13.31 ns/iter  (11.47 ns … 5'753 ns)  12.43 ns  15.65 ns  49.74 ns

summary for 8
  loopWrite
   1.3x faster than bufWrite
   3.34x faster than asciiWrite

• 16
------------------------------------------------- -----------------------------
asciiWrite  33.56 ns/iter    (31.35 ns … 160 ns)  33.77 ns  39.75 ns  66.26 ns
loopWrite   20.99 ns/iter  (18.17 ns … 6'464 ns)  19.08 ns  26.37 ns   57.6 ns
bufWrite     13.7 ns/iter     (12 ns … 7'717 ns)  12.92 ns  16.48 ns     34 ns

summary for 16
  bufWrite
   1.53x faster than loopWrite
   2.45x faster than asciiWrite

• 32
------------------------------------------------- -----------------------------
asciiWrite  37.25 ns/iter    (34.87 ns … 221 ns)  37.62 ns  43.68 ns  79.14 ns
loopWrite   92.46 ns/iter  (75.13 ns … 9'072 ns)  82.25 ns    228 ns  1'358 ns
bufWrite     13.6 ns/iter    (12.57 ns … 473 ns)  13.61 ns  17.01 ns  32.76 ns

summary for 32
  bufWrite
   2.74x faster than asciiWrite
   6.8x faster than loopWrite

• 64
------------------------------------------------- -----------------------------
asciiWrite  37.03 ns/iter  (33.41 ns … 4'565 ns)  37.03 ns  43.38 ns  86.69 ns
loopWrite     166 ns/iter    (150 ns … 2'707 ns)    165 ns    183 ns  2'318 ns
bufWrite    12.52 ns/iter    (11.72 ns … 554 ns)  12.63 ns  15.83 ns  23.99 ns

summary for 64
  bufWrite
   2.96x faster than asciiWrite
   13.25x faster than loopWrite
```
```js
import { bench, group, run } from 'mitata'

const BUF = Buffer.allocUnsafe(64).fill(88)
const BUF_BUF = new Array(64).fill(null)

function asciiWrite(str) {
  BUF.asciiWrite(str, 0)
}

function bufWrite(str, index) {
  BUF.set((BUF_BUF[index] ??= Buffer.from(str)))
}

function loopWrite(src) {
  for (let n = 0; n < src.length; n++) {
    BUF[n] = src.charCodeAt(n)
  }
}

const str8 = '01234567'
const str16 = '0123456789abcdef'
const str32 = '0123456789abcdef'.repeat(2)
const str64 = '0123456789abcdef'.repeat(4)

group('8', () => {
  bench('asciiWrite', () => asciiWrite(str8))
  bench('loopWrite', () => loopWrite(str8))
  bench('bufWrite', () => bufWrite(str8, 1))
})

group('16', () => {
  bench('asciiWrite', () => asciiWrite(str16))
  bench('loopWrite', () => loopWrite(str16))
  bench('bufWrite', () => bufWrite(str16, 2))
})

group('32', () => {
  bench('asciiWrite', () => asciiWrite(str32))
  bench('loopWrite', () => loopWrite(str32))
  bench('bufWrite', () => bufWrite(str32, 3))
})

group('64', () => {
  bench('asciiWrite', () => asciiWrite(str64))
  bench('loopWrite', () => loopWrite(str64))
  bench('bufWrite', () => bufWrite(str64, 4))
})

await run()
```
Refs: https://github.com/nodejs/performance/issues/168