pinojs / sonic-boom

Extremely fast utf8 only stream implementation
MIT License
261 stars 41 forks source link

.write(Buffer) support #174

Closed AVVS closed 11 months ago

AVVS commented 1 year ago

mashed up quick support for .write(chunk: Buffer), works rather well as long as input is a Buffer, in case it is a utf8 string performance degradation is quite bad. See #173 for why it might be needed

if raw buffer is something that is worth exploring further then I'd be happy to put more effort into having fast paths for both types of encoding, probably by preselecting mode of operation at the time of SonicBoom instantiation

benchSonic*1000: 2.189s
benchSonicSync*1000: 8.190s
benchSonic4k*1000: 2.147s
benchSonicSync4k*1000: 1.827s
benchCore*1000: 2.561s
benchConsole*1000: 4.835s
benchSonicBuf*1000: 840.388ms
benchSonicSyncBuf*1000: 6.882s
benchSonic4kBuf*1000: 824.605ms
benchSonicSync4kBuf*1000: 481.976ms
benchCoreBuf*1000: 824.466ms
benchSonic*1000: 2.105s
benchSonicSync*1000: 8.275s
benchSonic4k*1000: 2.179s
benchSonicSync4k*1000: 1.811s
benchCore*1000: 2.220s
benchConsole*1000: 5.287s
benchSonicBuf*1000: 856.635ms
benchSonicSyncBuf*1000: 6.852s
benchSonic4kBuf*1000: 849.188ms
benchSonicSync4kBuf*1000: 480.347ms
benchCoreBuf*1000: 802.708ms
AVVS commented 1 year ago

benchSonicBuf*1000: 595.338ms

benchSonic4kBuf*1000: 590.645ms

benchCoreBuf*1000: 1.073s

There is at least a 2x speed up on my system with this vs core streams when writing buffers, plus proper error handling, so might be worth it, I'll see what the perf without merging small buffers is and whether it's possible to keep the exact performance profile for utf8

AVVS commented 1 year ago

Added contentMode setting. Not sure if the approach I've selected is the best from maintenance standpoint, but benchmarks are ok

benchSonic*1000: 428.597ms
benchSonicSync*1000: 6.054s
benchSonic4k*1000: 408.553ms
benchSonicSync4k*1000: 240.762ms

benchCore*1000: 2.785s
benchConsole*1000: 4.752s

benchSonicBuf*1000: 468.216ms
benchSonicSyncBuf*1000: 6.271s
benchSonic4kBuf*1000: 458.785ms
benchSonicSync4kBuf*1000: 343.907ms
benchCoreBuf*1000: 1.081s

benchSonic*1000: 407.736ms
benchSonicSync*1000: 6.123s
benchSonic4k*1000: 403.239ms
benchSonicSync4k*1000: 236.555ms

benchCore*1000: 2.768s
benchConsole*1000: 5.307s

benchSonicBuf*1000: 455.687ms
benchSonicSyncBuf*1000: 6.225s
benchSonic4kBuf*1000: 454.69ms
benchSonicSync4kBuf*1000: 334.973ms
benchCoreBuf*1000: 1.079s
AVVS commented 1 year ago

Master branch perf for comparison, so utf8 mode (default one) seem to be unaffected by changes

benchSonic*1000: 452.573ms
benchSonicSync*1000: 6.064s
benchSonic4k*1000: 429.246ms
benchSonicSync4k*1000: 258.853ms
benchCore*1000: 2.722s
benchConsole*1000: 4.725s

benchSonic*1000: 422.106ms
benchSonicSync*1000: 6.054s
benchSonic4k*1000: 426.696ms
benchSonicSync4k*1000: 253.348ms
benchCore*1000: 2.724s
benchConsole*1000: 5.305s

Only downside I see is that I have copied non-buffer related functions and changed logic there, so if any logic changes in the future then there are multiple places where this needs to be addressed

mcollina commented 1 year ago

@mmarchini could you take a look as well?

mmarchini commented 1 year ago

Sorry, was drowned in GitHub Notifications for months. I'll take a look at it this week.

AVVS commented 1 year ago

overall looks fine, I didn't review the tests yet. Main concern as you mentioned is code duplication, so it'll be up to the project to decide if taking the burden of having some logic implemented in two places is worth it for this feature.

As for the performance aspects of this, I can see concat being faster than multiple writes, although that'll be highly dependent on system specs. Can you share specs for the machine where you ran your benchmarks?

Chip Apple M1 Max Memory 64 GB

AVVS commented 1 year ago

Updates incoming, in terms of bench

Node 20 performs ~25% better, Node 18 ~ 10%

node 20.3.1

benchSonic*1000: 411.93ms
benchSonicSync*1000: 6.231s
benchSonic4k*1000: 403.094ms
benchSonicSync4k*1000: 231.349ms
benchCore*1000: 1.900s
benchConsole*1000: 6.654s
benchSonicBuf*1000: 479.036ms
benchSonicSyncBuf*1000: 6.643s
benchSonic4kBuf*1000: 479.021ms
benchSonicSync4kBuf*1000: 338.352ms
benchCoreBuf*1000: 647.737ms
benchSonic*1000: 402.104ms
benchSonicSync*1000: 6.381s
benchSonic4k*1000: 411.276ms
benchSonicSync4k*1000: 230.133ms
benchCore*1000: 1.879s
benchConsole*1000: 7.073s
benchSonicBuf*1000: 488.614ms
benchSonicSyncBuf*1000: 6.739s
benchSonic4kBuf*1000: 468.49ms
benchSonicSync4kBuf*1000: 346.683ms
benchCoreBuf*1000: 657.002ms

node 18.16.0

benchSonic*1000: 435.906ms
benchSonicSync*1000: 6.272s
benchSonic4k*1000: 437.36ms
benchSonicSync4k*1000: 252.272ms
benchCore*1000: 2.252s
benchConsole*1000: 4.260s
benchSonicBuf*1000: 799.471ms
benchSonicSyncBuf*1000: 6.771s
benchSonic4kBuf*1000: 796.44ms
benchSonicSync4kBuf*1000: 645.533ms
benchCoreBuf*1000: 893.629ms
benchSonic*1000: 418.545ms
benchSonicSync*1000: 6.380s
benchSonic4k*1000: 417.882ms
benchSonicSync4k*1000: 245.55ms
benchCore*1000: 2.214s
benchConsole*1000: 5.374s
benchSonicBuf*1000: 787.246ms
benchSonicSyncBuf*1000: 6.849s
benchSonic4kBuf*1000: 822ms
benchSonicSync4kBuf*1000: 638.927ms
benchCoreBuf*1000: 888.464ms
AVVS commented 1 year ago

@mmarchini please take a look if the changes are sufficient based on your comments

mcollina commented 1 year ago

wow, quite a speedup also on writing things compared to core.

AVVS commented 1 year ago

Would be interesting to bench this with io_uring support as fs operations should be much faster on linux

Some weirdness is going on with win/node-14 combo during dep install

mcollina commented 1 year ago

Some weirdness is going on with win/node-14 combo during dep install

GHA is broken on that platform, just exclude it in the workflow file.

mmarchini commented 1 year ago

I'll try to take a look over the weekend :)

Would be interesting to bench this with io_uring support as fs operations should be much faster on linux

I have a Linux machine I can try it on. Will report back with the results

AVVS commented 1 year ago

anything I should do with this to move it forward? :)

mcollina commented 1 year ago

@mmarchini could you take a quick look? I would prefer if this did not break you. Otherwise I'm ok to ship it in the coming weeks.