tinylibs / tinybench

🔎 A simple, tiny and lightweight benchmarking library!
MIT License

fix: accurate results and hooks fix #43

Closed Aslemammad closed 1 year ago

Aslemammad commented 1 year ago

Resolves #42 and #41

github-actions[bot] commented 1 year ago

size-limit report 📦

| Path | Size | Loading time (3g) | Running time (snapdragon) | Total time |
| --- | --- | --- | --- | --- |
| dist/index.js | 2.28 KB (-0.73% 🔽) | 46 ms (-0.73% 🔽) | 31 ms (+87.32% 🔺) | 77 ms |
| dist/index.cjs | 2.55 KB (-0.27% 🔽) | 52 ms (-0.27% 🔽) | 35 ms (+62.53% 🔺) | 86 ms |
Aslemammad commented 1 year ago

@3rd could you check whether this one produces more accurate results now?

Aslemammad commented 1 year ago

These are the results for me; it improved, but it's still not as accurate as mitata.

There's always a 60-70ns overhead. I tried removing the try/catch context and the if/elses, but that didn't help; the only thing that worked was falling back to performance.now and making hrtime optional.
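The fallback described above can be sketched roughly like this (an illustrative sketch, not tinybench's actual code; the names `now` and `measure` are made up for this example):

```typescript
// Prefer process.hrtime.bigint() when it exists (Node), otherwise fall back
// to performance.now() (browsers, or when hrtime is unavailable).
const hrtimeAvailable =
  typeof process !== "undefined" && typeof process.hrtime?.bigint === "function";

// Returns a timestamp in milliseconds regardless of the underlying clock.
const now: () => number = hrtimeAvailable
  ? () => Number(process.hrtime.bigint()) / 1e6 // nanoseconds -> milliseconds
  : () => performance.now();

// Time a single call of fn, in milliseconds.
function measure(fn: () => void): number {
  const start = now();
  fn();
  return now() - start;
}
```

The conversion keeps both code paths returning the same unit, so the rest of the benchmark logic doesn't need to know which timer was picked.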

@evanwashere Any guidance on this issue?

3rd commented 1 year ago

@Aslemammad great change, it's getting way closer now!

tseep should indeed do about 238,095,238 ops/second, but measuring that matters less than the difference between the function times, and it did improve there; the remaining error is about 10x (comparing tseep vs mitt).

This is what I get: [screenshot]

Aslemammad commented 1 year ago

@3rd Thank you so much for your helpful insights! I tried multiple solutions, but they didn't work; let's stick with this one and see if anyone can make it even faster later.

3rd commented 1 year ago

@Aslemammad I had time to do some exploration, and I think batching is the main way to stop internal overhead/optimizations from skewing the stats, with other mini-tricks added on top. Just adding support for async handlers or before/after hooks will skew the stats, even if it's only a few checks, so I had to split the task-running function into two different internal handlers: if the task is sync, there is no async overhead at all.
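The sync/async split plus batching described above might look something like this (a minimal sketch under my own naming, not benchmate's or tinybench's actual code):

```typescript
// A task is either synchronous or returns a Promise.
type Task = (() => void) | (() => Promise<void>);

// Synchronous runner: a tight loop with no await in sight, so sync tasks
// never pay any async machinery overhead. Timing the whole batch amortizes
// the timer-call cost across batchSize iterations.
function runSyncBatch(task: () => void, batchSize: number): number {
  const start = performance.now();
  for (let i = 0; i < batchSize; i++) task();
  return (performance.now() - start) / batchSize; // mean ms per call
}

// Asynchronous runner: only this path awaits.
async function runAsyncBatch(
  task: () => Promise<void>,
  batchSize: number,
): Promise<number> {
  const start = performance.now();
  for (let i = 0; i < batchSize; i++) await task();
  return (performance.now() - start) / batchSize;
}

// Decide once, outside the hot loop, which runner to use.
async function bench(task: Task, batchSize = 1000): Promise<number> {
  const probe = (task as () => unknown)();
  if (probe instanceof Promise) {
    await probe; // don't leave the probe's promise dangling
    return runAsyncBatch(task as () => Promise<void>, batchSize);
  }
  return runSyncBatch(task as () => void, batchSize);
}
```

The key design point is that the sync/async check happens once per benchmark, not once per iteration, so the measured loop body stays as small as possible.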

This is my current version: https://github.com/3rd/benchmate/blob/master/src/index.ts#L92

And I'm getting pretty close to mitata: [screenshots]

And the direct call being faster than tseep.emit is for sure the correct result, which I'm now getting consistently.

I probably have a lot of bugs, and I'm sure there's lots of precision loss / missed optimizations, but given a large enough iteration count and batch size it doesn't seem like a bad start at all.

I also noticed that the difference between using performance.now and hrtime is pretty small.

Aslemammad commented 1 year ago

Honestly, this made my day. Thank you so much @3rd! Great job

I see, yeah, I expected that async/await was creating a lot of the overhead, but I couldn't remove it!

If you've finalized your experiments (which already work fine), I'd love to see you contribute them to tinybench without much architecture change and without any breaking change, since they'd be really helpful. The current architecture is used in some production apps, and breaking it would not look good, but I'm sure you'll find a way to avoid those issues. I'm completely open to getting those changes into tinybench.

Great job @3rd.

3rd commented 1 year ago

@Aslemammad thanks, you too! Yep, I can make a PR to port the async/sync split. For batching, though, I think the benchmark-time approach can't work, only the sample-size one, and the batch size needs to be configurable. It could default to something like iterations / 100, but users would still have to adjust it manually for long-running tasks or for tasks with fewer than 100 iterations. I'll keep looking into it; I'm planning to build a new jsperf-like app this week.
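That default could be as simple as the following (a hypothetical helper, not anything in tinybench; the clamp for small iteration counts is my own assumption):

```typescript
// Derive a default batch size from the iteration count (iterations / 100),
// clamped to at least 1 so benchmarks with fewer than 100 iterations still
// run in batches of one instead of a batch size of zero.
function defaultBatchSize(iterations: number): number {
  return Math.max(1, Math.floor(iterations / 100));
}
```

Users who benchmark long-running tasks would still want to override this manually, as noted above.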

Aslemammad commented 1 year ago

@3rd I'm excited to see the PR and the app!