Closed by Aslemammad 1 year ago
Path | Size | Loading time (3g) | Running time (snapdragon) | Total time |
---|---|---|---|---|
dist/index.js | 2.28 KB (-0.73% 🔽) | 46 ms (-0.73% 🔽) | 31 ms (+87.32% 🔺) | 77 ms |
dist/index.cjs | 2.55 KB (-0.27% 🔽) | 52 ms (-0.27% 🔽) | 35 ms (+62.53% 🔺) | 86 ms |
@3rd could you check if this one produces more accurate results now?
These are my results; it improved, but it's still not as accurate as mitata.
There's always a 60-70 ns overhead. I tried removing the try/catch context and the if/else branches, but it didn't help; the only thing that worked was falling back to performance.now and making hrtime optional.
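The fallback described above could be sketched like this (a hypothetical helper, not tinybench's actual code): prefer `process.hrtime.bigint` when it exists, otherwise use `performance.now`, behind one function so the rest of the benchmark doesn't care which timer is active.

```typescript
// Sketch (assumption: not the real tinybench implementation): pick the
// highest-resolution timer available and expose it as a single function.
const hrtimeAvailable =
  typeof process !== "undefined" &&
  typeof process.hrtime?.bigint === "function";

// Returns a timestamp in milliseconds regardless of the backing timer.
const now: () => number = hrtimeAvailable
  ? () => Number(process.hrtime.bigint()) / 1e6 // hrtime reports nanoseconds
  : () => performance.now();

const start = now();
for (let i = 0; i < 1e6; i++); // trivial busy loop to have something to time
const elapsed = now() - start;
console.log(`elapsed: ${elapsed.toFixed(3)} ms`);
```

Making the choice once at module load, rather than per call, keeps the branch out of the hot timing path.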
@evanwashere Any guidance on this issue?
@Aslemammad great change, it's getting way closer now!
tseep should indeed do about 238,095,238 ops/second, but measuring that is less important than the relative difference between task times, and it did improve there; the remaining error is about 10x (comparing tseep vs mitt).
This is what I get:
@3rd Thank you so much for your helpful insights! I tried multiple solutions but they didn't work; let's stick with this one and see if anyone can make it even faster later.
@Aslemammad I had time to do some exploration, and I think batching is the main way to keep internal overhead and engine optimizations from skewing the stats, with other mini-tricks added on top. Even just adding support for async handlers or before/after hooks will skew the stats, even if it's only a few checks, so I had to split the task-running function into two different internal handlers: if the task is sync, there is no async overhead at all.
This is my current version: https://github.com/3rd/benchmate/blob/master/src/index.ts#L92
And I'm getting pretty close to mitata.
And the direct call being faster than tseep.emit is definitely the correct result, which I'm now getting consistently.
I probably have a lot of bugs, and I'm sure there's lots of precision loss / missed optimizations, but given a large enough iteration count and batch size it doesn't seem like a bad start at all.
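The two ideas above (batching plus a sync/async split) could be sketched roughly like this. This is a hypothetical simplification, not the linked benchmate code: each sample times one whole batch, so the timer-call overhead is amortized across `batchSize` invocations, and the sync path contains no `await`.

```typescript
// Sketch of batched sampling (assumption: simplified, not the real benchmate
// runner). Each sample times a batch and divides, amortizing timer overhead.
function measureSync(
  task: () => void,
  samples: number,
  batchSize: number,
): number[] {
  const perCallNs: number[] = [];
  for (let s = 0; s < samples; s++) {
    const start = performance.now();
    for (let i = 0; i < batchSize; i++) task();
    const elapsedMs = performance.now() - start;
    perCallNs.push((elapsedMs / batchSize) * 1e6); // ns per call
  }
  return perCallNs;
}

// Separate async variant so sync tasks never pay the await/microtask cost.
async function measureAsync(
  task: () => Promise<void>,
  samples: number,
  batchSize: number,
): Promise<number[]> {
  const perCallNs: number[] = [];
  for (let s = 0; s < samples; s++) {
    const start = performance.now();
    for (let i = 0; i < batchSize; i++) await task();
    const elapsedMs = performance.now() - start;
    perCallNs.push((elapsedMs / batchSize) * 1e6);
  }
  return perCallNs;
}

const times = measureSync(() => Math.sqrt(2), 10, 10_000);
times.sort((a, b) => a - b);
console.log(`median ns/call ≈ ${times[5].toFixed(1)}`);
```

Keeping the two runners as distinct functions, rather than one function with an `isAsync` branch inside the loop, is exactly the kind of split that keeps checks out of the measured region.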
Also noticed that the differences between using performance.now and hrtime are pretty small.
Honestly, this made my day. Thank you so much @3rd! Great job
I see, yeah, I expected that async/await was creating a lot of overhead, but I couldn't remove it!
Once you've finalized your experiments (which work completely fine now), I'd like to see you contribute them to tinybench without major architecture changes or any breaking changes, since they'd be really helpful. The current architecture is used in some production apps, and breaking it would not look good. But I'm sure you'll find a way to make that work without those issues. I'm completely open to getting those changes into tinybench.
Great job @3rd.
@Aslemammad thanks, you too! Yep, I can make a PR to port the async/sync split, but for batching I think the benchmark-time approach can't work, only sample size, and the batch size needs to be configurable. Maybe it could default to something like iterations / 100, but users would still have to adjust it manually for long-running tasks, or for those with fewer than 100 iterations. I'll keep looking into it; I'm planning on making a new jsperf-like app this week.
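The default suggested above could look something like this (a hypothetical helper name; the clamp handles the fewer-than-100-iterations case mentioned):

```typescript
// Hypothetical default from the discussion: batchSize ≈ iterations / 100,
// clamped to at least 1 so small iteration counts still form a valid batch.
function defaultBatchSize(iterations: number): number {
  return Math.max(1, Math.floor(iterations / 100));
}

console.log(defaultBatchSize(50_000)); // 500
console.log(defaultBatchSize(42)); // 1 (fewer than 100 iterations)
```

Users running long tasks would still override this manually, as noted above, since a large batch multiplies the per-sample wall time.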
@3rd I'm excited to see the PR and the app!
Resolves #42 and #41