sabiwara opened this issue 3 months ago
:wave: Heyo @sabiwara! Good, but apparently busy: I'm so sorry I missed this until now :sob: Looking at my calendar, this was 2 days before I flew off on an 11-day vacation, so... that's my excuse for missing it :sweat_smile:
Love the idea, and in fact I've thought about this many times myself/wanted it for similar workflows :grin: It's always a bit annoying to write an extra layer of tests that iterates over all of them.
So would love to have it! :rocket:
My question would be: How often would we run this? For pure functions we'd only need to run it once (during warmup) technically speaking.
But that doesn't cover "all" use cases. It could be: `assert_all_same: true` runs the check after every invocation (equivalent to `after_each`), while `assert_all_same: :once` (or `:scenario`) would only run it once.
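To make the two variants concrete, here's a quick sketch of how they might look from the user's side (the `assert_all_same` option doesn't exist yet, and `MySort` is just a placeholder):

```elixir
# Hypothetical usage of the proposed option; nothing here is implemented yet.
Benchee.run(
  %{
    "Enum.sort" => fn input -> Enum.sort(input) end,
    "my sort" => fn input -> MySort.sort(input) end
  },
  inputs: %{"shuffled" => Enum.shuffle(1..1000)},
  # check results after every invocation (after_each-style)...
  assert_all_same: true
  # ...or only once per scenario:
  # assert_all_same: :once
)
```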
Oof, I just realized that, due to the way benchee runs benchmarks (first everything from one benchmarking job, then everything from the next), this would be quite tough to implement (checking that all results are the same). Although... if we expect them all to be the same, each job would only ever need to carry one reference return value and, on each invocation, check that its result matches the stored one. So... that would probably still work :thinking:
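Roughly, that per-invocation check could be a minimal helper like this (a sketch only; the module, function, and store shape are all made up, not Benchee internals):

```elixir
# Sketch of the "each job carries one reference value" idea; all names are
# hypothetical, this is not Benchee internals.
defmodule SameResultCheck do
  # `store` maps an input name to the first result seen for it.
  def check!(store, input_name, result) do
    case Map.fetch(store, input_name) do
      # First invocation for this input: remember its result.
      :error ->
        Map.put(store, input_name, result)

      # Later invocations must match the stored reference value.
      {:ok, ^result} ->
        store

      {:ok, expected} ->
        raise "jobs disagree for input #{inspect(input_name)}: " <>
                "expected #{inspect(expected)}, got #{inspect(result)}"
    end
  end
end
```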
We could also run it in the same way as the `pre_check` option - it runs all functions once in the beginning. We could implement it as a new option for `pre_check` - like `pre_check: :assert_all_same` - these would run once before the benchmark even starts running, check that they all return the same result, and abort if not.
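Internally, that pre-check could boil down to something like this sketch (hypothetical names, not the actual `pre_check` implementation):

```elixir
# Run every job once for the given input and fail fast if results differ.
# Purely illustrative; module name and error message are made up.
defmodule PreCheckSketch do
  def assert_all_same!(jobs, input) do
    results = Enum.map(jobs, fn {name, fun} -> {name, fun.(input)} end)

    case Enum.uniq_by(results, fn {_name, result} -> result end) do
      [_single] -> :ok
      _several -> raise "benchmarking jobs returned different results: #{inspect(results)}"
    end
  end
end
```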
What do you think? Which sounds most useful to you? :eyes:
Again, sorry for missing this for so long :sweat_smile:
Hi @PragTob 👋
> I'm so sorry I missed this until now
No worries!
> So would love to have it! 🚀
Awesome! 💜
> We could implement it as a new option for `pre_check` - like `pre_check: :assert_all_same` - these would run once before the benchmark even starts running, check that they all return the same result, and abort if not.
This is exactly what I was going for; it sounds perfect.
> How often would we run this? For pure functions we'd only need to run it once (during warmup) technically speaking.
Exactly, this would only make sense for pure functions (if an implementation isn't consistent with itself, it can't be expected to be consistent with other implementations). Running just once as a pre-check seems like the right timing.
If we agree on `pre_check: :assert_all_same`, I'll try to send a PR next week or so (will also be AFK for a couple of days 😉).
Thanks a lot for considering this idea!
`pre_check: :assert_all_same` it is - thanks! Have a wonderful AFK time! :rocket:
Hi @PragTob, how are you doing?
**What**: an option like `assert_all_same: true` which would run all functions (for each input if using `inputs`) and ensure that results are equal for each input across functions.

**Context**: I've been mostly using Benchee to optimize pure functions in core, which means benchmarking alternative implementations. Failing early if one of my alternative implementations is not equivalent would improve the feedback loop, since I'm usually not writing tests yet during the benchmarking phase (but I often end up adding some `dbg` calls or manual checks).

I could write my own small wrapper (see the sketch below), but since it might be a common use case I went for suggesting it in Benchee itself.
I'd be happy to send a PR if there is interest.
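For illustration, such a wrapper could look roughly like this (the module name and error message are invented, and it assumes the `inputs` option is always given):

```elixir
# Hypothetical wrapper around Benchee.run/2: run every job once per input,
# abort if any result differs from the first job's, then benchmark as usual.
# Assumes :inputs is passed, so every job function takes one argument.
defmodule BencheeAssertSame do
  def run(jobs, opts) do
    for {input_name, input} <- Keyword.fetch!(opts, :inputs) do
      [{_name, expected} | rest] =
        Enum.map(jobs, fn {name, fun} -> {name, fun.(input)} end)

      for {name, result} <- rest, result != expected do
        raise "#{name} returned #{inspect(result)} for input " <>
                "#{inspect(input_name)}, expected #{inspect(expected)}"
      end
    end

    Benchee.run(jobs, opts)
  end
end
```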