bheisler / criterion.rs

Statistics-driven benchmarking library for Rust

allow --save-baseline to be used with --baseline #201

Open BurntSushi opened 6 years ago

BurntSushi commented 6 years ago

In my benchmarking workflow, I like to record every run I make under a specific name so that I can keep track of my results in a structured manner. What this means is that I don't really want to use Criterion's automatic baseline support---I'd like to manage baselines myself, explicitly.

For the most part, this works. I can save a baseline and then compare against it. However, I cannot both save a baseline and compare against a prior baseline. The point of doing this is that while I am interested in the comparison with the prior baseline, I may also be interested in a subsequent comparison against the current benchmark results.
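
Concretely, my workflow looks something like this (the baseline names are placeholders; the last command is the combination this issue is asking for, which Criterion currently rejects because --save-baseline and --baseline are mutually exclusive):

```
# Run the benchmarks and save the results under an explicit name.
cargo bench -- --save-baseline run-a

# Later: run them again and compare against that saved baseline.
cargo bench -- --baseline run-a

# What I'd like, but which is rejected today: compare against the old
# baseline *and* save the new results under their own name.
cargo bench -- --baseline run-a --save-baseline run-b
```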

From looking at the directory tree, I suspect I could probably fudge this by copying the base directory in each benchmark directory and giving it a name corresponding to my desired new baseline. But I'm sure that isn't guaranteed to always work. :-)
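
For the record, the fudge would look roughly like the sketch below. The target/criterion layout is undocumented, so this is a guess at the current on-disk structure rather than a supported interface (run-b is a placeholder name):

```
# Each benchmark directory appears to contain one subdirectory per
# baseline, with the default baseline stored under "base". Copying that
# directory under a new name *seems* to create a named baseline, but
# this relies on internals and may break in future Criterion versions.
find target/criterion -type d -name base | while read -r dir; do
    cp -r "$dir" "$(dirname "$dir")/run-b"
done
```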

Another possible solution to this problem is to permit the comparison of two benchmark outputs that have already been computed. That way, I'd do two benchmark runs to record their output, and then ask Criterion to simply compare those outputs without running any new benchmarks. My feeling, though, is that this doesn't quite fit with how Criterion works in the context of a benchmark harness, but I figured I'd throw it out there.

Thanks!

bheisler commented 6 years ago

Hey, thanks for the suggestion!

I just want to be sure I understand - you want to save one baseline, and also compare against another? I think that ought to be possible. It should be a matter of updating the baseline logic in analysis/mod.rs and changing the CLI code in lib.rs. I would suggest testing such a change thoroughly - something in the back of my mind bugs me about this idea but I can't figure out what it is.

BurntSushi commented 6 years ago

@bheisler Yeah that sounds about right!

something in the back of my mind bugs me about this idea but I can't figure out what it is.

Heh. I've been there. If experience is any guide, that instinct is more often right than wrong. :)

rgreenblatt commented 3 years ago

Hmm, I might implement this if there isn't currently another way to do this. Any updates?

bheisler commented 3 years ago

I still don't really have a design for this feature that I like, and I haven't considered it a priority. I'd be open to pull requests, but would ask for people to talk to me about a design first.

irevoire commented 3 years ago

Hello @bheisler, I would like to implement this in the following weeks. Are you still open to pull requests? And is @BurntSushi's initial idea still OK with you?

It should be a matter of updating the baseline logic in analysis/mod.rs and changing the CLI code in lib.rs.

Another idea would be to create a new binary that only does the comparison between baselines, like critcmp, but by doing it in Criterion we would be able to generate the HTML report.

bheisler commented 3 years ago

Yeah, if you know how you want this feature to work you're welcome to send a pull request.

irevoire commented 3 years ago

My first guess, before realizing the feature did not exist, was to run cargo bench --benches xxxx -- --baseline yyyy --load-baseline zzzz, with yyyy and zzzz being already-computed baselines from previous runs of the benchmarks I wanted to compare, without running the benchmarks at all.
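
Spelled out (my_bench, yyyy, and zzzz are placeholders), the idea would be something like the following; the flag combination in the last command is the proposed behavior, not something that worked at the time:

```
# Run the benchmarks twice, saving each run under its own name.
cargo bench --bench my_bench -- --save-baseline yyyy
# ... make some changes ...
cargo bench --bench my_bench -- --save-baseline zzzz

# Proposed: load zzzz as the "current" results and compare them against
# yyyy, without actually running any benchmarks.
cargo bench --bench my_bench -- --load-baseline zzzz --baseline yyyy
```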

But the more I think about this, the more I think we should provide a separate lightweight binary for this specific use case.

BurntSushi commented 3 years ago

But the more I think about this, the more I think we should provide a separate lightweight binary for this specific use case.

That's what critcmp is, I think. It's not an official part of the Criterion project, and it depends on an undocumented data format, but it's been fine in practice.
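
Its workflow is short: it reads the baseline data that Criterion saves under target/criterion (the baseline names below are placeholders):

```
# Save two named baselines with Criterion...
cargo bench -- --save-baseline before
# ... make some changes ...
cargo bench -- --save-baseline after

# ...then compare them without re-running anything (no HTML report).
critcmp before after
```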

irevoire commented 3 years ago

Yeah, but I would like to have the fancy HTML report 🥺