Hi. I'm not familiar with winterfell at all, but I was having the same problem and found this issue on Google. I don't know whether my (rather wacky) solution helps here, but I managed to run a criterion benchmark of private functions on stable Rust by concealing it as a regular `#[test]`, like this:
```rust
fn internal_function() { /* ... */ }

#[cfg(test)]
pub mod tests {
    use super::*;
    use criterion::{black_box, Criterion};
    use std::time::Duration;

    fn benchmarks(c: &mut Criterion) {
        c.bench_function("bench_internal_function", |b| {
            b.iter(|| black_box(internal_function()))
        });
    }

    // Run with:
    // `cargo test --profile bench -j1 -- --nocapture bench -- <benchmark_filter>`
    // This workaround allows benchmarking private interfaces with `criterion` on stable Rust.
    #[test]
    fn bench() {
        // Collect CLI arguments.
        let args: Vec<String> = std::env::args().collect();

        // Interpret the argument sequence `[ "...bench", "--", "<filter>" ]` as a
        // trigger and extract `<filter>`.
        let filter = args
            .windows(3)
            .filter(|p| p[0].ends_with("bench") && p[1] == "--")
            .map(|p| p[2].clone())
            .next();

        // Return successfully (the test simply passes) if the trigger wasn't found.
        let filter = match filter {
            None => return,
            Some(f) => f,
        };

        // Optional `--profile-time <seconds>` argument.
        let profile_time = args
            .windows(2)
            .filter(|p| p[0] == "--profile-time")
            .map(|p| p[1].as_str())
            .next();

        // TODO: Adopt `Criterion::configure_from` when it lands upstream
        let mut c = Criterion::default()
            .with_output_color(true)
            .without_plots()
            .with_filter(filter)
            .warm_up_time(Duration::from_secs_f32(0.5))
            .measurement_time(Duration::from_secs_f32(0.5))
            .profile_time(profile_time.map(|s| Duration::from_secs_f32(s.parse().unwrap())));

        benchmarks(&mut c);

        Criterion::default().final_summary();
        // TODO: Move this function to a macro
    }
}
```
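For contrast, here is a sketch of the standard `criterion` setup this workaround replaces (crate and function names are illustrative). Because it lives in `benches/` and is compiled as a separate crate (registered with `harness = false` in `Cargo.toml`), it can only reach the crate's public API:

```rust
// benches/my_bench.rs — a separate crate, so only `my_crate`'s public API is visible.
use criterion::{criterion_group, criterion_main, Criterion};

fn benchmarks(c: &mut Criterion) {
    c.bench_function("bench_public_function", |b| {
        b.iter(|| my_crate::public_function())
    });
}

criterion_group!(benches, benchmarks);
criterion_main!(benches);
```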
---

We currently use `criterion` for benchmarking. The main limitation of `criterion` is that only a crate's public API can be benchmarked. In contrast, the standard benchmark harness (provided by the unstable `test` crate) allows you to mark any function with `#[bench]` (as you would a `#[test]`), allowing you to benchmark internal functions. I believe this would be so valuable that it would outweigh the cons of using `test`. While working on #247, we were interested in knowing the specific timing of internal functions (e.g. here). I ended up manually inserting timing statements in the code, running the right test, and inspecting the output. IMO, this shows how limiting using `criterion` really is.
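For comparison, here is a minimal sketch of what such a benchmark looks like with the unstable `test` harness (the function names are illustrative, and it requires nightly):

```rust
#![feature(test)] // nightly-only
extern crate test;

fn internal_function() { /* ... */ }

#[cfg(test)]
mod benches {
    use super::*;
    use test::Bencher;

    // `#[bench]` can be applied to any function in the crate, public or not.
    #[bench]
    fn bench_internal_function(b: &mut Bencher) {
        b.iter(|| internal_function());
    }
}
```

These run with `cargo +nightly bench`.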
Note that `criterion` has a feature to allow benchmarking internal functions similar to `#[bench]`, which depends on the `custom_test_frameworks` nightly feature. However, the tracking issue for `custom_test_frameworks` has been closed due to inactivity, so I personally would stay away from it if/until that changes.

### Pros of using `test` over `criterion`

- Any function in the crate can be marked with `#[bench]`, so internal (non-`pub`) functions can be benchmarked directly
### Cons of using `test` over `criterion`

- `criterion` gives you the performance change over the last run of the benchmark
- `criterion` pre-populates the cache before running a benchmark (not sure whether or not `test` does that too, but at least it's not advertised in the benchmark output)
- `test` is still unstable

### Summary
Although there are more cons than pros, I believe the ability to benchmark internal functions far outweighs any of the cons (as explained earlier). We can deal with the dependency on nightly by adding a `benchmarking` feature, which controls the use of nightly.
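As a sketch of that last point (the `benchmarking` feature name comes from above; the module layout is illustrative), the nightly-only pieces could be gated so that stable builds never see them:

```rust
// lib.rs — enable the unstable `test` feature only when the `benchmarking`
// cargo feature is active (Cargo.toml would declare `benchmarking = []`
// under [features]).
#![cfg_attr(feature = "benchmarking", feature(test))]

#[cfg(feature = "benchmarking")]
extern crate test;

#[cfg(all(test, feature = "benchmarking"))]
mod benches {
    use test::Bencher;

    #[bench]
    fn bench_example(b: &mut Bencher) {
        b.iter(|| (0..1000u64).sum::<u64>());
    }
}
```

Benchmarks would then run with `cargo +nightly bench --features benchmarking`, while stable `cargo test` and `cargo build` would be unaffected.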