psychon closed this 3 years ago
Excellent question about the best pixel values to use. There are possible theoretical problems with all zeros (there may be specialized machine instructions for zeroing memory) although it doesn't look like any of those issues would be at play here. I would certainly be curious to see if different values change the measurement at all, and in general I would probably try to use heterogeneous data, maybe generated randomly with a fixed seed?
Okay, I got a random (ha!) random number generator from Wikipedia (an LCG). I am not sure whether this makes a difference. It certainly does for Grayscale (+30%), but the other changes might be random "bad luck". I saw similar differences while trying to make the code faster. I guess this could be things like "the inner loop fits into a cache line or not" or something like that. Dunno really, but "random changes" can make this code faster or slower.
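For reference, a fixed-seed LCG fill could look roughly like the sketch below. The constants are an assumption (the "Numerical Recipes" row of the Wikipedia LCG table, not necessarily the ones used in this PR), and the `seed` parameter and the exact signature are hypothetical, chosen only for illustration.

```rust
/// LCG parameters, assumed for illustration ("Numerical Recipes" row of the
/// Wikipedia LCG table); the benchmark in this PR may use different ones.
const A: u64 = 1_664_525;
const C: u64 = 1_013_904_223;
const M: u64 = 1 << 32;

/// Fill `data` with pseudo-random bytes. The same `seed` always yields the
/// same bytes, so separate benchmark runs see identical input.
fn fill_random(data: &mut [u8], seed: u64) {
    let mut state = seed % M;
    for byte in data.iter_mut() {
        // state < 2^32 and A < 2^21, so the product fits in a u64.
        state = (A * state + C) % M;
        // Low byte of the state.
        *byte = state as u8;
    }
}
```

Calling `fill_random(&mut buffer, 42)` before each benchmark would give heterogeneous but reproducible pixel data, which is the property that matters for comparing runs.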
Edit: Whoops, the above was generated with the changes from the cairo-big-endian branch merged. Sorry!
Edit: New results:
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.
make_image_2160p_Grayscale
time: [35.577 ms 35.582 ms 35.587 ms]
change: [+5.2294% +5.4334% +5.6148%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
make_image_2160p_Rgb time: [35.946 ms 35.951 ms 35.959 ms]
change: [-1.0623% -0.8628% -0.6819%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
Benchmarking make_image_2160p_RgbaSeparate: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.9s, or reduce sample count to 80.
make_image_2160p_RgbaSeparate
time: [59.298 ms 59.307 ms 59.316 ms]
change: [-16.727% -16.661% -16.606%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Benchmarking make_image_2160p_RgbaPremul: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.0s, or reduce sample count to 90.
make_image_2160p_RgbaPremul
time: [50.399 ms 50.403 ms 50.409 ms]
change: [-12.120% -12.106% -12.094%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Updated to take a higher, "more random" byte from the PRNG state. I squashed the following to the latest commit:
diff --git a/piet-cairo/benches/make_image.rs b/piet-cairo/benches/make_image.rs
index 04cd059..fd14c61 100644
--- a/piet-cairo/benches/make_image.rs
+++ b/piet-cairo/benches/make_image.rs
@@ -13,7 +13,8 @@ fn fill_random(data: &mut [u8]) {
let mut next_number = || {
state = (a * state + c) % m;
- state as u8
+ // Take a higher byte since it is more random than the low bytes
+ (state >> 16) as u8
};
data.iter_mut().for_each(|b| *b = next_number());
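To illustrate why a higher byte helps: in an LCG with a power-of-two modulus, the lowest k bits repeat with a period of at most 2^k, so the low byte cycles every 256 outputs. The sketch below checks this, assuming m = 2^32 and the "Numerical Recipes" constants (an assumption; the constants actually used in the benchmark are not shown in the diff). The helper names `lcg_bytes` and `repeats_with_period` are hypothetical.

```rust
/// Assumed LCG parameters (not necessarily those used in the benchmark).
const A: u64 = 1_664_525;
const C: u64 = 1_013_904_223;
const M: u64 = 1 << 32;

/// Produce `n` bytes, taking bits `shift..shift + 8` of each new state.
fn lcg_bytes(seed: u64, n: usize, shift: u32) -> Vec<u8> {
    let mut state = seed % M;
    (0..n)
        .map(|_| {
            state = (A * state + C) % M;
            (state >> shift) as u8
        })
        .collect()
}

/// True if every byte equals the byte `period` positions later.
/// (Trivially true when `bytes` is not longer than `period`.)
fn repeats_with_period(bytes: &[u8], period: usize) -> bool {
    bytes
        .iter()
        .zip(bytes.iter().skip(period))
        .all(|(x, y)| x == y)
}
```

With these helpers, `repeats_with_period(&lcg_bytes(1, 1024, 0), 256)` holds for the low byte, while the same check with a shift of 16 does not, so the shifted byte avoids a pattern that repeats every 256 pixels' worth of data.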
This time, the addition of the PRNG has a completely negative effect:
$ git checkout criterion^ && cargo bench && echo && echo ---------------- && echo && git checkout criterion && cargo bench
[...]
----------------
Previous HEAD position was 9915e7a Add a simple criterion benchmark
Switched to branch 'criterion'
Compiling piet-cairo v0.4.0 (/tmp/piet/piet-cairo)
Finished bench [optimized] target(s) in 2.66s
Running /tmp/piet/target/release/deps/piet_cairo-dc22b421092584ec
running 2 tests
test text::test::hit_test_empty_string ... ignored
test text::test::test_hit_test_point_complex_1 ... ignored
test result: ok. 0 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out
Running /tmp/piet/target/release/deps/make_image-e4985702849e1350
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.
make_image_2160p_Grayscale
time: [37.717 ms 37.724 ms 37.733 ms]
change: [+6.0752% +6.1030% +6.1349%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) high mild
3 (3.00%) high severe
make_image_2160p_Rgb time: [38.221 ms 38.227 ms 38.233 ms]
change: [+6.2194% +6.2764% +6.3194%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Benchmarking make_image_2160p_RgbaSeparate: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.0s, or reduce sample count to 80.
make_image_2160p_RgbaSeparate
time: [60.516 ms 60.523 ms 60.530 ms]
change: [+2.1297% +2.1445% +2.1600%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
Benchmarking make_image_2160p_RgbaPremul: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
make_image_2160p_RgbaPremul
time: [63.016 ms 63.042 ms 63.068 ms]
change: [+25.052% +25.107% +25.163%] (p = 0.00 < 0.05)
Performance has regressed.
This adds a simple benchmark for piet-cairo's make_image function.
Signed-off-by: Uli Schlachter <psychon@znc.in>
Here is the output when I run this locally (cargo bench):
And this is what I get when I afterwards run this with the code from #448:
That should settle any worries about performance regressions. :-)
The output talks about HTML reports that will be disabled in a future version. Here is an excerpt from one of these reports (RgbaSeparate; blue are the measurements for my PR and red are for master):
Note: I have no experience worth mentioning with criterion and just "bolted this together" from the docs. The benchmark only feeds it all-zero pixels. I have no idea whether that makes a difference or what a good alternative would be (is "all 0x80" better? Why?).