Closed Kobzol closed 2 years ago
@Kobzol first of all, thank for the fast reply and pr. Here's what I found. I did:
cargo pgo build -- --bench bench
cargo pgo optimize -- --bench bench
cargo pgo bench --bench bench -- --warm-up-time 1 --measurement-time 3 <some-filter-string>
Unfortunately I failed on step 4. Some attempts like passing one more layer of --
didn't work. It also tell me that cargo pgo bench will store pgo data. But really I want to run the pgo'd benchmark via cargo bench.
Oh, I understood wrong what is your goal. I thought that you want to gather the PGO profiles by running benchmarks, but it seems that you want to benchmark the PGO-optimized benchmark binaries?
Also, it seems like you are using criterion? I thought that you're using the nightly-only built-in benchmark suite.
I tried it with criterion and this command should work:
$ cargo pgo bench -- --bench "benchmark" -- --warm-up-time 1
However, it can only be used currently to gather data from benchmarks, this won't benchmark the optimized benchmark binary. I'll try to think about how to make this use-case available. It probably only makes sense for benchmarks.
Yes exactly without manual file level modifications I don't know any way to use the optimized benchmark binary. That's where I need cargo pgo's help. I initially tried this to see if how munch my benchmarks change if I use pgo. In a way cargo pgo bench is useful for collecting data and for trying out an optimized binary. Maybe something like cargo pgo bench gather
+ cargo pgo bench run
could replace steps 1-4.
I decided to generalize instrumentation even more and allow running any of build
/run
/test
/bench
with instrumentation or optimization. The corresponding PR is located here: https://github.com/Kobzol/cargo-pgo/pull/13. I will merge it soon and then add support for PGO benchmarks on top of that. After that it should be possible both to generate profiles with benchmarks and to benchmark optimized benchmarks :)
Now it should be possible to do what you have tried with the following two commands:
# Gather profiles
$ cargo pgo bench
# Run optimized benchmarks using the gathered profiles
$ cargo pgo optimize bench
Beware that you should also probably run just cargo bench
between these two commands, otherwise the last timings from cargo pgo bench
will be slower, since the benchmarks will be instrumented.
Note that this is not currently supported for BOLT, only for PGO.
@mvtec-bergdolll When you have time, please let me know if the current version of this PR solves your issue :) Thanks.
@Kobzol yes it works. However I wonder if bundling in the optimize stage is ideal. Will it build it anew anytime there is new profiling data? I would prefer having explicit control over when a new optimized binary is built, and an explicit step that only does the benchmark running.
I tried it now and rustc
only tracks the profile filename, so as long as the filename of the merged profile stays the same, it will not recompile the code even if the contents of the profile change. So now it will behave as you want. However, this is actually quite problematic in terms of cargo-pgo
, I will probably change the behaviour so that a new profile is generated each time (or at least if its contents change), otherwise this is a footgun.
In general, I understand your usecase, but I'm not sure if bench
is the right command for it, because it both compiles and runs the benchmarks. It's essentially cargo run
, but for benchmarks. And cargo run
will also first compile your code before running it. I'm not sure if you can run cargo bench
without recompiling the code, for that you probably need to execute the compiled benchmarks binary directly.
There are probably workarounds. Indeed the bench command is somewhat cross with the concepts at play here. How difficult would it be to add an explicit cargo pgo bench <path_to_intstrumented/optimized>
that only runs the benchmark with the harness. Then all the existing pgo + bolt functionality would just work and explicit control over building and gathering would be identical.
I'm not sure exactly how are the benchmark binaries generated, I'll check it and think about possible extensions to cargo-pgo
.
I'll merge this PR, since it's a bit orthogonal to the original issue, and then I'll take a look at your use-case.
Btw thanks for the idea of profile invalidation during compilation, that's a quite serious footgun for this crate (possibly for rustc
) too, I'll try to fix it here and post an issue to rustc
.
Thanks for the proactive help.
This allows gathering profiles from running benchmarks.
Related issue: https://github.com/Kobzol/cargo-pgo/issues/5