Hi!
Since the README puts a lot of emphasis on performance, I decided to test one compiler optimization, Profile-Guided Optimization (PGO), on genson-rs. I have already tested it on various projects with positive results (all benchmarks are collected here: https://github.com/zamazan4ik/awesome-pgo), so here are the benchmark results for genson-rs.
Test environment
Fedora 39
Linux kernel 6.8.7
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler: Rustc 1.78
Project version: the latest main branch at the time of testing, commit 67afe6d3ad8d10affb65b251694ca7b52b978769
Turbo Boost disabled
Benchmark
For benchmarking, I use the benchmarks built into the project. For PGO, I use the cargo-pgo tool. The Release results come from the taskset -c 0 cargo bench command. The PGO training phase is run with taskset -c 0 cargo pgo bench, and the PGO optimization phase with taskset -c 0 cargo pgo optimize bench.
All measurements are done on the same machine, with the same background "noise" (as far as I can guarantee). taskset -c 0 pins the process to a single CPU core to reduce OS scheduler "noise".
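The three runs described above could be scripted roughly as follows. This is only a sketch: it assumes cargo-pgo is already installed (for example via cargo install cargo-pgo) together with the llvm-tools-preview rustup component it relies on. The CORE and DRY_RUN variables and the run and bench_steps helpers are names made up for this illustration. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch of the Release / PGO-training / PGO-optimized benchmark runs.
set -euo pipefail

CORE="${CORE:-0}"        # CPU core to pin to (illustrative variable)
DRY_RUN="${DRY_RUN:-1}"  # 1 = only print the commands, 0 = execute them

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

bench_steps() {
    # 1. Baseline: regular Release benchmarks.
    run taskset -c "$CORE" cargo bench
    # 2. PGO training: instrumented build; the benchmarks generate profiles.
    run taskset -c "$CORE" cargo pgo bench
    # 3. PGO optimization: rebuild using the profiles and benchmark again.
    run taskset -c "$CORE" cargo pgo optimize bench
}

bench_steps
```

Keeping all three runs in one script makes it easier to repeat them back to back under the same conditions, which is the point of pinning everything to one core.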
Results
According to the results, PGO measurably improves the tool's performance, at least in the benchmark above.
Further steps
I can suggest the following action points:
Run more PGO benchmarks with other test files. If they show improvements, add a note to the documentation (the README file?) about the possible performance gains from building the tool with PGO.
Optimize the prebuilt binaries with PGO (if there are any). As a training set, you can gather multiple real-life files, train PGO on them, and ship PGO-optimized binaries to users.
Consider enabling Link-Time Optimization (LTO) for the tool. It can improve performance and reduce the binary size.
Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.
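To make the prebuilt-binary and LTO suggestions more concrete, here is a rough sketch of how a PGO-optimized release build trained on real-life files might look with cargo-pgo. Several things here are assumptions, not facts about the project: the TRAINING_DIR directory of sample JSON files is hypothetical, the binary is assumed to be named genson-rs and to accept a file path as its argument, and the target triple should be adjusted to your platform. LTO is switched on through Cargo's CARGO_PROFILE_RELEASE_LTO environment override; setting lto in the release profile of Cargo.toml would be the permanent equivalent. As written, the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/usr/bin/env bash
# Sketch: build a PGO-optimized binary trained on real-life input files.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"                          # 1 = print only, 0 = execute
TRAINING_DIR="${TRAINING_DIR:-./training-data}"  # hypothetical directory of JSON samples
TARGET="${TARGET:-x86_64-unknown-linux-gnu}"     # adjust to your platform
BIN="./target/$TARGET/release/genson-rs"         # assumed binary name and location

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

pgo_release_steps() {
    # Enable fat LTO for the release profile via Cargo's environment override.
    export CARGO_PROFILE_RELEASE_LTO=fat

    # 1. Build the instrumented binary.
    run cargo pgo build

    # 2. Run it on every training file so that profiles get collected.
    for f in "$TRAINING_DIR"/*.json; do
        [ -e "$f" ] || continue  # skip the unexpanded glob if there are no files
        run "$BIN" "$f"
    done

    # 3. Rebuild the binary using the collected profiles.
    run cargo pgo optimize
}

pgo_release_steps
```

The quality of the result depends on how representative the training files are, so it is worth picking inputs that resemble what users actually feed into the tool.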
I would be happy to answer your questions about PGO.
P.S. I created an Issue because Discussions are disabled for the repo. Since this is not a bug but an improvement idea, Discussions would probably be a better place for it.