Kobzol / cargo-pgo

Cargo subcommand for optimizing Rust binaries/libraries with PGO and BOLT.
MIT License

cargo-pgo

Cargo subcommand that makes it easier to use PGO and BOLT to optimize Rust binaries.

For an example of how to use cargo-pgo to optimize a binary on GitHub Actions CI, see this workflow.

Installation

$ cargo install cargo-pgo

You will also need the llvm-profdata binary for PGO, and the llvm-bolt and merge-fdata binaries for BOLT.

You can install the PGO helper binary by adding the llvm-tools-preview component to your toolchain with rustup:

$ rustup component add llvm-tools-preview

For BOLT, it is highly recommended to use Docker. See the BOLT installation guide below.

BOLT and Docker support is currently experimental.

Docker

To use the latest cargo-pgo with Docker, you need to build the image first:

git clone https://github.com/Kobzol/cargo-pgo.git && cd cargo-pgo
docker build -t cargo-pgo .

Then run this in your project directory to create a container:

docker run -v $(pwd):/workdir --rm -it cargo-pgo

In the container, you can run cargo-pgo as you would on your system.

Note that with the --rm argument, the container will be removed after you exit.

PGO/BOLT workflow

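It is important to understand the workflow of using feedback-directed optimizations. Put simply, it consists of three general steps:

1) Build binary with instrumentation
2) Gather performance profiles by running the instrumented binary on representative workloads
3) Build an optimized binary using the gathered profiles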
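The general steps (instrumented build, profile gathering, optimized build) can be sketched as a small shell function. This is only an illustration under stated assumptions: the `run` helper, the `DRY_RUN` variable and the `pgo_workflow` function are hypothetical names introduced here, and the path to the instrumented binary is a placeholder that you have to supply yourself. Only `cargo pgo build` and `cargo pgo optimize` are commands taken from this README.

```shell
# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hypothetical wrapper around the three workflow steps.
# $1: path to the instrumented binary (placeholder, depends on your project).
pgo_workflow() {
    binary=$1
    run cargo pgo build &&      # 1) build the binary with instrumentation
    run "$binary" &&            # 2) run it on a workload to gather profiles
    run cargo pgo optimize      # 3) build the optimized binary from the profiles
}
```

By default the function only prints the commands it would run. Setting `DRY_RUN=0` before calling `pgo_workflow <path-to-instrumented-binary>` executes them, stopping at the first step that fails.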


Usage

Before you start to optimize your binaries, you should first check if your environment is set up correctly, at least for PGO (BOLT is more complicated). You can do that using the info command:

$ cargo pgo info

PGO

cargo-pgo provides subcommands that wrap common Cargo commands. It will automatically add --release to wrapped commands where it is applicable, since it doesn't really make sense to perform PGO on debug builds.

Generating the profiles

First, you need to generate the PGO profiles by performing an instrumented build. You can currently do that in several ways. The most generic command for creating an instrumented artifact is cargo pgo instrument:

$ cargo pgo instrument [<command>] -- [cargo-args]

The command argument specifies which Cargo command to execute. It is optional and defaults to build. You can pass additional arguments to cargo after --.

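There are several ways of producing the profiles. The simplest is to build an instrumented binary with cargo pgo build and then run it on representative workloads.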

Building an optimized binary

Once you have generated some profiles, you can execute cargo pgo optimize to build an optimized version of your binary.

If you want, you can also pass a command to cargo pgo optimize to e.g. run PGO-optimized benchmarks or tests:

$ cargo pgo optimize bench
$ cargo pgo optimize test

Analyzing PGO profiles

You can analyze gathered PGO profiles using the llvm-profdata binary:

$ llvm-profdata show <profile>.profdata
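If you have more than one profile, a small loop can print a summary of each. The sketch below assumes that `llvm-profdata` is available on your `PATH` and that you know the directory your profiles were written to. Both `list_profiles` and `show_profiles` are hypothetical helper names, not part of cargo-pgo.

```shell
# List all .profdata files under the directory given as $1.
# The directory is an assumption: pass wherever your profiles are stored.
list_profiles() {
    find "$1" -type f -name '*.profdata' | sort
}

# Print a summary of every profile found under $1.
show_profiles() {
    if ! command -v llvm-profdata >/dev/null 2>&1; then
        echo "llvm-profdata not found on PATH" >&2
        return 1
    fi
    list_profiles "$1" | while IFS= read -r profile; do
        echo "== $profile"
        llvm-profdata show "$profile"
    done
}
```

Calling `show_profiles <profile-dir>` prints one summary per profile file, each preceded by a line with its path.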

BOLT

Using BOLT with cargo-pgo is similar to using PGO. However, you have to either build BOLT manually or download it from the GitHub releases archive (for LLVM 16+). Support for BOLT is currently in an experimental stage.

BOLT is not supported directly by rustc, so the instrumentation and optimization commands are not directly applied to binaries built by rustc. Instead, cargo-pgo creates additional binaries that you have to use for gathering profiles and executing the optimized code.

Generating the profiles

First, you need to generate the BOLT profiles. To do that, execute the following command:

$ cargo pgo bolt build

The instrumented binary will be located at <target-dir>/<target-triple>/release/<binary-name>-bolt-instrumented. Execute it on several workloads to gather as much data as possible.

Note that for BOLT, the profile gathering step is optional. You can also simply run the optimization step (see below) without any profiles, although it will probably not have a large effect.

Building an optimized binary

Once you have generated some profiles, you can execute cargo pgo bolt optimize to build an optimized version of your binary. The optimized binary will be named <binary-name>-bolt-optimized.
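The profile gathering step between `cargo pgo bolt build` and `cargo pgo bolt optimize` can be scripted. The sketch below composes the path of the instrumented binary from the pattern given above and runs it once per workload. The function names are hypothetical, and it assumes that your binary takes a single workload argument, which may not match your program.

```shell
# Compose the instrumented binary path from the documented pattern:
# <target-dir>/<target-triple>/release/<binary-name>-bolt-instrumented
bolt_instrumented_path() {
    target_dir=$1
    target_triple=$2
    binary_name=$3
    echo "${target_dir}/${target_triple}/release/${binary_name}-bolt-instrumented"
}

# Run the instrumented binary ($1) once for each remaining argument.
# Assumption: the binary accepts one workload argument per run.
gather_bolt_profiles() {
    instrumented=$1
    shift
    for workload in "$@"; do
        "$instrumented" "$workload" || return 1
    done
}

# Example with hypothetical values:
#   cargo pgo bolt build
#   gather_bolt_profiles "$(bolt_instrumented_path target x86_64-unknown-linux-gnu my-app)" small.txt large.txt
#   cargo pgo bolt optimize
```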

BOLT + PGO

Yes, BOLT and PGO can even be combined :) To do that, you should first generate PGO profiles and then use BOLT on already PGO-optimized binaries. You can do that using the --with-pgo flag:

# Build PGO instrumented binary
$ cargo pgo build
# Run binary to gather PGO profiles
$ ./target/.../<binary>
# Build BOLT instrumented binary using PGO profiles
$ cargo pgo bolt build --with-pgo
# Run binary to gather BOLT profiles
$ ./target/.../<binary>-bolt-instrumented
# Optimize a PGO-optimized binary with BOLT
$ cargo pgo bolt optimize --with-pgo

Do not strip symbols from your release binary when using BOLT! If you do, you might encounter linker errors.

BOLT installation

Here's a short guide on how to compile LLVM with BOLT manually. You will need a recent compiler, CMake and ninja.

Note: LLVM BOLT is slowly getting into package repositories, although it's not fully working out of the box yet. You can find more details here if you're interested.

1) Download LLVM

    $ git clone https://github.com/llvm/llvm-project
    $ cd llvm-project 

2) (Optional) Check out a stable version, at least 14.0.0

    $ git checkout llvmorg-14.0.5

Note that BOLT is being actively fixed, so a trunk version of LLVM might actually work better.

3) Prepare the build

    $ cmake -S llvm -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=${PWD}/llvm-install \
      -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt;bolt"

4) Compile LLVM with BOLT

    $ cd build
    $ ninja
    $ ninja install

The built files should be located at `<llvm-dir>/llvm-install/bin`. You should add this directory
to `$PATH` to make BOLT usable with `cargo-pgo`.
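Adding the directory to `$PATH` and confirming that the tools are visible could look like the sketch below. `add_bolt_to_path` and `check_bolt_tools` are hypothetical helper names; the only things taken from this guide are the `llvm-install/bin` location and the `llvm-bolt` and `merge-fdata` binary names.

```shell
# Prepend <llvm-dir>/llvm-install/bin to PATH unless it is already there.
# $1: the LLVM checkout directory (<llvm-dir> above).
add_bolt_to_path() {
    bolt_bin="$1/llvm-install/bin"
    case ":$PATH:" in
        *":$bolt_bin:"*) ;;    # already present, nothing to do
        *) PATH="$bolt_bin:$PATH"; export PATH ;;
    esac
}

# Report whether the binaries needed for BOLT can be found on PATH.
# Returns non-zero if any of them is missing.
check_bolt_tools() {
    missing=0
    for tool in llvm-bolt merge-fdata; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

After calling `add_bolt_to_path <llvm-dir>`, `check_bolt_tools` gives a quick first check, and `cargo pgo info` can then be used to verify the environment from the point of view of cargo-pgo.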

Related work

License

MIT