This repository contains raw data and summaries from rustc benchmarks on many crates from crates.io.
Crate selection: while we want a healthy mix of binaries and libraries, at the time of writing the crates benchmarked here were selected by popularity and then filtered. The 1000 most popular crates by download count were gathered.
Many of those lacked `Cargo.lock` files, which are required when benchmarking with the rustc-perf collector, so they were regenerated with `cargo generate-lockfile`.
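As a rough illustration of the lockfile step, the sketch below finds crates that have a `Cargo.toml` but no `Cargo.lock` and runs `cargo generate-lockfile` in each. The directory layout (one crate per subdirectory of a root folder) and the function names are assumptions made for the example; they are not taken from this repository.

```python
import subprocess
from pathlib import Path


def crates_missing_lockfile(root: Path) -> list[Path]:
    """Return crate directories directly under `root` with a Cargo.toml but no Cargo.lock."""
    missing = []
    for crate_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        has_manifest = (crate_dir / "Cargo.toml").is_file()
        has_lockfile = (crate_dir / "Cargo.lock").is_file()
        if has_manifest and not has_lockfile:
            missing.append(crate_dir)
    return missing


def regenerate_lockfiles(root: Path) -> None:
    """Run `cargo generate-lockfile` in every crate that lacks a lockfile."""
    for crate_dir in crates_missing_lockfile(root):
        # check=True stops at the first crate whose dependencies fail to resolve.
        subprocess.run(["cargo", "generate-lockfile"], cwd=crate_dir, check=True)
```

Calling `regenerate_lockfiles(Path("crates"))` would process every subdirectory of a hypothetical `crates` folder; crates that already have a lockfile are left untouched.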
They were then filtered:
The current list of these ignored crates is in the `summaries` directory. Some of them require manual changes before they can be profiled with the collector; this will be done in the future.
The 800 or so remaining crates were then benchmarked in different rounds, whose raw data is present in the `results` subfolders:
- `cargo llvm-lines` on the leaf crates
- `tokei -t=Rust` (not including markdown, i.e. doctests)
- `cargo llvm-lines` on release builds (we expected it to provide information about the crate graph, which it does not appear to do, but the data still differs under `--release`)
- `cargo -Ztimings`, under `-j1`
- `cargo -Ztimings`, under `-j1`
- `cargo -Ztimings`, under `-j1`
- `cargo -Ztimings`, under `-j8`
- `cargo -Ztimings`, under `-j8`
- `cargo -Ztimings`, under `-j8`
- `-Zself-profile` data for check builds (default parallelism)
- `-Zself-profile` data for debug builds (default parallelism)
- `-Zself-profile` data for release builds (default parallelism)
- `-Ztime-passes` data for check builds (default parallelism)
- `-Ztime-passes` data for debug builds (default parallelism)
- `-Ztime-passes` data for release builds (default parallelism)

(The thought behind the `-j1` builds in rounds 7-9 is to get an idea of a single-threaded schedule, and to easily gather clean build data for every single dependency in case we'd like to experiment with different cargo scheduling algorithms. Comparing those with their `-j8` counterparts shows how massive an improvement codegen parallelization was.)
Note: the self-profiler data was gathered via the perf collector. It normally records all query keys, but that skews the profiles, so this was disabled for rounds 14-16. The data was then processed with the `summarize` and `flamegraph` tools from `measureme`.
Each cachegrind round is summarized (the simple tool that does that will also be added here later), the top functions are collected from each profile, and:
1) filtered to focus on rustc functions, removing memory allocation/deallocation, ELF dynamic loading, LLVM symbols, and rustc metadata encoding (the summary file names contain `filtered_` or `unfiltered_`, placed before the suffix described next)
2) sorted according to their relative or absolute "retired instructions" stat (the summary file names end with `_absolute` or `_relative`)
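Since the summarizing tool is not published here yet, the following is only a minimal sketch of how such a filter-and-sort step could look. The exclusion patterns, the input shape (a mapping from crate name to per-symbol instruction counts), and the reading of "relative" as each symbol's share of its crate's total are all assumptions for illustration, not the actual tool's behavior.

```python
from collections import defaultdict

# Illustrative patterns only: the real filter list is not given in the text.
ALLOCATOR_SYMBOLS = {"malloc", "free", "realloc", "calloc"}
EXCLUDED_SUBSTRINGS = ("_dl_", "llvm::", "rustc_metadata::rmeta::encoder")


def is_kept(symbol: str) -> bool:
    """True for symbols that survive the 'rustc functions only' filter."""
    if symbol in ALLOCATOR_SYMBOLS:
        return False
    return not any(part in symbol for part in EXCLUDED_SUBSTRINGS)


def rank_functions(profiles, *, filtered, relative):
    """Rank symbols across crates, highest score first.

    profiles: {crate name: {symbol: retired instructions}}
    relative=False sums raw instruction counts across crates;
    relative=True sums each symbol's share of its crate's total (assumed meaning).
    """
    scores = defaultdict(float)
    for counts in profiles.values():
        total = sum(counts.values())  # total over all symbols, before filtering
        if total == 0:
            continue
        for symbol, instructions in counts.items():
            if filtered and not is_kept(symbol):
                continue
            scores[symbol] += instructions / total if relative else instructions
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The two orderings can disagree: a function that dominates many small crates ranks high in the relative view, while a function that is expensive in a few large crates ranks high in the absolute view.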
The `tokei` line counts are also summarized, and the crates sorted by that number. (The shortest crates are basically empty: they either just display a deprecation notice or are façades to other crates.)
The data was gathered using a combination of local rustc builds and nightly, where it mattered, on an EPYC 7401P (24C/48T):
cargo 1.60.0-nightly (1c03475 2022-01-25)
e0e70c0c2c4fc8d150a56c181994e3a3b3e9999a
rustc 1.60.0-nightly (21b4a9cfd 2022-01-27)
624fcaa4af8c923b07f6953fef3c5424eefb2ec1
(Some of the data can be browsed on GH, either via the summaries, or the raw data e.g. to see some cargo timings)