lqd / rustc-benchmarking-data

rustc-benchmarking-data

This repository contains raw data, and summaries, from rustc benchmarks on many crates from crates.io.

Crate selection: while we want a healthy mix of binaries and libraries, at the time of writing the crates benchmarked here were selected by popularity and then filtered. First, the 1000 most popular crates by download count were gathered.

Many of those lacked a Cargo.lock file, which the rustc-perf collector requires for benchmarking, so the missing lockfiles were generated with cargo generate-lockfile.
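As a rough illustration of that lockfile step, the sketch below scans a directory of unpacked crates and runs cargo generate-lockfile in each one that has a Cargo.toml but no Cargo.lock. The directory layout (one crate per subdirectory of a `crates` folder) and the function names are assumptions made for this example, not the tooling actually used for this repository.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Returns the immediate subdirectories of `root` that contain a `Cargo.toml`
/// but no `Cargo.lock`, sorted by path.
fn crates_missing_lockfile(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut missing = Vec::new();
    for entry in fs::read_dir(root)? {
        let dir = entry?.path();
        if dir.join("Cargo.toml").is_file() && !dir.join("Cargo.lock").exists() {
            missing.push(dir);
        }
    }
    missing.sort();
    Ok(missing)
}

/// Runs `cargo generate-lockfile` inside `crate_dir` and reports whether it succeeded.
fn generate_lockfile(crate_dir: &Path) -> io::Result<bool> {
    let status = Command::new("cargo")
        .arg("generate-lockfile")
        .current_dir(crate_dir)
        .status()?;
    Ok(status.success())
}

fn main() -> io::Result<()> {
    // Hypothetical layout: one unpacked crate per subdirectory of `crates`,
    // unless another root is given as the first argument.
    let root = std::env::args().nth(1).unwrap_or_else(|| "crates".to_string());
    for dir in crates_missing_lockfile(Path::new(&root))? {
        let ok = generate_lockfile(&dir)?;
        let outcome = if ok { "lockfile generated" } else { "failed" };
        println!("{}: {}", dir.display(), outcome);
    }
    Ok(())
}
```

Crates for which the command fails would still need manual attention, so the sketch only reports the failure and moves on.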

They were then filtered:

The current list of the crates ignored by this filtering is in the summaries directory. Some of these require manual changes so that they can be profiled with the collector, and this will be done in the future.

The 800 or so remaining crates were then benchmarked in different rounds, whose raw data is present in the results subfolders:

  1. a cachegrind annotated profile of check builds
  2. the result of cargo llvm-lines on the leaf crates
  3. a DHAT profile
  4. an idea of each crate's size: the Rust LOC counts, as reported by tokei -t=Rust (not including markdown, i.e. doctests)
  5. a cachegrind annotated profile of debug builds
  6. the result of cargo llvm-lines on release builds (we expected it to provide information about the crate graph; it appears not to, but the data does differ under --release)
  7. check build schedules, as generated by cargo -Ztimings, under -j1
  8. debug build schedules, as generated by cargo -Ztimings, under -j1
  9. release build schedules, as generated by cargo -Ztimings, under -j1
  10. a cachegrind annotated profile of release builds
  11. check build schedules, as generated by cargo -Ztimings, under -j8
  12. debug build schedules, as generated by cargo -Ztimings, under -j8
  13. release build schedules, as generated by cargo -Ztimings, under -j8
  14. rustc -Zself-profile data for check builds (default parallelism)
  15. rustc -Zself-profile data for debug builds (default parallelism)
  16. rustc -Zself-profile data for release builds (default parallelism)
  17. rustc -Ztime-passes data for check builds (default parallelism)
  18. rustc -Ztime-passes data for debug builds (default parallelism)
  19. rustc -Ztime-passes data for release builds (default parallelism)

(The thought behind the -j1 builds in rounds 7-9 is to get an idea of a single-threaded schedule, and to easily gather clean build data for every single dependency in case we'd like to experiment with different cargo scheduling algorithms. Comparing those with their -j8 counterparts shows how massive an improvement codegen parallelization was.)
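One way to make the -j1 versus -j8 comparison concrete is a per-crate speedup ratio. The sketch below assumes the total build durations have already been extracted from the cargo timings reports into maps of crate name to seconds. That extraction step, the function name, and the numbers in `main` are hypothetical placeholders, not measurements from this repository.

```rust
use std::collections::BTreeMap;

/// Per-crate speedup of the `-j8` build over the `-j1` build, as `j1 / j8`.
/// Crates missing from either map, or with a non-positive `-j8` time, are skipped.
fn speedups(j1: &BTreeMap<String, f64>, j8: &BTreeMap<String, f64>) -> Vec<(String, f64)> {
    let mut out: Vec<(String, f64)> = j1
        .iter()
        .filter_map(|(name, &t1)| {
            let t8 = *j8.get(name)?;
            if t8 > 0.0 {
                Some((name.clone(), t1 / t8))
            } else {
                None
            }
        })
        .collect();
    // Largest speedup first.
    out.sort_by(|a, b| b.1.total_cmp(&a.1));
    out
}

fn main() {
    // Made-up placeholder durations in seconds, purely to exercise the function.
    let j1 = BTreeMap::from([
        ("crate_a".to_string(), 40.0),
        ("crate_b".to_string(), 12.0),
    ]);
    let j8 = BTreeMap::from([
        ("crate_a".to_string(), 10.0),
        ("crate_b".to_string(), 8.0),
    ]);
    for (name, ratio) in speedups(&j1, &j8) {
        println!("{name}: {ratio:.2}x");
    }
}
```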

Note: the self-profiler data was gathered via the perf collector. It normally records all query keys, but that skews the profiles, so this was disabled for rounds 14-16. The data was then processed with the summarize and flamegraph tools from measureme.

Each cachegrind round is summarized (the simple tool that does this will also be added here later). The top functions are collected from each profile, and then:

  1. filtered to focus on rustc functions, removing memory allocation/deallocation, ELF dynamic loading, LLVM symbols, and rustc metadata encoding (the summary files are marked with filtered_ or unfiltered_, which precedes the suffix described next)
  2. sorted according to their relative or absolute "retired instructions" stat (these summaries end with _absolute or _relative)
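The summarizing tool itself is not in the repository yet, so the following is only a minimal sketch of the filter-and-sort step described above, under stated assumptions: the `Entry` type, the `SortBy` enum, and the substring patterns used to spot non-rustc functions are all hypothetical, and "relative" is taken to mean a function's share of its own profile's total instruction count.

```rust
/// One top function taken from one crate's cachegrind profile.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    krate: String,
    function: String,
    instructions: u64,  // retired instructions attributed to this function
    profile_total: u64, // retired instructions in the whole profile
}

impl Entry {
    /// Share of the profile's instructions spent in this function (0.0 to 1.0).
    fn relative(&self) -> f64 {
        if self.profile_total == 0 {
            0.0
        } else {
            self.instructions as f64 / self.profile_total as f64
        }
    }
}

#[derive(Clone, Copy)]
enum SortBy {
    Absolute,
    Relative,
}

/// Hypothetical substrings marking allocator, dynamic loader, LLVM, and
/// metadata-encoding symbols. Substring matching is crude; a real filter
/// would need a more careful list.
const NON_RUSTC_PATTERNS: &[&str] = &["malloc", "_dl_", "llvm::", "rustc_metadata"];

fn is_rustc_function(name: &str) -> bool {
    !NON_RUSTC_PATTERNS.iter().any(|pat| name.contains(*pat))
}

/// Optionally drops non-rustc functions, then sorts in descending order.
fn summarize(entries: &[Entry], filtered: bool, by: SortBy) -> Vec<Entry> {
    let mut kept: Vec<Entry> = entries
        .iter()
        .filter(|e| !filtered || is_rustc_function(&e.function))
        .cloned()
        .collect();
    match by {
        SortBy::Absolute => kept.sort_by(|a, b| b.instructions.cmp(&a.instructions)),
        SortBy::Relative => kept.sort_by(|a, b| b.relative().total_cmp(&a.relative())),
    }
    kept
}

fn main() {
    // Placeholder entries, not real measurements.
    let entries = vec![
        Entry { krate: "big".into(), function: "rustc_typeck::check".into(), instructions: 500, profile_total: 10_000 },
        Entry { krate: "small".into(), function: "rustc_parse::parse".into(), instructions: 300, profile_total: 1_000 },
        Entry { krate: "small".into(), function: "malloc".into(), instructions: 400, profile_total: 1_000 },
    ];
    for e in summarize(&entries, true, SortBy::Relative) {
        println!("{} {} {:.1}%", e.krate, e.function, e.relative() * 100.0);
    }
}
```

The two sort orders can differ across crates: a function may dominate a small crate's profile while contributing few instructions in absolute terms, which is why both views are worth keeping.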

The tokei line counts are also summarized, and the crates sorted by that number. (The shortest crates are basically empty, and either just display a deprecation notice or are façades to other crates.)

The data was gathered using a combination of local rustc builds and nightly, where it mattered, on an EPYC 7401P (24C/48T):

(Some of the data can be browsed on GH, either via the summaries, or the raw data e.g. to see some cargo timings)