Open Noratrieb opened 5 months ago
This also annoys me. I'll try to profile it in a bit more detail.
Seems like about a half of that duration on Linux is spent inside strip_debug
, where we invoke strip
on the compiler artifacts. BFD is unfortunately quite slow here, and takes about 1.2s on my system (so about half the time) to strip the compiler binary. I'll try to investigate if we can get rid of it.
Edit: this only happens if you set rust.debuginfo-level = 0
, see this.
Apart from stripping, the rest is https://github.com/rust-lang/rust/pull/126467 + a bunch of cargo and git invocations, but that was already clear from the flamechart.
The git
calls are highly repetitive:
177 counts
( 1) 42 (23.7%, 23.7%): execve("/usr/bin/git", ["git", "rev-parse", "HEAD"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 2) 29 (16.4%, 40.1%): execve("/usr/bin/git", ["git", "rev-parse"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 3) 26 (14.7%, 54.8%): execve("/usr/bin/git", ["git", "log", "-1", "--date=short", "--pretty=format:%cd"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 4) 26 (14.7%, 69.5%): execve("/usr/bin/git", ["git", "rev-parse", "--short=9", "HEAD"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 5) 9 ( 5.1%, 74.6%): execve("/usr/bin/git", ["git", "config", "--local", "--get-regex", "remote\\..*\\.url"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 6) 9 ( 5.1%, 79.7%): execve("/usr/bin/git", ["git", "merge-base", "upstream/master", "HEAD"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 7) 9 ( 5.1%, 84.7%): execve("/usr/bin/git", ["git", "rev-list", "--author=bors@rust-lang.org", "-n1", "--first-parent", "02c7a5921e3de5c2b3ecb2e0082c1daf"..., "--", "/home/ben/rust/src/llvm-project", "/home/ben/rust/src/bootstrap/dow"..., "/home/ben/rust/src/version"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 8) 9 ( 5.1%, 89.8%): execve("/usr/bin/git", ["git", "rev-parse", "upstream/master"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 9) 4 ( 2.3%, 92.1%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "library/stdarch"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 10) 2 ( 1.1%, 93.2%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "library/backtrace"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 11) 2 ( 1.1%, 94.4%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/doc/book"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 12) 2 ( 1.1%, 95.5%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/tools/cargo"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 13) 1 ( 0.6%, 96.0%): execve("/usr/bin/git", ["git", "config", "--file", "/home/ben/rust/.gitmodules", "--get-regexp", "path"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 14) 1 ( 0.6%, 96.6%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/doc/edition-guide"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 15) 1 ( 0.6%, 97.2%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/doc/embedded-book"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 16) 1 ( 0.6%, 97.7%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/doc/nomicon"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 17) 1 ( 0.6%, 98.3%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/doc/reference"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 18) 1 ( 0.6%, 98.9%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/doc/rust-by-example"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 19) 1 ( 0.6%, 99.4%): execve("/usr/bin/git", ["git", "ls-tree", "HEAD", "src/tools/rustc-perf"], 0x7ffdf49142d8 /* 46 vars */) = 0
( 20) 1 ( 0.6%,100.0%): execve("/usr/bin/git", ["git", "rev-parse", "--show-cdup"], 0x7ffdf49142d8 /* 46 vars */) = 0
And based on the flamegraph above, most of the runtime of git is in process startup. Caching these somehow seems sensible.
That's a good point! Onur has an open PR that centralizes the git invocations in bootstrap, once that's merged, it will be easier to cache them, I'll take a look at it.
Also, running git gc
can help a lot, from time to time. Maybe bootstrap could do it periodically.
Seems like about a half of that duration on Linux is spent inside strip_debug
Why are we re-stripping when things are cached?
It happens unconditionally, so it is performed even if the build itself is cached. But I forgot to write here that this only seems to be happening with relatively old versions of strip
, I think that for @lqd with a newer strip version it was way faster.
Also these numbers are dependent on one's config debug options, linker/strip version, etc, and no stripping happens when some debuginfo is requested for the compiler or libstd; that is not a universal overhead. It doesn't look like a noticeable part of the OP's flamegraph and certainly not half of the time.
That stripping should also only be applied to dist artifacts, there's no need to do that locally. Unfortunately, in the current state of bootstrap, it's unexpectedly hard to achieve, both Jakub and I tried and it always caused issues elsewhere, and hopefully that can be fixed in the future.
A fully cached
./x run --stage 1 miri
takes 2.35s. This is really slow, and added to most bootstrap invocations.Bootstrap is doing a lot of work, and it's nontrivial to optimize all of that, but it leads to these slow times. There's not gonna be a silver bullet, but I believe this is something we should at least consider and attempt to improve if possible.
One particularly slow aspect is updating the submodules. Another thing I found in my profile is that it spends 10% of the time parsing C++. I do not think every bootstrap invocation should involve parsing C++ code. There's also lots of overhead from cargo, maybe there are unnecessary cargo invocations?
I've attached the flamegraph from
perf record --call-graph=dwarf ./x run --stage 1 miri
.