bjorn3 opened 4 years ago
Results after #918:
There are still regressions relative to cg_llvm, but most of the incremental compilation times have improved.
New patch for the collector:
```diff
diff --git a/collector/src/bin/rustc-perf-collector/execute.rs b/collector/src/bin/rustc-perf-collector/execute.rs
index 9aa2cc48..1caecc8c 100644
--- a/collector/src/bin/rustc-perf-collector/execute.rs
+++ b/collector/src/bin/rustc-perf-collector/execute.rs
@@ -203,13 +203,20 @@ impl<'a> CargoProcess<'a> {
     fn run_rustc(&mut self) -> anyhow::Result<()> {
         loop {
             let mut cmd = self.base_command(self.cwd, "rustc");
+            cmd.env("RUSTFLAGS", "-Cpanic=abort \
+                -Zcodegen-backend=/home/bjorn/Documenten/cg_clif/target/release/librustc_codegen_cranelift.so \
+                --sysroot /home/bjorn/Documenten/cg_clif/build_sysroot/sysroot");
+            cmd.arg("--target").arg("x86_64-unknown-linux-gnu");
             cmd.arg("-p").arg(self.get_pkgid(self.cwd));
+            cmd.env("CG_CLIF_INCR_CACHE", "1");
             match self.build_kind {
                 BuildKind::Check => {
+                    return Ok(());
                     cmd.arg("--profile").arg("check");
                 }
                 BuildKind::Debug => {}
                 BuildKind::Opt => {
+                    return Ok(());
                     cmd.arg("--release");
                 }
             }
```
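For readers who want to try this outside the collector, the patch boils down to building a `cargo` invocation with a few environment overrides. A minimal sketch, assuming a local cg_clif build (all paths below are placeholders, not the ones from the patch):

```rust
use std::process::Command;

// Sketch of the cargo invocation the patched collector builds:
// RUSTFLAGS selects the cranelift backend plus a matching sysroot,
// and CG_CLIF_INCR_CACHE opts into caching object files in the
// incremental cache.
fn clif_rustc_command() -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("rustc");
    cmd.env(
        "RUSTFLAGS",
        "-Cpanic=abort \
         -Zcodegen-backend=/path/to/librustc_codegen_cranelift.so \
         --sysroot /path/to/build_sysroot/sysroot",
    );
    cmd.arg("--target").arg("x86_64-unknown-linux-gnu");
    cmd.env("CG_CLIF_INCR_CACHE", "1");
    cmd
}

fn main() {
    // Print the command without running it; on Unix the Debug output
    // also shows the environment overrides.
    println!("{:?}", clif_rustc_command());
}
```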
Edit: Flipped the default for incremental caching of object files in 198037119520d8cccafdc1fd511164c63d741aed, so the old patch is correct again.
Many of the red results are caused by the linker taking much more time (up to 90%!).
5d516f9e118d6527947ca5deb3d76bbc4fa0f8a1 is a 20%-50% improvement on the coercions-debug benchmark. Overall it is a ~2% improvement.
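For reference, the percentages quoted throughout this thread are relative wall-time changes of cg_clif versus cg_llvm; a small helper makes the convention explicit (the numbers in the example are illustrative only, not measurements):

```rust
// Relative wall-time change of cg_clif vs cg_llvm:
// negative = improvement, positive = regression.
fn percent_change(clif_secs: f64, llvm_secs: f64) -> f64 {
    (clif_secs - llvm_secs) / llvm_secs * 100.0
}

fn main() {
    // e.g. cg_clif taking 0.6s where cg_llvm took 1.0s is a 40% improvement.
    println!("{:.0}%", percent_change(0.6, 1.0)); // prints "-40%"
}
```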
Current results with lld:
Although there are still regressions, they are almost entirely found in the tiny stress-test benchmarks. Most real-world benchmarks are seeing fantastic improvements!
Wonderful work, @bjorn3!
There are a few places where a non-stress-test benchmark regresses a few percent in one of the incremental benchmarks. Beyond that, many stress-test benchmarks regress because of slower linking; improving this will benefit all other executable benchmarks too. For example, the helloworld-debug regression can be completely explained by longer linking times. In fact, the codegen part is faster for cg_clif.
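Why a fixed linking cost hurts tiny benchmarks disproportionately can be shown with a toy model (illustrative numbers, not measurements): halving codegen time still yields a net regression when the linker dominates the total.

```rust
// Toy model: total build time = codegen time + link time.
fn total(codegen_secs: f64, link_secs: f64) -> f64 {
    codegen_secs + link_secs
}

fn main() {
    // Hypothetical helloworld-style crate: codegen halves under cg_clif,
    // but a slower linker still dominates the total build time.
    let llvm = total(0.2, 0.3);
    let clif = total(0.1, 0.9);
    println!("{llvm:.1}s -> {clif:.1}s"); // prints "0.5s -> 1.0s"
}
```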
Reran the benchmarks with Firefox and VS Code closed. Now only regression-31157-debug patched incremental is a significant regression:
With such huge improvements, how much work would you say is left for MVP?
There are still missing features, as mentioned in https://hackmd.io/@bjorn3/HJL5ryFS8. I don't know how long it will take to implement most of them; some are hard, others less so.
Are there any recent rustc-perf runs? I'm especially curious about the JIT mode.
Not recently. Don't expect the JIT mode to be faster than AOT compilation: the JIT mode currently doesn't support incremental compilation, which makes it slower.
Here is the latest, using commit df7f02072b64712e5322ea70675135cb1e20bf80:
CG_CLIF
```diff
diff --git a/collector/src/execute.rs b/collector/src/execute.rs
index d816eaaf..ec71984f 100644
--- a/collector/src/execute.rs
+++ b/collector/src/execute.rs
@@ -399,14 +399,21 @@ impl<'a> CargoProcess<'a> {
         };
         let mut cmd = self.base_command(self.cwd, subcommand);
+        cmd.env(
+            "RUSTFLAGS",
+            "-Zcodegen-backend=/home/jasew/workspace/rustc_codegen_cranelift/build/lib/librustc_codegen_cranelift.so",
+        );
+        cmd.arg("--target").arg("x86_64-unknown-linux-gnu");
         cmd.arg("-p").arg(self.get_pkgid(self.cwd)?);
         match self.profile_kind {
             ProfileKind::Check => {
+                return Ok(());
                 cmd.arg("--profile").arg("check");
             }
             ProfileKind::Debug => {}
             ProfileKind::Doc => {}
             ProfileKind::Opt => {
+                return Ok(());
                 cmd.arg("--release");
             }
         }
```
LLVM
```diff
diff --git a/collector/src/execute.rs b/collector/src/execute.rs
index d816eaaf..ca34d0a3 100644
--- a/collector/src/execute.rs
+++ b/collector/src/execute.rs
@@ -399,14 +399,17 @@ impl<'a> CargoProcess<'a> {
         };
         let mut cmd = self.base_command(self.cwd, subcommand);
+        cmd.arg("-j1");
         cmd.arg("-p").arg(self.get_pkgid(self.cwd)?);
         match self.profile_kind {
             ProfileKind::Check => {
+                return Ok(());
                 cmd.arg("--profile").arg("check");
             }
             ProfileKind::Debug => {}
             ProfileKind::Doc => {}
             ProfileKind::Opt => {
+                return Ok(());
                 cmd.arg("--release");
             }
         }
```
Notes:
- Processor: AMD Ryzen 9 5950X 16-Core Processor @ 3.40 GHz; installed RAM: 32.0 GB.
- I only ran the `debug` benchmarks, as `check` should be identical and `release` will definitely be faster because of the much smaller amount of optimization done by cg_clif.
- Except for some stress-tests, the `clean` and `baseline incremental` results are quite positive (~10-60% improvement, often ~40%).
- For `clean incremental` the results are much worse (easily ~200%), as compiled object files are not stored in the incremental cache (#760).
- For `patched incremental` the results are very mixed. Sometimes the difference is just a little bit less than `clean incremental`, while in other cases it is up to ~70% faster than cg_llvm.
- `packed-simd` failed due to a verifier error. Edit (2020-03-11): Opened #919. Edit (2020-03-11): Fixed in #916.
- `hyper-2` failed due to unsized locals not being implemented (used for `impl FnOnce for Box<FnOnce>`).
- `style-servo` failed due to running out of disk space.

Patch for rustc-perf
```diff
diff --git a/collector/src/bin/rustc-perf-collector/execute.rs b/collector/src/bin/rustc-perf-collector/execute.rs
index 9aa2cc48..4f577183 100644
--- a/collector/src/bin/rustc-perf-collector/execute.rs
+++ b/collector/src/bin/rustc-perf-collector/execute.rs
@@ -203,13 +203,19 @@ impl<'a> CargoProcess<'a> {
     fn run_rustc(&mut self) -> anyhow::Result<()> {
         loop {
             let mut cmd = self.base_command(self.cwd, "rustc");
+            cmd.env("RUSTFLAGS", "-Cpanic=abort \
+                -Zcodegen-backend=~/Documents/cg_clif/target/release/librustc_codegen_cranelift.so \
+                --sysroot ~/Documents/cg_clif/build_sysroot/sysroot");
+            cmd.arg("--target").arg("x86_64-unknown-linux-gnu");
             cmd.arg("-p").arg(self.get_pkgid(self.cwd));
             match self.build_kind {
                 BuildKind::Check => {
+                    return Ok(());
                     cmd.arg("--profile").arg("check");
                 }
                 BuildKind::Debug => {}
                 BuildKind::Opt => {
+                    return Ok(());
                     cmd.arg("--release");
                 }
             }
```

Results
![image](https://user-images.githubusercontent.com/17426603/73138656-9db54a80-4065-11ea-8e1f-efc5f06f9def.png)