Open adamjseitz opened 2 years ago
Here's the complete matrix of build times from ddisasm's variants with different compilers:
Compiler | Arch | 2.3 | master | Change |
---|---|---|---|---|
g++ | IA32 | 3:30.60 | 4:35.08 | +30.6% |
g++ | AMD64 | 3:39.65 | 4:38.03 | +26.6% |
g++ | ARM32 | 5:02.96 | 6:53.59 | +36.5% |
g++ | ARM64 | 3:39.90 | 4:41.73 | +25.1% |
g++ | MIPS32 | 3:46.65 | 4:57.51 | +31.3% |
clang++ | IA32 | 2:54.31 | 5:47.30 | +99.2% |
clang++ | AMD64 | 3:01.58 | 5:53.84 | +94.9% |
clang++ | ARM32 | 4:04.33 | 27:18.06 | +570.4% |
clang++ | ARM64 | 3:39.90 | 5:53.49 | +60.8% |
clang++ | MIPS32 | 3:02.17 | 5:52.07 | +93.3% |
g++
builds suffered a somewhat consistent 30% increase, while clang++
builds ranged anywhere from 60% to 600% percent slower.
Additionally, the number of lines in the generated C++ source for each arch:
Arch | 2.3 | master | Change |
---|---|---|---|
IA32 | 127930 | 148364 | +13.7% |
AMD64 | 137674 | 158020 | +14.8% |
ARM32 | 191274 | 214232 | +12.0% |
ARM64 | 130350 | 151126 | +15.9% |
MIPS32 | 133456 | 155077 | +16.2% |
It seems like across all five arch variants, about 20k lines of C++ were added.
Thanks for the detailed analysis! We must understand the extra work the compiler does with the new extension (compiling strata separately).
Perhaps you can perform diffs before and after so that we can find out what may cause the slow-down.
Git bisect pinpoints the slowdown in compilation time as occuring at 7d648a161.
Souffle output at f8f24c990 (previous commit): https://gist.github.com/ruricolist/4042ff207396eb704d91c40eca4bf16a Souffle output (for ARM) at 7d648a161: https://gist.github.com/ruricolist/a0b212440935ea2fc238963af0f2c4b3
Running clang++ with -ftime-trace
, it appears the extra time is being spent in OptFunction
passes on a single function, souffle::Subroutine_93::run
.
The surface differences between the two versions of the function are minimal (ordering of conjunctions in conditionals), and manually splicing the body of the function from the "fast" version into the "slow" version does not improve compile times.
The latest commit of Souffle, 66b96fd07e7124d0e46bcf7cdc1cc5bf61f54cc3, generates C++ code that takes substantially longer to build than previous Souffle versions. This regression appears to be introduced by https://github.com/souffle-lang/souffle/pull/2215. (As a note, we are excited about the new feature implemented in this PR and want to incorporate it in ddisasm's build system, but discovered this issue while evaluating doing so).
I am seeing this when trying to build the ddisasm project with the latest Souffle, on Ubuntu 20.04 with clang++.
My hardware is a
11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
CPU and 32 GB of RAM.Here are steps to replicate it:
With Souffle 2.3 (the most recent release), as well as commit 62e44313d88dca2d06af8ba8989e82005f497248 (just before https://github.com/souffle-lang/souffle/pull/2215), build times are approximately:
With souffle 66b96fd07e7124d0e46bcf7cdc1cc5bf61f54cc3 (latest master) and 429e1681ce97c8a792f1e4577bf2e9001835daa6 (the merge commit for https://github.com/souffle-lang/souffle/pull/2215):
The ARM32 variant of ddisasm's datalog code, as built above, exhibits a particularly bad case of this regression (4 minutes -> 27 minutes) but other variants also exhibited it to a lesser degree (e.g., ARM64: 3 minutes -> 6 minutes).
The compile time difference is not as extreme using
g++
, either. The ARM32 variant increases from 5m3s to 6m53s usingg++ 9.4.0
.I am going to see if I can replicate this kind of behavior on a more minimal example instead of the complete ddisasm source, but I am not sure that's going to be easy. I at least wanted to report these numbers to illustrate the scale of the problem for a large project.