Open deepfire opened 4 years ago
We'll update the blog post noting that the conclusions are out of date.
Part of the problem is that the benchmark doesn't seem to pin the dependencies, so reproducibility could be an issue.
Maybe if we reimplement this using Nix.. it'd make the benchmark deployment a breeze, as well.
What do you think, @christiaanb ?
Yes, that's definitely an issue. Nix would certainly help there, and perhaps it will also enable us to include a Haskell compile benchmark in https://openbenchmarking.org/ so that sites like Phoronix can run it whenever a new processor is released.
@christiaanb, so I've made some progress -- you can take a look at https://github.com/phoronix-test-suite/test-profiles/commit/cadb82b48f6835678f80d9dd3d91ce83ba8a9bb3
Michael included it despite my PR being marked RFC -- which I'd intended to run through you first (but not before I sorted out the minor details..).
Currently, the benchmark consists of just compiling `clash-prelude`, `clash-lib` and `clash-ghc` with GHC-8.10.1.
..and you can already see the summary of preliminary results from the initial, slightly buggy version of the benchmark (https://github.com/phoronix-test-suite/test-profiles/commit/cadb82b48f6835678f80d9dd3d91ce83ba8a9bb3#commitcomment-39685158), which ran three iterations, instead of one, leading to 30 minute+ run times.
I've fixed it since, and Michael already included the new version, so you can see lower numbers already coming up in https://openbenchmarking.org/test/pts/build-clash-1.0.0.
Note, that it doesn't include:
Last, but not least -- my changes to `compilation-benchmark` are in https://github.com/deepfire/benchmark-compilation, which I can submit as a PR, if you are interested.
Also, i7-8550U getting ahead of i7-9750H is a god damned puzzler for me..
Maybe cooling was an issue, as is often the case with laptops..
And yes, the 3950X only winning over the same i7-8550U laptop CPU by a very slight margin is also an eye opener -- the 1.5x memory latency that Zen2 has over Intel is definitely an issue..
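To put a number on that margin, here is a quick sketch using the clash timings from the tally in this thread (363 for the 3950X, 369 for the 8550U, both rescaled from the old three-iteration runs). The `relative_speedup` helper is purely illustrative and not part of the benchmark.

```python
def relative_speedup(slower: float, faster: float) -> float:
    """Return how many times faster the quicker build is (1.0 means a tie)."""
    return slower / faster


# Clash timings taken from the tally in this thread (lower is better).
t_3950x = 363.0  # desktop part, ghc -j 32
t_8550u = 369.0  # laptop part, ghc -j 8

speedup = relative_speedup(t_8550u, t_3950x)
print(f"3950X is {100 * (speedup - 1):.1f}% faster than the 8550U")
```

So on these numbers the desktop part is ahead by under 2%, despite running four times as many GHC jobs.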
New tally, for `--iterations 1` runs that are added on openbenchmarking (https://openbenchmarking.org/test/pts/build-clash-1.0.0) -- with the old `--iterations 3` results rescaled (and marked with strike-through) for comparability with the new timings -- sorted by `clash` timing, where available, otherwise by `gradle`:
CPU | ghc -j | phys cores | base, GHz | max, GHz | L3, MB | clash, s | Java Gradle, s |
---|---|---|---|---|---|---|---|
10900K | 20 | 10 | 3.7 | 5.3 | 20 | 294 | 188 |
9900KS | 16 | 8 | 4.0 | 5.0 | 16 | | 193 |
9900K | 16 | 8 | 3.6 | 5.0 | 16 | 321 | |
3300X | 8 | 4 | 3.8 | 4.3 | 16 | 354 | |
10980XE | 36 | 18 | 3.0 | 4.6 | 24.75 | | 247 |
3950X | 32 | 16 | 3.5 | 4.7 | 64 | ~363~ | 251 |
3900XT | 24 | 12 | 3.8 | 4.7 | 64 | 364 | 251 |
3700X | 16 | 8 | 3.6 | 4.4 | 32 | 367 | |
3900X | 24 | 12 | 3.8 | 4.6 | 64 | | 252 |
8550U | 8 | 4 | 1.8 | 4.0 | 8 | ~369~ | |
2500K | 4 | 4 | 3.3 | 3.7 | 6 | 375 | |
8265U | 8 | 4 | 1.6 | 3.9 | 6 | ~375~ | 309 |
3960X | 48 | 24 | 3.8 | 4.5 | 128 | 382 | |
3200U | 4 | 2 | 2.6 | 3.5 | 4 | | 327 |
3990X | 128 | 64 | 2.9 | 4.3 | 256 | 413 | |
1065G7 | 8 | 4 | 1.3 | 3.9 | 8 | ~429~ | 362 |
9750H | 12 | 6 | 2.6 | 4.5 | 12 | 429 | 220 |
5600U | 4 | 2 | 2.6 | 3.2 | 4 | 451 | 367 |
4500U | 6 | 6 | 2.3 | 4.0 | 8 | 475 | 217 |
4700U | 8 | 8 | 2.0 | 4.1 | 8 | | 224 |
3770 | 8 | 4 | 3.4 | 3.9 | 8 | 534 | |
2700K | 8 | 4 | 3.5 | 3.9 | 8 | | 297 |
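For anyone wanting to redo the rescaling of the struck-through entries, here is a minimal sketch. It assumes the old `--iterations 3` figure is simply the total for three equal-cost compiles, so dividing by the iteration count gives a one-iteration estimate; that assumption, the helper name and the sample input are all mine, not taken from the benchmark script.

```python
def rescale_to_single_iteration(total_time: float, iterations: int) -> float:
    """Estimate a one-iteration time from a run that timed several iterations.

    Assumes every iteration costs about the same, which ignores warm-up
    and thermal throttling effects.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    return total_time / iterations


# Hypothetical raw figure for an old --iterations 3 run (not a real result).
old_total = 1107.0
print(round(rescale_to_single_iteration(old_total, iterations=3)))
```

On a throttling laptop later iterations are likely slower than the first, so a rescaled figure of this kind would, if anything, overestimate the single-iteration time.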
Also, included Java Gradle build timings from openbenchmarking, as it's also compilation by a highly-optimised GC-based compiler.
UPDATE: added quite a bunch of new CPU results from openbenchmarking.
Thanks for putting in all this effort in getting the benchmark into openbenchmarking! Would really welcome the PR. Also gonna run this new script on our machine.
I do wonder if we should see whether we can use those "optimised" RTS settings `-qn8 -A32M`, since the default setting really penalizes the high-thread/core count CPUs, while it doesn't negatively affect the low-thread/core count CPUs.
I'll add a flag to use the optimised (as well as custom) RTS opts..
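As a rough sketch of what such an option could boil down to: GHC accepts runtime-system flags for the compiler process itself between `+RTS` and `-RTS`, so the optimised settings can be appended to the invocation. The `ghc_command` helper and the `OPTIMISED_RTS_OPTS` name below are hypothetical and do not reflect how the benchmark script is actually structured; source files and other arguments are left out.

```python
from typing import List, Optional


def ghc_command(jobs: int, rts_opts: Optional[List[str]] = None) -> List[str]:
    """Build a GHC argument list, optionally wrapping RTS flags in +RTS ... -RTS."""
    cmd = ["ghc", f"-j{jobs}"]
    if rts_opts:
        # Everything between +RTS and -RTS is consumed by GHC's own runtime.
        cmd += ["+RTS", *rts_opts, "-RTS"]
    return cmd


# -qn8 caps the parallel GC at 8 threads; -A32M sets a 32 MB allocation area.
OPTIMISED_RTS_OPTS = ["-qn8", "-A32M"]

print(" ".join(ghc_command(16, OPTIMISED_RTS_OPTS)))
print(" ".join(ghc_command(16)))  # default RTS settings
```

Keeping the RTS flags as a plain list makes it easy to accept custom values from a command-line flag as well as the optimised preset.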
Zen2 CPUs, such as the Ryzen 3900X and 3950X (as well as the newer Intel offerings, such as the i9-10900K), have made the published comparison severely out of date.
It would be quite nice to have a run of the same benchmarks against the newer options!
There is also quite some intrigue, since:
It's also worth noting, that GHC builds scale quite poorly to higher core counts.