crystal-lang / distribution-scripts

Upgrade to LLVM 13 #147

Closed by straight-shoota 1 year ago

straight-shoota commented 2 years ago

The latest LLVM version, 13.0.0, was released yesterday. It includes relevant fixes we've been waiting for.

A regression in LLVM 11 prevented us from upgrading to LLVM 11 or 12 (https://github.com/crystal-lang/crystal/issues/10359). The fix should be available in LLVM 13. It was also recently discovered that LLVM 13 should fix https://github.com/crystal-lang/crystal/issues/11047 (although that's less relevant here because we don't ship Windows builds yet).

One challenge is that LLVM 13 is not yet available in aports edge, and I'm not sure when it will be. I don't think it will be backported to Alpine 3.12, which we're still stuck with for now (https://github.com/crystal-lang/distribution-scripts/pull/127, https://github.com/crystal-lang/crystal/issues/10366). We still don't know the reasons for the issues experienced on Alpine > 3.12.

We can either pull in the llvm-13 packages from Alpine edge once they are available, or build LLVM 13 manually.

If we want to link the upcoming Crystal 1.2 builds with LLVM 13, we'll probably have to resort to the latter. We want to ship them next week.

This is obviously more effort. So maybe it's not worth it considering that we can stay with LLVM 10 for now to avoid https://github.com/crystal-lang/crystal/issues/10359. The most relevant fix in LLVM 13 at this point is really for Windows and we can already use that in CI.

Requires https://github.com/crystal-lang/crystal/issues/11277

kostya commented 2 years ago

From Discord: I'm seeing that Crystal built with LLVM 13 compiles things much faster.

Crystal from the releases page:

$ rm -rf ~/.cache/crystal
$ crystal -v
Crystal 1.4.0 [ef05e26d6] (2022-04-06)

LLVM: 10.0.0
Default target: x86_64-unknown-linux-gnu

$ time crystal build bench.cr --release

real    0m15,775s
user    0m15,755s
sys     0m0,144s

My own build:

$ rm -rf ~/.cache/crystal
$ ~/projects//crystal/bin/crystal -v
Using compiled compiler at /home/kostya/projects/crystal/.build/crystal
Crystal 1.4.0 [ef05e26d6] (2022-04-06)

LLVM: 13.0.0
Default target: x86_64-pc-linux-gnu

$ time ~/projects/crystal/bin/crystal build bench.cr --release
Using compiled compiler at /home/kostya/projects/crystal/.build/crystal

real    0m7,564s
user    0m7,541s
sys     0m0,109s

In the stats, Codegen (bc+obj) is about 2 times faster; everything else is similar.

Also, compiling the compiler in release mode takes 12 min with LLVM 10 and 5.4 min with LLVM 13.
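For reference, the speedup implied by two `time` results can be computed with a small shell helper. This is a minimal sketch, not part of the original measurements: the function names `to_seconds` and `speedup` are made up for illustration, and it assumes `time` prints durations in the `0m15,775s` form shown above (minutes and seconds only, comma as decimal separator).

```shell
#!/bin/sh
# Force the C locale so awk always parses and prints "." as the decimal point.
LC_ALL=C; export LC_ALL

# Convert a `time`-style duration such as "0m15,775s" into seconds.
# Hypothetical helper; does not handle an hours component.
to_seconds() {
  printf '%s\n' "$1" | tr ',' '.' |
    awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times faster the second run is than the first.
speedup() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.2f\n", a / b }'
}

# "real" times from the two runs above:
# official build (LLVM 10) vs. locally built compiler (LLVM 13).
speedup "0m15,775s" "0m7,564s"
```

With the two `real` values quoted above, this prints 2.09, which matches the "about 2 times faster" observation for the `bench.cr` build.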

straight-shoota commented 2 years ago

LLVM 13 has just been added to Alpine edge. So there's some progress.

I'm quite surprised about this performance improvement. Not sure what's causing this. I would not expect such a huge effect from upgrading LLVM from 10 to 13.

And frankly, I can't reproduce it on my system. I see roughly the same build time regardless of LLVM version used in the compiler. Can you share some system information?

kostya commented 2 years ago

Ryzen 3800X, Ubuntu 21.10 in VirtualBox on Windows (performance is the same as on native Ubuntu thanks to AMD-V; I've checked this many times before). Did you use the version downloaded from the GitHub releases page? Maybe something in it isn't built in release mode?

kostya commented 2 years ago

Just tested again: I downloaded https://github.com/crystal-lang/crystal/releases/download/1.4.0/crystal-1.4.0-1-linux-x86_64.tar.gz and put it into PATH.

In the Crystal source directory, at tag 1.4.0:

$ rm -rf ~/.cache/crystal/
$ ./bin/crystal -v
Crystal 1.4.0 [ef05e26d6] (2022-04-06)

LLVM: 10.0.0
Default target: x86_64-unknown-linux-gnu

$ time make clean crystal release=1 stats=1
rm -rf .build
rm -rf ./docs
rm -rf src/llvm/ext/llvm_ext.o
Using /usr/bin/llvm-config-13 [version= 13.0.0]
g++ -c  -o src/llvm/ext/llvm_ext.o src/llvm/ext/llvm_ext.cc -I/usr/lib/llvm-13/include -std=c++14   -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
CRYSTAL_CONFIG_BUILD_COMMIT="ef05e26d6" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1649265695" CRYSTAL_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -D strict_multi_assign --release --stats -Dwithout_interpreter  -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Parse:                             00:00:00.000043031 (   0.76MB)
Semantic (top level):              00:00:00.459027825 ( 122.93MB)
Semantic (new):                    00:00:00.002437981 ( 122.93MB)
Semantic (type declarations):      00:00:00.043583599 ( 138.93MB)
Semantic (abstract def check):     00:00:00.016040574 ( 154.93MB)
Semantic (ivars initializers):     00:00:07.218215007 (1106.86MB)
Semantic (cvars initializers):     00:00:00.012205772 (1106.86MB)
Semantic (main):                   00:00:12.133006871 (1282.86MB)
Semantic (cleanup):                00:00:00.000696819 (1282.86MB)
Semantic (recursive struct check): 00:00:00.001659750 (1282.86MB)
Codegen (crystal):                 00:00:07.200266114 (1546.86MB)
Codegen (bc+obj):                  00:10:04.382928159 (1546.86MB)
Codegen (linking):                 00:00:00.974312303 (1546.86MB)

Macro runs:
 - /home/kostya/projects/crystal/src/ecr/process.cr: 00:00:10.691528949

Codegen (bc+obj):
 - no previous .o files were reused

real    10m35,210s
user    10m33,735s
sys     0m4,737s

After, rebuilding with the freshly compiled compiler (LLVM 13):

$ rm -rf ~/.cache/crystal/
$ ./bin/crystal -v
Using compiled compiler at .build/crystal
Crystal 1.4.0 [ef05e26d6] (2022-04-06)

LLVM: 13.0.0
Default target: x86_64-pc-linux-gnu

$ time CRYSTAL_CONFIG_BUILD_COMMIT="ef05e26d6" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1649265695" CRYSTAL_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -D strict_multi_assign --release --stats -Dwithout_interpreter  -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Using compiled compiler at .build/crystal
Parse:                             00:00:00.000047289 (   0.76MB)
Semantic (top level):              00:00:00.296839306 ( 123.99MB)
Semantic (new):                    00:00:00.002100901 ( 123.99MB)
Semantic (type declarations):      00:00:00.042825347 ( 139.99MB)
Semantic (abstract def check):     00:00:00.013856054 ( 155.99MB)
Semantic (ivars initializers):     00:00:05.212381241 (1107.92MB)
Semantic (cvars initializers):     00:00:00.010233332 (1107.92MB)
Semantic (main):                   00:00:06.569793776 (1283.92MB)
Semantic (cleanup):                00:00:00.000698519 (1283.92MB)
Semantic (recursive struct check): 00:00:00.001459019 (1283.92MB)
Codegen (crystal):                 00:00:04.946671101 (1547.92MB)
Codegen (bc+obj):                  00:05:23.528932942 (1547.92MB)
Codegen (linking):                 00:00:01.003124368 (1547.92MB)

Macro runs:
 - /home/kostya/projects/crystal/src/ecr/process.cr: 00:00:05.252160888

Codegen (bc+obj):
 - no previous .o files were reused

real    5m41,763s
user    5m41,447s
sys     0m2,798s

ysbaddaden commented 2 years ago

Reproduced: Intel Haswell laptop, ElementaryOS 5 (based on Ubuntu 18.04, Linux 5.4, glibc 2.27), LLVM 13.0.1 installed from LLVM's APT repository. I'm compiling Crystal 1.4.1.

With Crystal 1.4.1 official DEB package:

$ bin/crystal  --version
Crystal 1.4.1 [b7377c041] (2022-04-22)

LLVM: 10.0.0
Default target: x86_64-unknown-linux-gnu

$ make clean crystal release=1 stats=1
rm -rf .build
rm -rf ./docs
rm -rf src/llvm/ext/llvm_ext.o
Using /usr/bin/llvm-config-13 [version= 13.0.1]
g++ -c  -o src/llvm/ext/llvm_ext.o src/llvm/ext/llvm_ext.cc -I/usr/lib/llvm-13/include -std=c++14   -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
CRYSTAL_CONFIG_BUILD_COMMIT="b7377c041" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1650653870" CRYSTAL_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -D strict_multi_assign --release --stats -Dwithout_interpreter  -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Parse:                             00:00:00.000059032 (   0.76MB)
Semantic (top level):              00:00:00.779523308 ( 123.14MB)
Semantic (new):                    00:00:00.004029681 ( 123.14MB)
Semantic (type declarations):      00:00:00.085297779 ( 139.14MB)
Semantic (abstract def check):     00:00:00.026972275 ( 155.14MB)
Semantic (ivars initializers):     00:00:12.114145923 (1107.07MB)
Semantic (cvars initializers):     00:00:00.019897106 (1107.07MB)
Semantic (main):                   00:00:15.740714243 (1283.07MB)
Semantic (cleanup):                00:00:00.001150210 (1283.07MB)
Semantic (recursive struct check): 00:00:00.002728421 (1283.07MB)
Codegen (crystal):                 00:00:10.825581444 (1547.07MB)
Codegen (bc+obj):                  00:16:34.960303616 (1547.07MB)
Codegen (linking):                 00:00:02.053159918 (1547.07MB)

Macro runs:
 - /home/github/crystal/src/ecr/process.cr: 00:00:13.788826770

Codegen (bc+obj):
 - no previous .o files were reused

With the freshly generated Crystal with LLVM 13:

$ CRYSTAL_CONFIG_BUILD_COMMIT="b7377c041" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1650653870" CRYSTAL_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -D strict_multi_assign --release --stats -Dwithout_interpreter  -o .build/crystal src/compiler/crystal.cr -D without_openssl -D without_zlib
Using compiled compiler at .build/crystal
Parse:                             00:00:00.000126235 (   0.76MB)
Semantic (top level):              00:00:00.459376204 ( 123.14MB)
Semantic (new):                    00:00:00.003528789 ( 123.14MB)
Semantic (type declarations):      00:00:00.084022735 ( 139.14MB)
Semantic (abstract def check):     00:00:00.030140651 ( 155.14MB)
Semantic (ivars initializers):     00:00:07.994777631 (1107.07MB)
Semantic (cvars initializers):     00:00:00.023465840 (1107.07MB)
Semantic (main):                   00:00:09.633619318 (1283.07MB)
Semantic (cleanup):                00:00:00.001162493 (1283.07MB)
Semantic (recursive struct check): 00:00:00.002510932 (1283.07MB)
Codegen (crystal):                 00:00:07.721239395 (1547.07MB)
Codegen (bc+obj):                  00:08:20.788693715 (1547.07MB)
Codegen (linking):                 00:00:01.815475772 (1547.07MB)

Macro runs:
 - /home/github/crystal/src/ecr/process.cr: 00:00:07.561613304

Codegen (bc+obj):
 - no previous .o files were reused

This is a dramatic improvement. From 17+ minutes down to 9 minutes on my Haswell laptop.

Given that it impacts all stages, it's not directly related to LLVM. I heard that musl-libc was meant to be correct rather than fast, but still, that's too large a difference. Maybe something to do with memory allocations? I think it linked a locally installed BDWGC 8.1.0 library.
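To make the "all stages" point concrete, the per-stage ratios can be computed from the `--stats` durations of the two runs above. This is only a sketch: `hms_to_seconds` and `stage_speedup` are hypothetical helper names, and the durations are copied by hand from the two outputs rather than parsed from the compiler.

```shell
#!/bin/sh
# Force the C locale so awk always parses and prints "." as the decimal point.
LC_ALL=C; export LC_ALL

# Convert a --stats duration such as "00:16:34.960303616" into seconds.
hms_to_seconds() {
  printf '%s\n' "$1" |
    awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Print one comparison row: stage name, time before, time after, ratio.
stage_speedup() {
  awk -v n="$1" -v a="$(hms_to_seconds "$2")" -v b="$(hms_to_seconds "$3")" \
    'BEGIN { printf "%-30s %8.1fs -> %8.1fs  (%.2fx)\n", n, a, b, a / b }'
}

# Official 1.4.1 package (LLVM 10) vs. the freshly built compiler (LLVM 13).
stage_speedup "Semantic (ivars initializers)" 00:00:12.114145923 00:00:07.994777631
stage_speedup "Semantic (main)"               00:00:15.740714243 00:00:09.633619318
stage_speedup "Codegen (crystal)"             00:00:10.825581444 00:00:07.721239395
stage_speedup "Codegen (bc+obj)"              00:16:34.960303616 00:08:20.788693715
```

The semantic stages, which do not call into LLVM, come out roughly 1.5x to 1.6x faster, while Codegen (bc+obj) is about 2x faster.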

kostya commented 2 years ago

On a MacBook M1, with Crystal installed from Homebrew:

Crystal 1.4.0 (2022-04-06)

LLVM: 13.0.1
Default target: aarch64-apple-darwin20.6.0

Both compiling and recompiling the release compiler take 3.3 min.

straight-shoota commented 2 years ago

@kostya Did you recompile on M1 with LLVM 13 (which the compiler from homebrew already uses), or with LLVM 10?

kostya commented 2 years ago

Both used LLVM 13. Also, I don't see LLVM 10 in Homebrew, only 7, 8, 9, 11, 12, and 13.

ysbaddaden commented 2 years ago

Reproduced when compiling other Crystal programs. For example, a test suite I tried goes from 30s to 18s with an empty cache (6.5s to 4.2s with a full cache).

Getting a boost with a Crystal compiler generated using LLVM 10 seems to imply that this has nothing to do with LLVM 13. We can probably reproduce the same by recompiling Crystal with LLVM 10.

It sounds more like a difference between Alpine/musl and Ubuntu/glibc: a library, or maybe the way the LLVM library was compiled and which features are enabled by default. Or maybe libgc, the underlying musl vs glibc, or (why not) Spectre mitigations?

kostya commented 2 years ago

I compiled this on Linux with different Crystal versions:

require "json"

struct A
  include JSON::Serializable
  getter a : Int32
end

p A.from_json(%Q<{"a":1}>)

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-0.25.1-1/bin/crystal build 1.cr --release

real    0m3,870s
user    0m3,842s
sys     0m0,076s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-0.30.0-1/bin/crystal build 1.cr --release

real    0m3,654s
user    0m3,683s
sys     0m0,043s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-0.31.0-1/bin/crystal build 1.cr --release

real    0m7,401s
user    0m7,413s
sys     0m0,058s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-0.32.0-1/bin/crystal build 1.cr --release

real    0m7,535s
user    0m7,519s
sys     0m0,091s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-0.34.0-1/bin/crystal build 1.cr --release

real    0m7,569s
user    0m7,550s
sys     0m0,095s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-1.0.0-1/bin/crystal build 1.cr --release

real    0m10,458s
user    0m10,431s
sys     0m0,106s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-1.2.2-1/bin/crystal build 1.cr --release

real    0m10,823s
user    0m10,643s
sys     0m0,110s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-1.3.2-1/bin/crystal build 1.cr --release

real    0m12,721s
user    0m12,608s
sys     0m0,103s

$ rm -rf ~/.cache/crystal/ && time ~/Downloads/crystal-1.4.0-1/bin/crystal build 1.cr --release

real    0m13,276s
user    0m13,277s
sys     0m0,127s

$ rm -rf ~/.cache/crystal/ && time ~/projects/crystal/bin/crystal build 1.cr --release
Using compiled compiler at .build/crystal

real    0m6,367s
user    0m6,350s
sys     0m0,101s

The last run is 1.4.0 recompiled. Funny that the current official compiler is almost 4 times slower than 0.30.0 was (13.3s vs 3.7s).
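The timings above can be put side by side relative to a chosen baseline. The following is a rough sketch under stated assumptions: `relative_times` is a hypothetical helper, the `real` values are copied by hand from the runs above (with the decimal comma replaced by a point), and `1.4.0-rebuilt` is just a label for the locally recompiled 1.4.0.

```shell
#!/bin/sh
# Force the C locale so awk always parses and prints "." as the decimal point.
LC_ALL=C; export LC_ALL

# Read "version seconds" rows from stdin and print each row together with
# its time relative to the baseline version given as the first argument.
relative_times() {
  awk -v base_name="$1" '
    { name[NR] = $1; secs[NR] = $2; if ($1 == base_name) base = $2 }
    END {
      if (base == "") {
        print "baseline not found: " base_name > "/dev/stderr"
        exit 1
      }
      for (i = 1; i <= NR; i++)
        printf "%-14s %7.3fs  %.2fx\n", name[i], secs[i], secs[i] / base
    }'
}

# "real" times for `crystal build 1.cr --release`, taken from the runs above.
relative_times 0.30.0 <<'EOF'
0.25.1 3.870
0.30.0 3.654
0.31.0 7.401
0.32.0 7.535
0.34.0 7.569
1.0.0 10.458
1.2.2 10.823
1.3.2 12.721
1.4.0 13.276
1.4.0-rebuilt 6.367
EOF
```

Relative to 0.30.0, the official 1.4.0 build comes out at about 3.63x and the recompiled 1.4.0 at about 1.74x.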

straight-shoota commented 2 years ago

Let's continue the discussion about potential performance improvements in https://github.com/crystal-lang/crystal/issues/12060