[NativeAOT-LLVM] Initial version of a multi-threaded compiler

SingleAccretion commented 1 year ago

This change implements the multi-threading scheme described in #2293.

We partition the incoming code into 8 modules, regardless of the --parallelism setting, which achieves the desired determinism characteristics. These modules are compiled on multiple threads both when generating LLVM and when generating machine code (i.e. invoking clang).
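To illustrate why a fixed module count gives deterministic output, here is a minimal Python sketch (not the ILC implementation). The hash-based assignment and the names `partition`, `compile_module` and `compile_all` are hypothetical; the actual partitioning scheme is the one described in #2293. The point it shows is that the parallelism setting only changes how many workers run, not which methods land in which module.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

MODULE_COUNT = 8  # fixed, independent of the parallelism setting


def partition(method_names, module_count=MODULE_COUNT):
    """Assign each method to a module using a stable hash of its name.

    Assumption: hashing by name is only an illustration of a
    scheduling-independent assignment, not the scheme ILC uses.
    """
    modules = [[] for _ in range(module_count)]
    for name in sorted(method_names):
        index = zlib.crc32(name.encode("utf-8")) % module_count
        modules[index].append(name)
    return modules


def compile_module(index, methods):
    """Stand-in for code generation: returns a deterministic 'artifact'."""
    return f"module{index}:" + ",".join(methods)


def compile_all(method_names, parallelism):
    """Compile all modules using at most `parallelism` worker threads."""
    modules = partition(method_names)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() yields results in input order, so the output does not
        # depend on which thread finishes first.
        return list(pool.map(compile_module, range(len(modules)), modules))
```

With this shape, `compile_all(names, 1)` and `compile_all(names, 8)` produce identical results, because the module contents are decided before any thread starts.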

The performance results of this work can be summarized as "modest", though some gains, especially in Release builds, are quite visible (up to 2x speedups on my machine). For reasons that are not yet clear, parallelism inside ILC itself is not fully utilized. However, that is also not the bottleneck of the build: especially in Debug, a lot of time is spent in wasm-emscripten-finalize.exe. Time is also spent on disk interactions; ideally, the next step would be to move the Clang part of the build in-process (or do something like ThinLTO).

I have put together a small sheet of the results as they apply to HelloWasm on my machine:

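| Module count | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | NAOT ILC / 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total compilation time (Release) | 01:03.3 | 42.0 | 37.8 | 33.7 | 33.1 | 32.8 | 30.9 | 30.3 | 26.6 |
| | 01:04.7 | 41.9 | 39.2 | 33.1 | 32.5 | 32.9 | 31.6 | 30.1 | 27.6 |
| Bitcode compilation time (Release) | 6.66 | 5.36 | 5.04 | 4.75 | 4.58 | 4.57 | 4.45 | 4.55 | 3.84 |
| | 7.72 | 5.34 | 4.84 | 4.78 | 5 | 4.6 | 4.46 | 4.53 | 3.71 |
| WASM file size | 4.672 | 4.537 | 4.5 | 4.483 | 4.467 | 4.451 | 4.457 | 4.455 | |
| Total compilation time (Debug) | 1:05.9 | 53.4 | 51.8 | 49.6 | | | | 49.3 | 46.6 |
| | 1:04.9 | 53.1 | 52.9 | | | | | | 44.7 |
| Bitcode compilation time (Debug) | 13.65 | 11.05 | 10.72 | 10.14 | | | | 9.92 | 5.62 |
| | 13.82 | 11.12 | 10.65 | | | | | | 5.31 |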

Note that the measurements were done using a CoreCLR-based ILC (except for the NAOT / 8 column).
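For reference, the speedups implied by the first row of total compilation times can be checked with a few lines of Python. This is only a sketch: it assumes the times are in `mm:ss.s` or plain seconds (the sheet does not state units), and the helper `to_seconds` is a name introduced here for illustration.

```python
def to_seconds(text):
    """Parse 'mm:ss.s' or 'ss.s' into seconds (assumed format)."""
    minutes, _, seconds = text.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


# First measurement row for 1..8 modules, copied from the sheet above.
release_total = ["01:03.3", "42.0", "37.8", "33.7", "33.1", "32.8", "30.9", "30.3"]
# Debug was only measured for 1, 2, 3, 4 and 8 modules.
debug_total = {1: "1:05.9", 2: "53.4", 3: "51.8", 4: "49.6", 8: "49.3"}

release_base = to_seconds(release_total[0])
release_speedups = [round(release_base / to_seconds(t), 2) for t in release_total]

debug_base = to_seconds(debug_total[1])
debug_speedups = {n: round(debug_base / to_seconds(t), 2) for n, t in debug_total.items()}

print("Release:", release_speedups)
print("Debug:", debug_speedups)
```

On these numbers, 8 modules come out at roughly 2.09x over a single module in Release and roughly 1.34x in Debug, which matches the "up to 2x" summary above.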

SingleAccretion commented 1 year ago

@dotnet/nativeaot-llvm

SingleAccretion commented 1 year ago

Let me add a test first here.

SingleAccretion commented 1 year ago

@dotnet/nativeaot-llvm

dotnet / runtimelab

[NativeAOT-LLVM] Initial version of a multi-threaded compiler #2297