crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.2k stars 1.61k forks source link

Compiler: enable parallel codegen with MT #14748

Open ysbaddaden opened 6 days ago

ysbaddaden commented 6 days ago

Implements parallel codegen of object files when MT is enabled in the compiler (-Dpreview_mt).

It only impacts codegen for compilations with more than one compilation unit (module), that is when neither of --single-module, --release or --cross-compile is specified. This behavior is identical to the fork based codegen.

Advantages:

The main points are increased portability and simpler logic, despite having to take care of LLVM thread safety quirks (see comments).

Issues:

  1. The threads arg actually depicts the number of fibers, not threads, which is confusing and problematic: increasing threads but not CRYSTAL_WORKERS will lead to more fibers than threads, with fibers being sheduled on the same threads, which won't bring any improvement.

    In fact CRYSTAL_WORKERS defaults to 4, when threads defaulted to 8. With this patch it defaults to CRYSTAL_WORKERS, so MT can end up being slower if we don't specify CRYSTAL_WORKERS=8.

  2. This is still not as efficient as it could be. The main fiber (that feeds the worker fibers) can get blocked by a worker fiber doing codegen, leading the other workers to starve. This is easily noticeable when compiling with -O1 for example.

Both issues will be fixable with RFC 2 where we can start an explicit context to run the worker fibers or start N isolated contexts (maybe a better idea). Until then, one should increase CRYSTAL_WORKERS.

Supersedes #14227 and doesn't segfault (so far) with LLVM 18.1 :crossed_fingers:

TODO:

straight-shoota commented 5 days ago

This looks great. But it also seems to be a mix of different changes. Could we extract the independent refactorings (such as extracting sequential_codegen and fork_codegen, memoization of some methods) to their own PRs?