This is a recipe for reproducibly building an LLVM/Clang/LLD based mingw-w64 toolchain.
Benefits of an LLVM based MinGW toolchain are:
- Support for Control Flow Guard (-mguard=cf compile and link flags)
Clang on its own can also be used as a compiler in the normal GNU binutils based environments though, so the main difference lies in replacing binutils with LLVM based tools.
The GitHub Releases page contains prebuilt toolchains that can be downloaded and installed by just unpacking them.
They come primarily in two different forms. Packages named llvm-mingw-<version>-<crt>-ubuntu-<distro_version>-<arch>.tar.xz are cross compilers that run on Linux and compile binaries for any of the 4 target Windows architectures. Packages named llvm-mingw-<version>-<crt>-<arch>.zip are native toolchains that run on Windows (with binaries in the specified architecture), each of which can compile binaries for any of the 4 architectures.
The cross compilers come in versions running on either x86_64 or aarch64. (They're built on Ubuntu, but hopefully do run on other contemporary distributions as well.)
There are packages with two different choices of CRT (C runtime). The primary target is UCRT (the Universal C Runtime). The UCRT is available preinstalled since Windows 10, but can be installed on top of Vista or newer. The other, legacy alternative is msvcrt, which produces binaries for (and uses) msvcrt.dll, which is a built-in component in all versions of Windows. This allows running directly out of the box on older versions of Windows too, without ensuring that the UCRT is installed, but msvcrt.dll is generally less featureful. Address Sanitizer only works properly with UCRT.
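The package naming scheme described above can be sketched as a small shell helper. This is only an illustration: the function name llvm_mingw_package is hypothetical, and VERSION and DISTRO in the usage lines are placeholders rather than actual release values.

```shell
# Hypothetical helper (not part of llvm-mingw): compose a release package
# name from <version>, <crt>, <arch> and an optional Ubuntu <distro_version>.
llvm_mingw_package() {
    if [ -n "${4:-}" ]; then
        # With a distro version: Linux-hosted cross compiler (.tar.xz)
        printf 'llvm-mingw-%s-%s-ubuntu-%s-%s.tar.xz\n' "$1" "$2" "$4" "$3"
    else
        # Without one: native toolchain that runs on Windows (.zip)
        printf 'llvm-mingw-%s-%s-%s.zip\n' "$1" "$2" "$3"
    fi
}

# Placeholder values; check the Releases page for the real ones.
llvm_mingw_package VERSION ucrt x86_64 DISTRO
llvm_mingw_package VERSION msvcrt aarch64
```

The first call prints a cross compiler package name, the second a native Windows package name.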
In addition to the downloadable toolchain packages, there are also prebuilt Docker Linux images containing the llvm-mingw toolchain, available from Docker Hub.
There are also nightly builds with the very latest versions of LLVM and mingw-w64 from git.
The toolchain can be compiled for installation in the current Unix environment, fetching sources as needed:
./build-all.sh <target-dir>
It can also be built, reproducibly, into a Docker image:
docker build .
Individual components of the toolchain can be (re)built by running the standalone shell scripts listed within build-all.sh. However, if the source is already checked out, no effort is made to check out a different version (if the build scripts have been updated to prefer a different version). Likewise, if configure flags in the build-*.sh scripts have changed, you might need to wipe the build directory under each project for the new configure options to take effect.
To build in MSYS2, install the following set of packages with pacman -S --needed:
git wget mingw-w64-x86_64-gcc mingw-w64-x86_64-ninja mingw-w64-x86_64-cmake make mingw-w64-x86_64-python3 autoconf libtool
The toolchain currently supports both C and C++, including support for exception handling.
LLD, the LLVM linker, is what causes most of the major differences to the normal GCC/binutils based MinGW.
build-mingw-w64.sh though. The Universal CRT is only available out of the box since Windows 10, but can be installed on Vista or newer. For x86, there are also releases that run on msvcrt.dll.
___chkstk_ms, __alloca or ___divdi3.)
For such targets, libtool tries to detect which libraries to link by invoking the compiler with $CC -v and picking up the libraries that are linked by default, and then invoking the linker driver with -nostdlib and specifying the default libraries manually. In doing so, libtool fails to detect when clang is using compiler_rt instead of libgcc, because clang refers to it as an absolute path to a static library, instead of specifying a library path with -L and linking the library with -l.
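The distinction described above can be illustrated with a rough shell sketch. It is not libtool's actual detection code. It only separates arguments that a scan for -L and -l would notice from static libraries given as absolute paths. The function name classify_link_args and the paths in the usage lines are hypothetical and are not real clang output.

```shell
# Split linker arguments into -L/-l flags and static libraries that are
# passed as absolute paths. Anything else is ignored.
classify_link_args() {
    for arg in "$@"; do
        case "$arg" in
            -L*|-l*) echo "flag: $arg" ;;
            /*.a)    echo "path: $arg" ;;
        esac
    done
}

# Hypothetical arguments for illustration only.
classify_link_args -L/opt/toolchain/lib -lmingw32 \
    /opt/toolchain/lib/libclang_rt.builtins-x86_64.a -lmsvcrt
```

A tool that only collects the "flag:" entries would miss the compiler_rt builtins library in the "path:" entry, which is the failure mode described above.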
Clang is reluctant to change this behaviour. A bug has been filed with libtool, but no fix has been committed, and as libtool files are shipped with the projects that use them (bundled within the configure script), one has to update the configure script in each project to avoid the issue. This can be done either by installing libtool, patching it and running autoreconf -fi in the project, or by manually applying the fix to the shipped configure script. A patched version of libtool is shipped in MSYS2 at least.
lld-link: error: .libs\libfoobar.la.lnkscript: unknown file type.
To fix this, the bundled libtool scripts have to be fixed as explained above, but this fix requires changes both to configure and to a separate file named ltmain.{in,sh}. A fix for this is also shipped in MSYS2.
Additionally, one may run into other minor differences between GCC and clang.
LLVM does support generating debug info in the PDB format. Since GNU binutils based mingw environments don't support this, there's no precedent for what command line parameters to use for this, and llvm-mingw produces debug info in DWARF format by default.
To produce debug info in PDB format, you currently need to make the following changes:
- Add -gcodeview to the compilation commands (e.g. in wrappers/clang-target-wrapper.sh), together with using -g as usual to enable debug info in general.
- Add -Wl,--pdb= to linking commands. This creates a PDB file at the same location as the output EXE/DLL, but with a PDB extension. (By passing -Wl,--pdb=module.pdb one can explicitly specify the name of the output PDB file.)
Even though LLVM supports this, there are some caveats with using it when building in MinGW mode; Microsoft debuggers might have assumptions about the C++ ABI used, which don't hold up with the Itanium ABI used in MinGW.
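The two flags described above can be combined into a single compile-and-link command, sketched here as a helper that prints the command instead of running it, so it can be inspected without a toolchain installed. The function name pdb_build_command, the driver name x86_64-w64-mingw32-clang and the file names hello.c and hello.exe are assumptions for illustration.

```shell
# Hypothetical helper: print a compile-and-link command that enables PDB
# debug info. Arguments: compiler driver, source file, output file.
pdb_build_command() {
    cc=$1
    src=$2
    out=$3
    # -g enables debug info, -gcodeview selects the CodeView format, and
    # -Wl,--pdb= asks the linker to write a PDB next to the output file.
    echo "$cc -g -gcodeview $src -o $out -Wl,--pdb="
}

# Assumed driver and file names.
pdb_build_command x86_64-w64-mingw32-clang hello.c hello.exe
```

To choose the PDB file name explicitly, replace -Wl,--pdb= with -Wl,--pdb=module.pdb as noted above.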