llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[clang][regression] Cannot build the boost library for AArch64 Linux after 20d497c26fc95c80a1bacb38820d92e5f52bec58 #101525

Open pawosm-arm opened 2 months ago

pawosm-arm commented 2 months ago

The Boost library, even in its most recent sources, ships a clang.jam file that computes the target triple as follows:

local rule init-flags-cross ( toolset : condition * : architecture + : address-model + : target-os )
{
    local vendor = unknown ;
    local sys = unknown ;
    switch $(target-os)
    {
        case darwin : vendor = apple ; sys = darwin ;
        case linux : vendor = pc ; sys = linux ;
    }
    local vendor-sys = $(vendor)-$(sys) ;
    for local _architecture_ in $(architecture)
    {
        for local _address-model_ in $(address-model)
        {
            local arch = unknown ;
            switch $(_architecture_)-$(_address-model_)
            {
                case arm-64 : arch = arm64 ;
                case arm-32 : arch = arm ;
                case x86-64 : arch = x86_64 ;
                case x86-32 : arch = i386 ;
            }

            toolset.flags $(toolset)
                OPTIONS $(condition)/<target-os>$(target-os)/<architecture>$(_architecture_)/<address-model>$(_address-model_)
                : "--target=$(arch)-$(vendor-sys)"
                : unchecked ;
        }
    }
}

In effect, for AArch64 Linux the target triple is arm64-pc-linux. With a triple like that, after commit 20d497c26fc95c80a1bacb38820d92e5f52bec58 (identified by bisecting), the C++ compiler can no longer find the standard library headers (at least when they are provided by GCC's libstdc++).
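To make the mapping easier to follow, here is a minimal shell sketch that mirrors the two switch statements of the jam rule quoted above. The function name boost_clang_triple is hypothetical and is not part of Boost or clang; it only reproduces the string that ends up after --target=.

```shell
# Hypothetical helper (not part of Boost): mirrors the triple computed by
# the init-flags-cross rule in clang.jam.
# $1 = architecture (arm, x86), $2 = address model (32, 64), $3 = target OS
boost_clang_triple() (
    vendor=unknown
    sys=unknown
    case "$3" in
        darwin) vendor=apple; sys=darwin ;;
        linux)  vendor=pc;    sys=linux ;;
    esac

    arch=unknown
    case "$1-$2" in
        arm-64) arch=arm64 ;;
        arm-32) arch=arm ;;
        x86-64) arch=x86_64 ;;
        x86-32) arch=i386 ;;
    esac

    # Same shape as "--target=$(arch)-$(vendor-sys)" in the jam rule.
    printf '%s-%s-%s\n' "$arch" "$vendor" "$sys"
)

# Example: boost_clang_triple arm 64 linux
```

Note that the rule never emits an environment component such as gnu or musl, so on Linux the triple always has three parts.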

Consider an example file, std.cc:

#include <cstddef>

Try to compile it the way bjam would do it when building boost:

$ clang -c -x c++ -fvisibility-inlines-hidden -fPIC -pthread -O3 -Wall -fvisibility=hidden -Wno-inline -std=c++11 -mcpu=native -ffp-contract=fast -Wno-error=enum-constexpr-conversion -DBOOST_ALL_NO_LIB=1 -DBOOST_MPI_DYN_LINK=1 -DBOOST_MPI_PYTHON_DYN_LINK=1 -DBOOST_PYTHON_DYN_LINK=1 -DNDEBUG -I"." -I"/usr/include/python3.10" std.cc --target=arm64-pc-linux
std.cc:1:10: fatal error: 'cstddef' file not found
    1 | #include <cstddef>
      |          ^~~~~~~~~
1 error generated.

(The same happens if the aarch64-pc-linux triple is used.)

Now, replace the arm64-pc-linux target triple with the more appropriate aarch64-linux-gnu:

$ clang -c -x c++ -fvisibility-inlines-hidden -fPIC -pthread -O3 -Wall -fvisibility=hidden -Wno-inline -std=c++11 -mcpu=native -ffp-contract=fast -Wno-error=enum-constexpr-conversion -DBOOST_ALL_NO_LIB=1 -DBOOST_MPI_DYN_LINK=1 -DBOOST_MPI_PYTHON_DYN_LINK=1 -DBOOST_PYTHON_DYN_LINK=1 -DNDEBUG -I"." -I"/usr/include/python3.10" std.cc --target=aarch64-linux-gnu
$ file std.o
std.o: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), not stripped

(NB: using the arm64-gnu-linux or arm64-linux-gnu triples also triggers the wrong behavior.)

We could argue whether some triples are more sane than others, but sadly, boost is too popular and widely used to ignore its preference.

AaronBallman commented 2 months ago

CC @MaskRay

MaskRay commented 2 months ago

The Boost library should be fixed. If the intention is to use the system GCC, specify --target=$(gcc -dumpmachine) instead of making up possibly invalid triples as the snippet does.
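As a rough illustration of the --target=$(gcc -dumpmachine) suggestion, the sketch below wraps it in a small shell function. The name gcc_target_flag is hypothetical, and the sketch assumes that a gcc on PATH is the toolchain whose libstdc++ clang should pick up.

```shell
# Hypothetical helper: build a clang --target flag from the system GCC.
# Assumes `gcc` on PATH is the GCC whose headers and libraries should be used.
gcc_target_flag() {
    # -dumpmachine prints the triple GCC was configured for.
    gcc_triple=$(gcc -dumpmachine 2>/dev/null) || return 1
    [ -n "$gcc_triple" ] || return 1
    printf '%s\n' "--target=$gcc_triple"
}

# Example use with the std.cc file from this issue:
#   clang -c -x c++ std.cc "$(gcc_target_flag)"
```

The function fails instead of printing a guess when gcc is missing, so a build script can fall back to omitting --target altogether.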

"pc" as vendor seems to be used only by i386/x86-64 (per config.guess). The conventional target triple for Linux AArch64 is aarch64-unknown-linux-gnu. BTW, the canonical arch part is "aarch64" instead of "arm64" for Linux and most(?) BSDs.

clangDriver used to be very lenient about target triples: specify aarch64-pc-linux and it would probe aarch64-unknown-linux-gnu GCC installations. Over the past several years, the CollectLibDirsAndTriples hacks have been gradually tightened.
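For anyone debugging this locally, a small sketch that lists which triples have a GCC installation directory on the machine may help when choosing a --target value. It assumes the common layout <prefix>/lib/gcc/<triple>/<version>; the function name list_gcc_triples and the default prefix /usr are assumptions, and some distros use other locations.

```shell
# Hypothetical helper: list triples that have a GCC installation directory.
# Assumes the layout <prefix>/lib/gcc/<triple>/<version>; prefix defaults to /usr.
list_gcc_triples() (
    prefix=${1:-/usr}
    for dir in "$prefix"/lib/gcc/*/; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        basename "$dir"
    done
)

# Example use:
#   list_gcc_triples          # scan /usr
#   clang -v --target=aarch64-linux-gnu -x c++ -fsyntax-only std.cc
# The clang -v output includes a "Selected GCC installation" line when one is found.
```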

asl commented 2 months ago

> BTW, the canonical arch part is "aarch64" instead of "arm64" for Linux and most(?) BSD.

I believe arm64 is Apple-specific.

AaronBallman commented 2 months ago

I agree that Boost should be fixed, but Boost is also a very popular third-party library, and not being able to build it for a particular platform is a potentially serious problem. So the issue is invalid in a way, but should we work around it just the same as we do for other important third-party system headers?

pawosm-arm commented 2 months ago

> I agree that Boost should be fixed, but Boost is also a very popular third-party library, and not being able to build it for a particular platform is a potentially serious problem. So the issue is invalid in a way, but should we work around it just the same as we do for other important third-party system headers?

I fully agree, but I don't know how such situations were handled in the past. I expect a storm of complaints when LLVM 19 is out.

MaskRay commented 2 months ago

Several prominent AArch64 Linux target triples were missing, and Boost's clang.jam has never worked for at least these: aarch64-unknown-linux-gnu (most non-Debian-derivative Linux distros), aarch64-unknown-linux-musl (musl-based Linux distros), and aarch64-amazon-linux (as well as other vendors that customize the triple).

clang.jam did work with Debian/Ubuntu (aarch64-linux-gnu).

I think it's critical to make sure the Boost folks are aware of this and fix clang.jam. If the workaround is important, then #102039, which targets release/19.x only, can be merged.