llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.05k stars 11.58k forks source link

Can't reopen pull/86737 / work out why arm bot fails building sanitizer if builtins is fixed #87206

Open JonChesterfield opened 5 months ago

JonChesterfield commented 5 months ago

Changing compiler-rt builtins to succeed without a libc (https://github.com/llvm/llvm-project/pull/86737) breaks the arm bot (https://lab.llvm.org/buildbot/#/builders/178/builds/7117) while trying to build sanitizers,

/home/tcwg-buildbot/worker/clang-armv8-lld-2stage/stage1/./bin/clang++ --target=armv8l-unknown-linux-gnueabihf -DHAVE_RPC_XDR_H=1 -D_DEBUG -D_FILE_OFFSET_BITS=64 -D_GLIBCXX_ASSERTIONS -D_LARGEFILE_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-armv8-lld-2stage/llvm/compiler-rt/lib/sanitizer_common/.. -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -Wno-unused-parameter -O3 -DNDEBUG -march=armv7-a -mfloat-abi=soft -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -ftrivial-auto-var-init=pattern -nostdinc++ -Wno-format -fno-rtti -Wframe-larger-than=570 -Wglobal-constructors -std=c++17 -MD -MT compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommon.arm.dir/sanitizer_deadlock_detector2.cpp.o -MF compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommon.arm.dir/sanitizer_deadlock_detector2.cpp.o.d -o compiler-rt/lib/sanitizer_common/CMakeFiles/RTSanitizerCommon.arm.dir/sanitizer_deadlock_detector2.cpp.o -c /home/tcwg-buildbot/worker/clang-armv8-lld-2stage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector2.cpp
In file included from /home/tcwg-buildbot/worker/clang-armv8-lld-2stage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector2.cpp:13:
In file included from /home/tcwg-buildbot/worker/clang-armv8-lld-2stage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_deadlock_detector_interface.h:22:
In file included from /home/tcwg-buildbot/worker/clang-armv8-lld-2stage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h:15:
In file included from /home/tcwg-buildbot/worker/clang-armv8-lld-2stage/llvm/compiler-rt/lib/sanitizer_common/sanitizer_platform.h:25:
In file included from /usr/include/features.h:485:
/usr/include/arm-linux-gnueabihf/gnu/stubs.h:7:11: fatal error: 'gnu/stubs-soft.h' file not found
    7 | # include <gnu/stubs-soft.h>
      |           ^~~~~~~~~~~~~~~~~~

Various other files fail similarly. My best guess is the bot doesn't have a sysroot set up at this point, which causes the compiler-rt builtins build to fail, and somehow that's considered a success and a reason to skip over building the sanitizers, which is presumably fine since the bot goes green and the sanitizers wouldn't have a gnu/stubs-soft.h header as a result of tolerating a missing libc.

This makes very little sense to me so I assume the error lies in cmake somewhere, unfortunately I can't find the invocation driving the failing CI build.

Creating this issue because I can't reopen a PR (?) and otherwise I'm likely to forget about it. @luporl I'm hoping you have some more information on this.

luporl commented 5 months ago

@JonChesterfield, my understanding is that the bot has an armhf sysroot installed, but, with #86737, we try to build sanitizers for targets using soft-float, as seen in the command line above, which has -mfloat-abi=soft in it. The problem is that this requires soft-float headers such as gnu/stubs-soft.h, that are absent in a hard-float sysroot. Without #86737, -mfloat-abi=hard is used instead and the sanitizers are built successfully.

Another thing that called my attention is that, with #86737, there are many more Compiler-RT supported architectures: Compiler-RT supported architectures: armv4t;armv6m;arm;armhf while without it, we have only armhf: Compiler-RT supported architectures: armhf

So it seems to me that, with the new test program in #86737, the build system is wrongly deducing that an armhf sysroot can be used to build soft-float binaries.

JonChesterfield commented 5 months ago

Interesting! The presence of <stdio.h> is a very indirect test for soft float. Maybe the abi is encoded in the name of the sysroot and only one of the four strings tried hits a sysroot that exists.

config.ix.cmake

set(COMPILER_RT_SUPPORTED_ARCH) 
... loads of cmake
message(STATUS "Compiler-RT supported architectures: ${COMPILER_RT_SUPPORTED_ARCH}") 

There's call stack involved that sort of makes sense. test_targets() is in base-confg-ix.cmake. That branches on a platform/target sort of string. This looks like your branch:

test_target_arch(armv4t "" "-march=armv4t" "-mfloat-abi=soft")
test_target_arch(armv6m "" "-march=armv6m" "-mfloat-abi=soft")
test_target_arch(arm "" "-march=armv7-a" "-mfloat-abi=soft")
test_target_arch(armhf "" "-march=armv7-a" "-mfloat-abi=hard")

test_target_arch is in CompilerRTUtils.cmake. That sets a variable like CAN_TARGET_armv4t and then conditionally appends that to COMPILER_RT_SUPPORTED_ARCH.

There's an inline source file using limits.h here which should probably be at the top level alongside the SIMPLE_SOURCE that looks for printf, used for TEST_COMPILE_ONLY (?, haven't found that yet).

What's the proper way to work out if compiler-rt supports a given arm dialect?

If the builtins can be compiled without a pre-existing sysroot (which seems like a good idea, I believe compiler-rt builtins is the implementation of soft float) then the answer probably lies in the very branchy code about different types of sanitizer in the top level CMakeLists.txt. I'm on shaky ground there though.

Is the bot trying to build the sanitizers for armhf and thus doesn't want to switch off building the sanitizers, and while it doesn't mind building builtins for extra architectures that isn't particularly a feature?

luporl commented 5 months ago

What's the proper way to work out if compiler-rt supports a given arm dialect?

I'm not that familiar with compiler-rt, but the previous behavior of supporting only the architectures for which the used toolchain was able to build compiler-rt successfully seemed a good approach. The bot error was being caused by sanitizer sources including "features.h", from libc. Maybe the builtins can be built without libc, but it seems the sanitizers need it.

Is the bot trying to build the sanitizers for armhf and thus doesn't want to switch off building the sanitizers, and while it doesn't mind building builtins for extra architectures that isn't particularly a feature?

Given the bot's CMake invocation, that has -DCOMPILER_RT_BUILD_SANITIZERS=OFF in it, I wasn't expecting it to try to build the sanitizers. I haven't investigated why this is happening. Is it missing another needed flag here? AFAIK, this bot cares about building compiler-rt for armhf only.

luporl commented 5 months ago

Even with -DCOMPILER_RT_BUILD_SANITIZERS=OFF, sanitizer common is built. Adding -DCOMPILER_RT_BUILD_XRAY=OFF -DCOMPILER_RT_BUILD_MEMPROF=OFF to CMake, sanitizer common build is skipped, but then the same failure occurs during fuzzer build.

It is possible to build compiler-rt with #86737 only by also disabling memprof, libfuzzer, orc and profile: -DCOMPILER_RT_BUILD_LIBFUZZER=OFF -DCOMPILER_RT_BUILD_ORC=OFF -DCOMPILER_RT_BUILD_PROFILE=OFF. This doesn't seem reasonable. Requiring 2 ARM toolchains to be installed, forcing the support of both hard-float and soft-float ABIs, also doesn't. IMHO, if compiler-rt needs libc for a given target, it must check it.

JonChesterfield commented 5 months ago

The thing is compiler-rt builtins doesn't need a libc. In particular, failing to build builtins because there is no printf when builtins doesn't call printf is unfortunate. Fuzzer/orc etc probably need a lot of things.

Builtins needs a compiler, but not a linker, so even int main() {return 0;} is asking for more than it requires.

How about only running the "does it have printf" test when trying to build any of the subprojects in compiler-rt other than the builtins? I think that would work for the arm toolchain buildbot case and it'll solve my bootstrapping problem.

luporl commented 5 months ago

How about only running the "does it have printf" test when trying to build any of the subprojects in compiler-rt other than the builtins? I think that would work for the arm toolchain buildbot case and it'll solve my bootstrapping problem.

This seems like a good solution. I was able to build compiler-rt with armhf toolchain only, with #86737 applied, when only the builtins were enabled.

JonChesterfield commented 4 months ago

I wonder if detecting cmake's builtin C_COMPILER_WORKS might be the right way to go. That tends to get set to =1 when trying to do anything in cross compilation as the platform detection logic within cmake itself is otherwise quite obstructive. Though the buildbot might have that set.

Reorganising the cmake to skip the is-there-printf for builtins should work. In general it might require people to compile builtins, and then libc, and then the rest of compiler-rt - but that's probably a good idea anyway, it's likely that the sanitizers legitimately do require libc to exist.

I'll find time to put up a patch for the reordering. Unblocked locally by adding a sed invocation that clobbers the source tree to my build script but that's nasty and error prone.