llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
26.8k stars 10.98k forks source link

UBSAN failures in clang constant interpreter tests on 32 bit ARM #94741

Open DavidSpickett opened 1 month ago

DavidSpickett commented 1 month ago

Doing a 32 bit Arm build with UBSAN enabled, I get many failures from the clang interpreter:

  Clang :: AST/Interp/arrays.cpp
  Clang :: AST/Interp/atomic.c
  Clang :: AST/Interp/atomic.cpp
  Clang :: AST/Interp/bitfields.cpp
  Clang :: AST/Interp/builtin-align-cxx.cpp
  Clang :: AST/Interp/builtin-functions.cpp
  Clang :: AST/Interp/builtins.cpp
  Clang :: AST/Interp/c.c
  Clang :: AST/Interp/c23.c
  Clang :: AST/Interp/comma.cpp
  Clang :: AST/Interp/complex.c
  Clang :: AST/Interp/complex.cpp
  Clang :: AST/Interp/cond.cpp
  Clang :: AST/Interp/const-eval.c
  Clang :: AST/Interp/const-fpfeatures.cpp
  Clang :: AST/Interp/constexpr-nqueens.cpp
  Clang :: AST/Interp/constexpr-subobj-initialization.cpp
  Clang :: AST/Interp/cxx03.cpp
  Clang :: AST/Interp/cxx11.cpp
  Clang :: AST/Interp/cxx17.cpp
  Clang :: AST/Interp/cxx20.cpp
  Clang :: AST/Interp/cxx23.cpp
  Clang :: AST/Interp/cxx98.cpp
  Clang :: AST/Interp/depth-limit.cpp
  Clang :: AST/Interp/enums-targets.cpp
  Clang :: AST/Interp/enums.cpp
  Clang :: AST/Interp/eval-order.cpp
  Clang :: AST/Interp/floats.cpp
  Clang :: AST/Interp/functions.cpp
  Clang :: AST/Interp/if.cpp
  Clang :: AST/Interp/intap.cpp
  Clang :: AST/Interp/invalid.cpp
  Clang :: AST/Interp/lambda.cpp
  Clang :: AST/Interp/lifetimes.cpp
  Clang :: AST/Interp/literals.cpp
  Clang :: AST/Interp/loops.cpp
  Clang :: AST/Interp/memberpointers.cpp
  Clang :: AST/Interp/nullable.cpp
  Clang :: AST/Interp/objc.mm
  Clang :: AST/Interp/opencl.cl
  Clang :: AST/Interp/pointer-addition.c
  Clang :: AST/Interp/records.cpp
  Clang :: AST/Interp/references.cpp
  Clang :: AST/Interp/shifts.cpp
  Clang :: AST/Interp/spaceship.cpp
  Clang :: AST/Interp/switch.cpp
  Clang :: AST/Interp/sycl.cpp
  Clang :: AST/Interp/unions.cpp
  Clang :: AST/Interp/vectors.cpp
  Clang :: AST/Interp/weak.cpp

Most of them are problems with reference binding or calling of constructors on misaligned addresses. Usually the type requires 8 byte alignment but the address is 4 byte aligned.

RUN: at line 1: /home/david.spickett/build-llvm-arm/bin/clang -cc1 -internal-isystem /home/david.spickett/build-llvm-arm/lib/clang/19/include -nostdsysteminc -fexperimental-new-constant-interpreter -verify=expected,both /home/david.spickett/llvm-project/clang/test/AST/Interp/arrays.cpp
+ /home/david.spickett/build-llvm-arm/bin/clang -cc1 -internal-isystem /home/david.spickett/build-llvm-arm/lib/clang/19/include -nostdsysteminc -fexperimental-new-constant-interpreter -verify=expected,both /home/david.spickett/llvm-project/clang/test/AST/Interp/arrays.cpp
/home/david.spickett/llvm-project/clang/lib/AST/Interp/InterpStack.h:36:35: runtime error: constructor call on misaligned address 0xe7cff014 for type 'clang::interp::Pointer', which requires 8 byte alignment
0xe7cff014: note: pointer points here
  3c f0 cf e7 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/david.spickett/llvm-project/clang/lib/AST/Interp/InterpStack.h:36:35 in

~This has come up a lot recently, including https://github.com/llvm/llvm-project/pull/89811 which has "uncovered" this issue outside of UBSAN builds.~

I think the assumptions in the interpreter's management of its virtual stack do not hold for 32 bit Arm. I don't know that that's the cause of all the issues we have seen without UBSAN, but it seems likely.

DavidSpickett commented 1 month ago

To reproduce by cross compiling on Linux:

Download toolchain from: https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads (https://developer.arm.com/-/media/Files/downloads/gnu/13.2.rel1/binrel/arm-gnu-toolchain-13.2.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz?rev=adb0c0238c934aeeaa12c09609c5e6fc&hash=68DA67DE12CBAD82A0FA4B75247E866155C93053).

Extract and add it to your PATH:

$ export PATH=$(pwd)/arm-gnu-toolchain-13.2.Rel1-x86_64-arm-none-linux-gnueabihf/bin/:$PATH

Set up a symlink inside the toolchain so it can find ld.lld. I've found that ld runs out of memory most of the time. Note that this symink is in the <triple>/bin folder, not bin/.

arm-gnu-toolchain-13.2.Rel1-x86_64-arm-none-linux-gnueabihf/arm-none-linux-gnueabihf/bin$ ln -s $(which ld.lld-11) ./ld.lld

Then configure llvm using this script:

$ cat ../configure-arm.sh 
set -ex

cd /work/open_source/build-llvm-arm

HOST_TRIPLE=arm-none-linux-gnueabihf
C_COMPILER=$HOST_TRIPLE-gcc
CXX_COMPILER=$HOST_TRIPLE-g++

cmake -G Ninja \
  -DCMAKE_SYSTEM_NAME=Linux \
  -DCMAKE_CROSSCOMPILING=1 \
  -DCMAKE_C_COMPILER=$C_COMPILER \
  -DCMAKE_CXX_COMPILER=$CXX_COMPILER \
  -DLLVM_HOST_TRIPLE=$HOST_TRIPLE \
  -DLLVM_TARGETS_TO_BUILD=ARM \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DLLVM_ENABLE_ASSERTIONS=On \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_CXX_FLAGS="-marm -fuse-ld=lld" \
  -DCMAKE_C_FLAGS="-marm -fuse-ld=lld" \
  /work/open_source/llvm-project/llvm/

ninja then ninja check-clang-unit. It will fail to find the tests, this is expected because they're Arm32 binaries.

Now install qemu-user-static from apt and then use it to emulate the unit tests:

$ qemu-arm-static -L /work/open_source/arm-gnu-toolchain-13.2.Rel1-x86_64-arm-none-linux-gnueabihf/arm-none-linux-gnueabihf/libc/  ./tools/clang/unittests/Interpreter/ClangReplInterpreterTests

Interestingly, these values are different from what we see on armv8 hardware, which makes sense if there is UB here.

[ RUN      ] InterpreterTest.Value
/work/open_source/llvm-project/clang/unittests/Interpreter/InterpreterTest.cpp:293: Failure
Expected equality of these values:
  V1.getInt()
    Which is: -12521408
  42

/work/open_source/llvm-project/clang/unittests/Interpreter/InterpreterTest.cpp:294: Failure
Expected equality of these values:
  V1.convertTo<int>()
    Which is: -12521408
  42

/work/open_source/llvm-project/clang/unittests/Interpreter/InterpreterTest.cpp:330: Failure
Expected equality of these values:
  V4.getInt()
    Which is: 124447888
  42

/work/open_source/llvm-project/clang/unittests/Interpreter/InterpreterTest.cpp:337: Failure
Expected equality of these values:
  V5.getInt()
    Which is: 124447888
  43

[  FAILED  ] InterpreterTest.Value (2937 ms)

To reproduce the other failed tests, add -fsanitize=undefined to the C/CXX flags. You will then need to replicate the commands that lit would run, and run those using qemu.

I will continue to look for a solution on native hardware.

DavidSpickett commented 1 month ago

Currently suspect this code: https://github.com/llvm/llvm-project/blob/9ece3eb1459309f9fbd18ce8ec8f771c238e8815/clang/lib/AST/Interp/InterpStack.h#L98

junaire commented 1 month ago

Hey I just wanna point out Interp and clang Interpreter are different things, clang/lib/AST/Interp is a new implementation for C++ constexpr evaluation, (https://clang.llvm.org/docs/ConstantInterpreter.html) and clang/lib/Interpreter is an interactive C++ interpreter that allows for incremental compilation. (https://clang.llvm.org/docs/ClangRepl.html)

DavidSpickett commented 1 month ago

Thank you, I had confused the two.

Which means the issue caused by https://github.com/llvm/llvm-project/pull/89811 is in clang-repl and the ones I'm seeing here are in the constant interpreter.

eymay commented 1 week ago

For those who have access to hardware, I could use the rest of the cross compilation guide for running on Raspberry Pi 4, Debian 12. I have used GCC 12.2 for glibc compatibility, instead of GCC 13.2

To reproduce by cross compiling on Linux:

Download toolchain from: https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads (https://developer.arm.com/-/media/Files/downloads/gnu/13.2.rel1/binrel/arm-gnu-toolchain-13.2.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz?rev=adb0c0238c934aeeaa12c09609c5e6fc&hash=68DA67DE12CBAD82A0FA4B75247E866155C93053).

GCC 12.2 available here: https://developer.arm.com/-/media/Files/downloads/gnu/12.2.rel1/binrel/arm-gnu-toolchain-12.2.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz?rev=9929cb6c0e8948f0ba1a621167fcd56d&hash=8F13B510F8B20149258CB4AA5372BF92