llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.14k stars 12.02k forks source link

Musttail is causing crash when compiling wasm3 for android armv7-a #57069

Open bald-man opened 2 years ago

bald-man commented 2 years ago

https://github.com/wasm3/wasm3/blob/main/source/m3_env.c compilation fails with

fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)

To repro:

clang --target=arm-linux-androideabi19 -march=armv7-a  -isystem clang/include -iquote=wasm3/source -c wasm3/source/m3_env.c -o /tmp/m3_env.o
llvmbot commented 2 years ago

@llvm/issue-subscribers-backend-arm

pirama-arumuga-nainar commented 2 years ago

@kbeyls Kristof, this failure was reported internally in Android about a crash in the armv7 backend. Can you please help investigate this?

Attaching preprocessed source for easier repro:

$ clang -target armv7-linux-androideabi19 -O2 m3_compile.c 
fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail

m3_compile.zip

bald-man commented 2 years ago

I repro'd it at SHA af1328ef452b9eaa4e9f0bc115aeda8f40c4bbff

bald-man commented 2 years ago

This looks related: https://github.com/llvm/llvm-project/issues/50758

bald-man commented 2 years ago

Someone just pointed out that it's not reproing at wasm3 head. Apparently, we have an internal optimization that hasn't been upstreamed yet.

This is what my version of https://github.com/wasm3/wasm3/blob/main/source/m3_exec_defs.h looks like:

#if defined(__has_attribute) && __has_attribute(musttail) && !defined(__arm__)
/* Clang requires a special tail recursion attribute to use tail recursion. */
#define MUST_TAIL __attribute__((musttail))
#else
#define MUST_TAIL
#endif

#define d_m3RetSig                  static inline m3ret_t vectorcall
#define d_m3Op(NAME)                M3_NO_UBSAN d_m3RetSig op_##NAME (d_m3OpSig)

#define nextOpImpl()                ((IM3Operation)(* _pc))(_pc + 1, d_m3OpArgs)
#define jumpOpImpl(PC)              ((IM3Operation)(*  PC))( PC + 1, d_m3OpArgs)

#define nextOpDirect()              MUST_TAIL return nextOpImpl()
#define jumpOpDirect(PC)            MUST_TAIL return jumpOpImpl((pc_t)(PC))
kbeyls commented 2 years ago

It seems support for must-tail calls was only added quite recently in the Arm backend, see https://reviews.llvm.org/D102613?

Would you be able to reduce the reproducer, e.g. using creduce?

pirama-arumuga-nainar commented 2 years ago

creduce followed by some manual reduction gives this:

$ cat repro.c
void bar(int h, int i, long j, double o) {
}

void p(int h, int i, long j, double o)
{
  __attribute__((musttail)) return bar(h, i, j, o);
}
$ clang -target armv7-linux-androideabi19 repro.c
fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail
rnk commented 2 years ago

This is very similar to https://github.com/llvm/llvm-project/issues/54025 / https://reviews.llvm.org/D120622

The ARM backend has this logic to find arguments passed in memory and block tail call optimization in such cases: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/ARM/ARMISelLowering.cpp#L3041

I think that logic needs to be disabled, and then the actual call lowering code needs to be audited to ensure things work. I believe the logic in question is trying to line up the stack slot offsets, which is not necessary for musttail, since the prototype check should ensure that.

kbeyls commented 2 years ago

Thanks for the reduced test case. When running this test case, it's easy to reproduce on top-of-trunk. When changing the test case to target armv7a-linux-gnueabihf, the case does not fail.

After stepping through https://github.com/llvm/llvm-project/blob/20451cb06b01167ff4614862736ca65bbfe46dfc/llvm/lib/Target/ARM/ARMISelLowering.cpp#L2955 in a debugger, it shows the difference is that on armv7a-linux-gnueabihf all arguments seem to be passed in register, while for target armv7-linux-androideabi19 it isn't. https://github.com/llvm/llvm-project/blob/20451cb06b01167ff4614862736ca65bbfe46dfc/llvm/lib/Target/ARM/ARMISelLowering.cpp#L3047 is the line which shows different behavior between armv7a-linux-gnueabihf and armv7-linux-androideabi19.

Indeed, https://developer.android.com/ndk/guides/abis#v7a documents that the Android armv7a abi uses -mfloat-abi=softfp, documenting "The compiler must pass all float values in integer registers and all double values in integer register pairs when making function calls.".

So, it seems (after stepping through the debugger a bit more) that with -mfloat-abi=softfp, for the 4th argument, it needs to be passed in memory as no more integer registers are available for passing the double? And in the -mfloat-abi=hardfp case (the default for -target armv7a-linux-gnueabihf), it seems all arguments are passed in register.

Long story short, it seems that a function with signature "void f(int, int, long, double)" cannot be guaranteed to be tail called under the Android arm7a abi, even if it can do so under the linux armv7a gnueabihf ABI.

It's unclear to me if this function could still be tail called, and the checking logic for musttail call is a bit too conservative, or indeed under the Android armv7a ABI, it's simply impossible to tail call such functions.

It seems to me that the solution might be to adapt wasm3 source code so it doesn't require a musttail call for functions that cannot be tail called under the Android armv7a ABI....

pirama-arumuga-nainar commented 12 months ago

https://github.com/android/ndk/issues/1973 reports another instance of this issue.

In this instance, the function needs arguments passed via the stack due to the number of arguments:

static void resolveToBufferSlow(JSString*, JSString*, JSString*, CharacterType* buffer, unsigned length, uint8_t* stackLimit);
davemgreen commented 12 months ago

See #67767 too

hiraditya commented 11 months ago

Is it possible for clang to query the backend and error out rather than letting the backend crash?

rnk commented 11 months ago

It's still not clear to me why we can't just fix the backend. If the caller and callee prototypes and ABIs match, we should be able to do it, Kristof's comments notwithstanding. Moving the error earlier to the frontend doesn't seem like that much of an improvement.

rnk commented 6 months ago

What would it take to get ARM to prioritize fixing musttail for armv7? This is an ongoing portability problem, you can see multiple issues in the tracker referring to arm32 musttail issues. Using sret or more than 4 arguments prevents tail call elimination, but the musttail attribute makes it a correctness requirement.