Open bald-man opened 2 years ago
@llvm/issue-subscribers-backend-arm
@kbeyls Kristof, this failure was reported internally in Android about a crash in the armv7 backend. Can you please help investigate this?
Attaching preprocessed source for easier repro:
$ clang -target armv7-linux-androideabi19 -O2 m3_compile.c
fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail
I repro'd it at SHA af1328ef452b9eaa4e9f0bc115aeda8f40c4bbff
This looks related: https://github.com/llvm/llvm-project/issues/50758
Someone just pointed out that it's not reproing at wasm3 head. Apparently, we have an internal optimization that hasn't been upstreamed yet.
This is what my version of https://github.com/wasm3/wasm3/blob/main/source/m3_exec_defs.h looks like:
#if defined(__has_attribute) && __has_attribute(musttail) && !defined(__arm__)
/* Clang requires a special tail recursion attribute to use tail recursion. */
#define MUST_TAIL __attribute__((musttail))
#else
#define MUST_TAIL
#endif
#define d_m3RetSig static inline m3ret_t vectorcall
#define d_m3Op(NAME) M3_NO_UBSAN d_m3RetSig op_##NAME (d_m3OpSig)
#define nextOpImpl() ((IM3Operation)(* _pc))(_pc + 1, d_m3OpArgs)
#define jumpOpImpl(PC) ((IM3Operation)(* PC))( PC + 1, d_m3OpArgs)
#define nextOpDirect() MUST_TAIL return nextOpImpl()
#define jumpOpDirect(PC) MUST_TAIL return jumpOpImpl((pc_t)(PC))
It seems support for must-tail calls was only added quite recently in the Arm backend, see https://reviews.llvm.org/D102613?
Would you be able to reduce the reproducer, e.g. using creduce?
creduce followed by some manual reduction gives this:
$ cat repro.c
void bar(int h, int i, long j, double o) {
}
void p(int h, int i, long j, double o)
{
__attribute__((musttail)) return bar(h, i, j, o);
}
$ clang -target armv7-linux-androideabi19 repro.c
fatal error: error in backend: failed to perform tail call elimination on a call site marked musttail
This is very similar to https://github.com/llvm/llvm-project/issues/54025 / https://reviews.llvm.org/D120622
The ARM backend has this logic to find arguments passed in memory and block tail call optimization in such cases: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/ARM/ARMISelLowering.cpp#L3041
I think that logic needs to be disabled, and then the actual call lowering code needs to be audited to ensure things work. I believe the logic in question is trying to line up the stack slot offsets, which is not necessary for musttail, since the prototype check should ensure that.
Thanks for the reduced test case. When running this test case, it's easy to reproduce on top-of-trunk. When changing the test case to target armv7a-linux-gnueabihf, the case does not fail.
After stepping through https://github.com/llvm/llvm-project/blob/20451cb06b01167ff4614862736ca65bbfe46dfc/llvm/lib/Target/ARM/ARMISelLowering.cpp#L2955 in a debugger, it shows the difference is that on armv7a-linux-gnueabihf all arguments seem to be passed in register, while for target armv7-linux-androideabi19 it isn't. https://github.com/llvm/llvm-project/blob/20451cb06b01167ff4614862736ca65bbfe46dfc/llvm/lib/Target/ARM/ARMISelLowering.cpp#L3047 is the line which shows different behavior between armv7a-linux-gnueabihf and armv7-linux-androideabi19.
Indeed, https://developer.android.com/ndk/guides/abis#v7a documents that the Android armv7a abi uses -mfloat-abi=softfp, documenting "The compiler must pass all float values in integer registers and all double values in integer register pairs when making function calls.".
So, it seems (after stepping through the debugger a bit more) that with -mfloat-abi=softfp, for the 4th argument, it needs to be passed in memory as no more integer registers are available for passing the double? And in the -mfloat-abi=hardfp case (the default for -target armv7a-linux-gnueabihf), it seems all arguments are passed in register.
Long story short, it seems that a function with signature "void f(int, int, long, double)" cannot be guaranteed to be tail called under the Android arm7a abi, even if it can do so under the linux armv7a gnueabihf ABI.
It's unclear to me if this function could still be tail called, and the checking logic for musttail call is a bit too conservative, or indeed under the Android armv7a ABI, it's simply impossible to tail call such functions.
It seems to me that the solution might be to adapt wasm3 source code so it doesn't require a musttail call for functions that cannot be tail called under the Android armv7a ABI....
https://github.com/android/ndk/issues/1973 reports another instance of this issue.
In this instance, the function needs arguments passed via the stack due to the number of arguments:
static void resolveToBufferSlow(JSString*, JSString*, JSString*, CharacterType* buffer, unsigned length, uint8_t* stackLimit);
See #67767 too
Is it possible for clang to query the backend and error out rather than letting the backend crash?
It's still not clear to me why we can't just fix the backend. If the caller and callee prototypes and ABIs match, we should be able to do it, Kristof's comments notwithstanding. Moving the error earlier to the frontend doesn't seem like that much of an improvement.
What would it take to get ARM to prioritize fixing musttail for armv7? This is an ongoing portability problem, you can see multiple issues in the tracker referring to arm32 musttail issues. Using sret or more than 4 arguments prevents tail call elimination, but the musttail attribute makes it a correctness requirement.
https://github.com/wasm3/wasm3/blob/main/source/m3_env.c compilation fails with
To repro: