Open bylaws opened 1 month ago
@llvm/issue-subscribers-backend-aarch64
Author: Billy Laws (bylaws)
Thinking about it, perhaps always emitting exit thunks would be fine? libcalls must always be available to link to anyway I have something like:
diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 310b152ef981..a377043ab8d1 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -89,6 +89,7 @@ private:
Module *M = nullptr;
Type *PtrTy;
+ Type *I32Ty;
Type *I64Ty;
Type *VoidTy;
@@ -789,6 +790,7 @@ bool AArch64Arm64ECCallLowering::runOnModule(Module &Mod) {
cfguard_module_flag = MD->getZExtValue();
PtrTy = PointerType::getUnqual(M->getContext());
+ I32Ty = Type::getInt32Ty(M->getContext());
I64Ty = Type::getInt64Ty(M->getContext());
VoidTy = Type::getVoidTy(M->getContext());
@@ -840,6 +842,31 @@ bool AArch64Arm64ECCallLowering::runOnModule(Module &Mod) {
}
SetVector<GlobalValue *> DirectCalledFns;
+
+ if (!M->getFunction("memcpy")) {
+ Function *MemCpy =
+ Function::Create(FunctionType::get(PtrTy, {PtrTy, PtrTy, I64Ty}, false), GlobalValue::WeakODRLinkage, 0, "memcpy", M);
+ DirectCalledFns.insert(MemCpy);
+ }
+
+ if (!M->getFunction("memmove")) {
+ Function *MemCpy =
+ Function::Create(FunctionType::get(PtrTy, {PtrTy, PtrTy, I64Ty}, false), GlobalValue::WeakODRLinkage, 0, "memmove", M);
+ DirectCalledFns.insert(MemCpy);
+ }
+
+ if (!M->getFunction("memset")) {
+ Function *MemSet =
+ Function::Create(FunctionType::get(PtrTy, {PtrTy, I32Ty, I64Ty}, false), GlobalValue::WeakODRLinkage, 0, "memset", M);
+ DirectCalledFns.insert(MemSet);
+ }
+
for (Function &F : Mod)
if (!F.isDeclaration() &&
F.getCallingConv() != CallingConv::ARM64EC_Thunk_Native &&
The list of possibly-relevant "libcalls" is, unfortunately, extremely long. memcpy/memmove/memset, yes, but also a bunch of compiler-rt utility methods, and a bunch of math routines. Listing them out would be a pain, and generating thunks for all of them would seriously bloat the code. To avoid that, we basically need to generate thunks after isel, which is tricky to implement. (Specifically, the interaction with the pass manager to run isel on the newly generated functions is hard.)
Long-term, probably LLVM is moving in the direction of reducing the number of libcalls generated by isel, in favor of doing stuff in pre-isel passes, but that doesn't really solve the issue at the moment.
I haven't found this to cause practical problems because normally you'd link arm64ec code against an arm64ec MSVCRT (which provides arm64-native implementations that can be directly called).
I haven't found this to cause practical problems because normally you'd link arm64ec code against an arm64ec MSVCRT (which provides arm64-native implementations that can be directly called).
Yeah I don't this ends up affecting usecases outside of wine - but within wine many programs ship with their own ucrtbase and that is explicitly used over the system one due to differences in e.g. math functions. I found memcpy/memmove/memset to be 'enough' for these cases I have ran into though so perhaps that would be reasonable in the short term.
Compiling e.g.
will result in a call to the llvm.memcpy intrinsic that is then lowered to an actual memcpy call as part of ISel after AArch64Arm64ECCallLowering has ran, resulting in no exit thunk being emitted. This is similarly the case for all other libcalls and for many of them it's not possible to determine if they will be used at the time the EC call lowering runs. Emitting exit thunks for all possible libcalls would work but is obviously not ideal
CC: @cjacek @efriedma-quic