Extended Description
Hi,
I'm working on an embedded ARM project that cross-compiles with LLVM and needs whole-program transformations over both the application and its libraries (statically linked newlib and compiler-rt), so we compile and link everything with LTO. However, when we link against an LTO build of newlib, a miscompilation occurs: a printf("\n") in C is compiled into a call to @llvm.trap() followed by an unreachable instruction in the IR.
After some debugging, we found that the problem is caused by a calling convention mismatch. Specifically:
1. In the compile stage, LibCallSimplifier replaces the original call to printf(), which uses the ARM AAPCS calling convention, with a call to putchar() that uses LLVM's default calling convention.
2. In the function importing stage of LTO, the body of putchar(), defined with the ARM AAPCS calling convention, is imported from the LTO newlib into the single module.
3. When the InstCombine pass runs during LTO, the mismatch (default convention on the CallInst vs. ARM AAPCS on putchar()'s definition) causes the CallInst to be transformed into an unreachable instruction. An explanation can be found at https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it.
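For reference, here is a minimal sketch of the kind of source that goes through the steps above. The function name emit_newline is made up for illustration, and the IR shapes in the comment are my approximation of what the passes produce, not output copied from a real build.

```c
#include <stdio.h>

/*
 * Hypothetical reproducer; emit_newline is an illustrative name only.
 *
 * The result of printf is discarded, which is the case where the
 * printf("\n") -> putchar('\n') simplification can apply.
 *
 * Approximate IR shapes (assumed, not taken from a real build):
 *   call site after simplification:  call i32 @putchar(i32 10)
 *   definition imported from newlib: define arm_aapcscc i32 @putchar(i32 ...)
 * The call site uses the default convention while the callee is
 * arm_aapcscc, so a later InstCombine run may treat the call as
 * unreachable.
 */
int emit_newline(void) {
    printf("\n");   /* return value intentionally unused */
    return 0;
}
```

Built natively without LTO, this simply prints a newline; the failure described here only shows up once putchar()'s definition is visible in the same module with a different calling convention than the call site.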
I don't know whether LLVM is meant to support compiling low-level infrastructure (such as the C library) with LTO, but my guess is that LibCallSimplifier assumes the C library is native code and therefore disregards the original call's calling convention.
Interestingly, this happens with ThinLTO as well as with regular LTO. I tried the following combinations, two of which actually work:
application:LTO + newlib:LTO -> unreachable
application:ThinLTO + newlib:ThinLTO -> unreachable
application:LTO + newlib:ThinLTO -> working
application:ThinLTO + newlib:LTO -> working
For now I'm using one of the two working configurations (the last two above) for my project, but it would be great to see all four combinations working.