llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

When using -mrtd and -ffreestanding, generated memcpy calls get the wrong calling convention #10969

Open DimitryAndric opened 12 years ago

DimitryAndric commented 12 years ago
Bugzilla Link: 10597
Version: trunk
OS: All
CC: @asl, @pwo

Extended Description

When using -mrtd in combination with -ffreestanding, memcpy calls (and possibly some other builtin function calls) get generated with the wrong calling convention.

For example, consider the following program (which is meant to get its memcpy implementation from a separate compilation unit):

//////////////////////////////////////////////////////////////////////
struct foo { int i[100]; };

void bar(void);
void memcpy(void *dst, const void *src, unsigned len);

void baz(struct foo *x, struct foo *y)
{
  bar();
  memcpy(x, y, sizeof(struct foo));
  bar();
  *x = *y; // causes memcpy() to be called
  bar();
}
//////////////////////////////////////////////////////////////////////

When you compile this with "-ffreestanding -mrtd", the resulting assembly for the baz() function becomes:

//////////////////////////////////////////////////////////////////////
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        subl    $16, %esp
        calll   bar
        movl    12(%ebp), %esi
        movl    %esi, 4(%esp)
        movl    8(%ebp), %edi
        movl    %edi, (%esp)
        movl    $400, 8(%esp)           # imm = 0x190
        calll   memcpy                  # (1)
        subl    $12, %esp               # (2)
        calll   bar
        movl    %esi, 4(%esp)
        movl    %edi, (%esp)
        movl    $400, 8(%esp)           # imm = 0x190
        calll   memcpy                  # (3)
        calll   bar
        addl    $16, %esp
        popl    %esi
        popl    %edi
        popl    %ebp
        ret     $8
//////////////////////////////////////////////////////////////////////

The first call to memcpy (1) is generated correctly, subtracting 12 bytes from the stack (2) to compensate for the 12 bytes that will have been popped by the body of the memcpy function.

However, the second call to memcpy (3) has been generated by clang itself, and has been assumed to have the cdecl calling convention, even though -mrtd is on the command line. That is, "subl $12, %esp" is not inserted here, and the program will crash more or less spectacularly.
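To make the stack arithmetic concrete, here is a minimal sketch in C that models only the stack pointer adjustments around the two memcpy calls. The function names (esp_after_call, baz_stack_drift) are made up for illustration and are not part of clang or of the test case; the model assumes the memcpy body really is compiled as stdcall and returns with "ret $12".

```c
#include <assert.h>

/* Bytes of arguments passed to memcpy(dst, src, len) on 32-bit x86. */
enum { MEMCPY_ARG_BYTES = 12 };

/*
 * Model of one call site (hypothetical helper, for illustration only).
 * `esp` is the caller's stack pointer just before the call.
 * A stdcall callee returns with `ret $12`, so esp comes back 12 bytes higher.
 * `caller_expects_stdcall` says whether the caller emits the compensating
 * `subl $12, %esp` afterwards.
 */
static long esp_after_call(long esp, int caller_expects_stdcall)
{
    esp += MEMCPY_ARG_BYTES;          /* callee: ret $12 pops the arguments */
    if (caller_expects_stdcall)
        esp -= MEMCPY_ARG_BYTES;      /* caller: subl $12, %esp */
    return esp;
}

/* Net drift of the stack pointer across the two memcpy calls in baz(). */
static long baz_stack_drift(void)
{
    const long start = 0;             /* arbitrary origin; only differences matter */
    long esp = start;

    esp = esp_after_call(esp, 1);     /* call (1): explicit call, followed by (2) */
    assert(esp == start);             /* the first call leaves the stack balanced */

    esp = esp_after_call(esp, 0);     /* call (3): generated by clang, treated as cdecl */
    return esp - start;
}
```

In this model baz_stack_drift() comes out 12 bytes above where baz() expects the stack pointer to be, so the final "addl $16, %esp" and the three popl instructions would read from the wrong stack slots.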

When you look at the corresponding .ll file, you will see:

//////////////////////////////////////////////////////////////////////
  tail call x86_stdcallcc void @bar() nounwind optsize
  %0 = bitcast %struct.foo* %x to i8*
  %1 = bitcast %struct.foo* %y to i8*
  tail call x86_stdcallcc void @memcpy(i8* %0, i8* %1, i32 400) nounwind optsize
  tail call x86_stdcallcc void @bar() nounwind optsize
  tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 400, i32 4, i1 false)
  tail call x86_stdcallcc void @bar() nounwind optsize
//////////////////////////////////////////////////////////////////////

So @llvm.memcpy.p0i8.p0i8.i32 is not x86_stdcallcc, and it results in a cdecl memcpy call in assembly.

(Obviously a workaround would be to explicitly declare memcpy() as cdecl, but that was impossible until bug 10591 was fixed.)
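For illustration, the workaround might look roughly like the sketch below. It is only a sketch under some assumptions: the function is called demo_memcpy (a hypothetical name) so that it can be built in a hosted environment next to libc, whereas in the real freestanding build it would be memcpy itself; DEMO_CDECL and demo_copy_matches are likewise made-up names. The cdecl attribute is only applied on 32-bit x86, since that is the only place where -mrtd changes the convention.

```c
#include <assert.h>

/*
 * Apply the cdecl attribute only on 32-bit x86; elsewhere the macro
 * expands to nothing so the sketch still compiles.
 */
#if defined(__i386__)
#define DEMO_CDECL __attribute__((cdecl))
#else
#define DEMO_CDECL
#endif

struct foo { int i[100]; };

/*
 * Hypothetical stand-in for the freestanding memcpy. Marking it cdecl makes
 * the callee leave the arguments on the stack, which matches what a
 * cdecl-style call site expects even when -mrtd is in effect.
 */
DEMO_CDECL void demo_memcpy(void *dst, const void *src, unsigned len);

DEMO_CDECL void demo_memcpy(void *dst, const void *src, unsigned len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (len--)
        *d++ = *s++;
}

/* Hosted-only self-check: returns 1 if the copy reproduces the source. */
static int demo_copy_matches(void)
{
    struct foo a, b;
    int k;

    assert(sizeof a == 100 * sizeof(int));
    for (k = 0; k < 100; k++) {
        a.i[k] = k;
        b.i[k] = -1;
    }
    demo_memcpy(&b, &a, (unsigned)sizeof a);
    for (k = 0; k < 100; k++)
        if (b.i[k] != a.i[k])
            return 0;
    return 1;
}
```

With a declaration like this visible, explicit calls would also be emitted as cdecl, so both kinds of call site should agree with the callee.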

llvmbot commented 12 years ago

patch

llvmbot commented 12 years ago

patch

llvmbot commented 12 years ago

With the attached patch, I am able to produce what seems to me to be correct IR:

pes ~/whirl$ clang -ffreestanding -mrtd -emit-llvm -O2 -S -o - dim.c
; ModuleID = 'dim.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-freebsd8.2"

%struct.foo = type { [100 x i32] }

define x86_stdcallcc void @baz(%struct.foo* %x, %struct.foo* %y) nounwind uwtable {
entry:
  tail call x86_stdcallcc void @bar() nounwind
  %0 = bitcast %struct.foo* %x to i8*
  %1 = bitcast %struct.foo* %y to i8*
  tail call x86_stdcallcc void @memcpy(i8* %0, i8* %1, i32 400) nounwind
  tail call x86_stdcallcc void @bar() nounwind
  tail call x86_stdcallcc void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 400, i32 4, i1 false)
  tail call x86_stdcallcc void @bar() nounwind
  ret void
}

declare x86_stdcallcc void @bar()

declare x86_stdcallcc void @memcpy(i8*, i8*, i32)

declare x86_stdcallcc void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind

but the produced assembly is wrong. What is wrong with the IR?

llvmbot commented 11 months ago

@llvm/issue-subscribers-clang-codegen