Quuxplusone / LLVMBugzillaTest

When using -mrtd and -ffreestanding, generated memcpy calls get the wrong calling convention #10853

Open Quuxplusone opened 12 years ago

Quuxplusone commented 12 years ago
Bugzilla Link: PR10597
Status: NEW
Importance: P normal
Reported by: Dimitry Andric (dimitry@andric.com)
Reported on: 2011-08-06 12:18:52 -0700
Last modified on: 2011-08-15 14:45:21 -0700
Version: trunk
Hardware: PC All
CC: anton@korobeynikov.info, baldrick@free.fr, geek4civic@gmail.com, llvm-bugs@lists.llvm.org, pawel.worach@gmail.com, rdivacky@freebsd.org
Fixed by commit(s):
Attachments: llvm-clang-10597.patch (2368 bytes, text/plain)
             llvm-clang-10597-take2.patch (3545 bytes, text/plain)
Blocks:
Blocked by:
See also:

When using -mrtd in combination with -ffreestanding, memcpy calls (and
possibly some other builtin function calls) get generated with the wrong
calling convention.

For example, consider the following program (which is meant to get its
memcpy implementation from a separate compilation unit):

//////////////////////////////////////////////////////////////////////
struct foo {
    int i[100];
};

void bar(void);
void memcpy(void *dst, const void *src, unsigned len);

void baz(struct foo *x, struct foo *y)
{
    bar();
    memcpy(x, y, sizeof(struct foo));
    bar();
    *x = *y; // causes memcpy() to be called
    bar();
}
//////////////////////////////////////////////////////////////////////

When you compile this with "-ffreestanding -mrtd", the resulting
assembly for the baz() function becomes:

//////////////////////////////////////////////////////////////////////
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        subl    $16, %esp
        calll   bar
        movl    12(%ebp), %esi
        movl    %esi, 4(%esp)
        movl    8(%ebp), %edi
        movl    %edi, (%esp)
        movl    $400, 8(%esp)           # imm = 0x190
        calll   memcpy                  # (1)
        subl    $12, %esp               # (2)
        calll   bar
        movl    %esi, 4(%esp)
        movl    %edi, (%esp)
        movl    $400, 8(%esp)           # imm = 0x190
        calll   memcpy                  # (3)
        calll   bar
        addl    $16, %esp
        popl    %esi
        popl    %edi
        popl    %ebp
        ret     $8
//////////////////////////////////////////////////////////////////////

The first call to memcpy (1) is generated correctly, subtracting 12
bytes from the stack (2) to compensate for the 12 bytes that will have
been popped by the body of the memcpy function.

However, the second call to memcpy (3) is generated by clang itself,
and is assumed to use the cdecl calling convention, even though -mrtd
is on the command line.  I.e. "subl $12, %esp" is not inserted here, so
the stack pointer is left 12 bytes off after the call, and the program
will crash more or less spectacularly.
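
The stack accounting described above can be modelled with a few lines of
C. This is only a toy model under stated assumptions, not real code
generation: the function and enum names are made up for illustration,
and the arguments are taken to be three 4-byte values as on 32-bit x86.

```c
#include <assert.h>

/* Three 4-byte arguments (dst, src, len) on 32-bit x86. */
enum { ARG_BYTES = 3 * 4 };

/*
 * Toy model, not real codegen: returns how far %esp has drifted from
 * where the caller expects it once the call sequence has finished.
 *   callee_pops_args - callee returns with "ret $12" (stdcall / -mrtd)
 *   caller_readjusts - caller emits "subl $12, %esp" after the call
 */
int esp_drift_after_call(int callee_pops_args, int caller_readjusts)
{
    int esp = 0;
    if (callee_pops_args)
        esp += ARG_BYTES;   /* "ret $12" releases the argument area */
    if (caller_readjusts)
        esp -= ARG_BYTES;   /* "subl $12, %esp" reserves it again */
    return esp;
}

/* The two memcpy call sites from the listing, checked against the model. */
void check_call_sites(void)
{
    assert(esp_drift_after_call(1, 1) == 0);   /* call (1) plus fixup (2) */
    assert(esp_drift_after_call(1, 0) == 12);  /* call (3), no fixup */
}
```

In this model, call (3) leaves %esp 12 bytes above the outgoing-argument
area, which is why the following "calll bar" and the epilogue operate on
the wrong stack slots.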

When you look at the corresponding .ll file, you will see:

//////////////////////////////////////////////////////////////////////
  tail call x86_stdcallcc void @bar() nounwind optsize
  %0 = bitcast %struct.foo* %x to i8*
  %1 = bitcast %struct.foo* %y to i8*
  tail call x86_stdcallcc void @memcpy(i8* %0, i8* %1, i32 400) nounwind optsize
  tail call x86_stdcallcc void @bar() nounwind optsize
  tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 400, i32 4, i1 false)
  tail call x86_stdcallcc void @bar() nounwind optsize
//////////////////////////////////////////////////////////////////////

So @llvm.memcpy.p0i8.p0i8.i32 is *not* x86_stdcallcc, and it results in
a cdecl memcpy call in assembly.

(Obviously a workaround would be to explicitly declare memcpy() as cdecl,
but that was impossible until bug 10591 was fixed.)
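
For illustration, the workaround could look roughly like the sketch
below. It makes a few assumptions that are not in the report: it uses
the standard memcpy prototype rather than the void-returning one from
the test case, it uses the GCC/Clang __attribute__((cdecl)) spelling,
and MEMCPY_CC and check_baz_copies are hypothetical names introduced
here. It is not what the attached patches do.

```c
#include <assert.h>
#include <stddef.h>

/* MEMCPY_CC is a hypothetical helper macro, not part of any toolchain.
   The cdecl attribute only matters on 32-bit x86, so it is left empty
   elsewhere to avoid "attribute ignored" warnings. */
#if defined(__i386__)
#define MEMCPY_CC __attribute__((cdecl))
#else
#define MEMCPY_CC
#endif

struct foo {
    int i[100];
};

/* Assumption: the standard prototype is used here instead of the
   void-returning one in the test case. The unit that defines memcpy
   must see the same declaration so that it is built as cdecl too. */
void *memcpy(void *dst, const void *src, size_t len) MEMCPY_CC;

void baz(struct foo *x, struct foo *y)
{
    memcpy(x, y, sizeof(struct foo)); /* explicit call, declared cdecl */
    *x = *y;                          /* may become a compiler-generated memcpy call */
}

/* Hosted-only sanity check that baz() still copies y into x. */
void check_baz_copies(void)
{
    static struct foo a, b;
    for (int k = 0; k < 100; k++)
        b.i[k] = k + 1;
    baz(&a, &b);
    for (int k = 0; k < 100; k++)
        assert(a.i[k] == k + 1);
}
```

With memcpy itself being cdecl, the explicit call and the call clang
generates for the struct assignment would agree on who pops the
arguments, even when everything else is built with -mrtd.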
Quuxplusone commented 12 years ago
With the attached patch I am able to produce IR that looks correct to me:

pes ~/whirl$ clang -ffreestanding -mrtd -emit-llvm -O2 -S -o - dim.c
; ModuleID = 'dim.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-freebsd8.2"

%struct.foo = type { [100 x i32] }

define x86_stdcallcc void @baz(%struct.foo* %x, %struct.foo* %y) nounwind uwtable {
entry:
  tail call x86_stdcallcc void @bar() nounwind
  %0 = bitcast %struct.foo* %x to i8*
  %1 = bitcast %struct.foo* %y to i8*
  tail call x86_stdcallcc void @memcpy(i8* %0, i8* %1, i32 400) nounwind
  tail call x86_stdcallcc void @bar() nounwind
  tail call x86_stdcallcc void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 400, i32 4, i1 false)
  tail call x86_stdcallcc void @bar() nounwind
  ret void
}

declare x86_stdcallcc void @bar()

declare x86_stdcallcc void @memcpy(i8*, i8*, i32)

declare x86_stdcallcc void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind

but the generated assembly is wrong. What is wrong with the IR?
Quuxplusone commented 12 years ago

Attached llvm-clang-10597.patch (2368 bytes, text/plain): patch

Quuxplusone commented 12 years ago

Attached llvm-clang-10597-take2.patch (3545 bytes, text/plain): patch