Quuxplusone / LLVMBugzillaTest

Missing tail calls for large structs #34038

Open Quuxplusone opened 6 years ago

Quuxplusone commented 6 years ago
Bugzilla Link: PR35065
Status: NEW
Importance: P enhancement
Reported by: Jeff Muizelaar (jmuizelaar@mozilla.com)
Reported on: 2017-10-24 12:54:35 -0700
Last modified on: 2017-10-24 14:20:51 -0700
Version: trunk
Hardware: PC All
CC: hfinkel@anl.gov, llvm-bugs@lists.llvm.org, llvm@sunfishcode.online, rnk@google.com
Fixed by commit(s):
Attachments:
Blocks:
Blocked by:
See also:

The following code:

struct Foo {
   int o[16];
};

__attribute__((noinline))
Foo moo()
{
        return {0};
}

Foo goo()
{
        return moo();
}

compiles to:

moo(): # @moo()
  xorps xmm0, xmm0
  movups xmmword ptr [rdi + 48], xmm0
  movups xmmword ptr [rdi + 32], xmm0
  movups xmmword ptr [rdi + 16], xmm0
  movups xmmword ptr [rdi], xmm0
  mov rax, rdi
  ret
goo(): # @goo()
  push rbx
  mov rbx, rdi
  call moo()
  mov rax, rbx
  pop rbx
  ret

goo could just be:

goo():
  jmp moo

Quuxplusone commented 6 years ago

This is an artifact of the need to return RDI in RAX for struct-returning functions. LLVM was never taught to take advantage of this ABI quirk, partly because I only implemented it for 32-bit in r237175 from 2014.

We might want to change the code we generate in clang so that it returns the pointer and puts the 'returned' attribute on the sret parameter.

ARM probably has the same problem, since the same return-the-pointer convention applies to constructors there (they return 'this'). I wonder if they get this tail call or not.

Quuxplusone commented 6 years ago

If both the caller and callee are sret, and the caller passes its own sret argument straight through as the callee's sret parameter, as in the testcase here, it seems like the tail call should be safe, because the callee will set %RAX to that same pointer.
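
To make that argument concrete, here is a hand-written sketch of roughly what the sret lowering looks like when spelled out in source. The names moo_lowered and goo_lowered are hypothetical, and the explicit Foo* parameter only stands in for the hidden sret pointer; this is an illustration under those assumptions, not what clang actually emits.

```cpp
struct Foo {
    int o[16];
};

// Hypothetical hand-lowered moo(): the hidden sret pointer is written as an
// explicit parameter and handed back as the return value, mirroring the
// "return RDI in RAX" convention discussed above.
__attribute__((noinline))
Foo *moo_lowered(Foo *out)
{
    *out = Foo{};   // zero-fill the caller-provided result slot
    return out;
}

// Hypothetical hand-lowered goo(): it forwards its own result slot and
// returns whatever the callee returns, so the call is in tail position and
// nothing needs to be kept live across it.
Foo *goo_lowered(Foo *out)
{
    return moo_lowered(out);
}
```

Written this way, goo_lowered has no reason to save the pointer in rbx: the value it must return is exactly the value the callee returns. Whether a given compiler turns the call into a plain jmp still depends on the optimization level and target.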