Quuxplusone / LLVMBugzillaTest

[missed optimisation] Reading a callee-saved register writes to the stack #22734

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago
Bugzilla Link: PR22735
Status: NEW
Importance: P enhancement
Reported by: Adam Warner (adam.warner.nz@gmail.com)
Reported on: 2015-02-27 17:04:17 -0800
Last modified on: 2018-10-25 20:11:54 -0700
Version: trunk
Hardware: PC Linux
CC: llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk
Fixed by commit(s):
Attachments:
Blocks:
Blocked by:
See also:

There is no way to inform the compiler one is reading a callee-saved register
without the compiler also writing the register to the stack:

read_callee_saved_register.c:
#include <stdint.h>

uint64_t read_rbx(void) {
  register uint64_t rbx asm ("rbx");
  uint64_t value;
  asm ("mov %[rbx], %[value]" : [value] "=r" (value) : [rbx] "r" (rbx));
  return value;
}

int main(void) {
  return 0;
}

The code above explicitly tells the compiler that the local register variable
rbx is only being read. Here is the output of gcc and clang respectively:

$ gcc-snapshot.sh -O3 read_callee_saved_register.c && objdump -d -m i386:x86-64 a.out|less
00000000004004c0 <read_rbx>:
  4004c0:       48 89 d8                mov    %rbx,%rax
  4004c3:       53                      push   %rbx
  4004c4:       5b                      pop    %rbx
  4004c5:       c3                      retq
  4004c6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4004cd:       00 00 00

$ clang-3.7 -O3 read_callee_saved_register.c && objdump -d -m i386:x86-64 a.out|less
00000000004004e0 <read_rbx>:
  4004e0:       53                      push   %rbx
  4004e1:       48 89 d8                mov    %rbx,%rax
  4004e4:       5b                      pop    %rbx
  4004e5:       c3                      retq
  4004e6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4004ed:       00 00 00

Neither gcc nor clang should be pushing and popping rbx.

Note: gcc already trusts that rbx is merely being read, since it places the
mov before the push/pop pair rather than between the push and the pop.
Quuxplusone commented 9 years ago
GCC bug report:
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65247>
Quuxplusone commented 8 years ago
The unnecessary pushing and popping has generally been fixed in clang 3.7 and
above (thanks!). However, the simple function read_rbx() below may expose a
long-standing bug in clang:

#include <stdint.h>

uint64_t read_rbx(void) {
  register uint64_t rbx asm ("rbx");
  return rbx;
}

uint64_t read_rbx2(void) {
  register uint64_t rbx asm ("rbx");
  uint64_t val;
  asm ("mov %1, %0" : "=r" (val) : "r" (rbx));
  return val;
}

int main(void) {
  return 0;
}

$ clang-3.7 -O3 read_rbx.c && objdump -d -m i386:x86-64:intel a.out|less

00000000004004c0 <read_rbx>:
  4004c0:       c3                      ret
  4004c1:       66 66 66 66 66 66 2e    data16 data16 data16 data16 data16 nop WORD PTR cs:[rax+rax*1+0x0]
  4004c8:       0f 1f 84 00 00 00 00
  4004cf:       00

00000000004004d0 <read_rbx2>:
  4004d0:       48 89 d8                mov    rax,rbx
  4004d3:       c3                      ret
  4004d4:       66 66 66 2e 0f 1f 84    data16 data16 nop WORD PTR cs:[rax+rax*1+0x0]
  4004db:       00 00 00 00 00

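According to GCC developer Richard Biener
(<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65247#c2>), read_rbx() is
intended to return the value of RBX in RAX. In the clang output above it
compiles to a bare ret, so RAX is never loaded from RBX; only read_rbx2()
emits the expected mov.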
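
One possible stopgap, sketched here as an assumption rather than anything
proposed in this report or in the GCC thread, is to name %rbx directly in the
asm template instead of going through a local register variable. The compiler
then sees only an output operand, so it should have no reason to save or
restore rbx, and the result does not depend on how a bare read of a register
variable is lowered. The helper name read_rbx_direct is hypothetical, and the
sketch assumes x86-64 with GCC/clang extended asm in AT&T syntax:

```c
#include <stdint.h>

/* Hypothetical helper, not part of the original report.
   Assumes x86-64 and GCC/clang extended asm in AT&T syntax. */
uint64_t read_rbx_direct(void) {
  uint64_t value;
  /* %rbx is hard-coded in the template, so it is neither an input operand
     nor a clobber. volatile keeps the compiler from dropping or combining
     repeated reads. */
  __asm__ volatile ("mov %%rbx, %0" : "=r" (value));
  return value;
}
```

The trade-off is that the compiler no longer knows rbx is being read, so the
value returned is whatever rbx happens to hold at that point, which may be a
value the compiler itself placed there.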