llvmbot opened 8 years ago
Confirmed: missed optimization.
Today's LLVM for AArch64 produces 7 loads:
        str     x19, [sp, #-32]!        // 8-byte Folded Spill
        stp     x29, x30, [sp, #16]     // 8-byte Folded Spill
        ldr     x8, [x0, #8]
        stur    xzr, [x8, #-8]
        ldr     x8, [x0, #8]
        sub     x19, x8, #8             // =8
        str     x19, [x0, #8]
        ldur    x0, [x8, #-8]
        add     x29, sp, #16            // =16
        cbz     x0, .LBB0_2
// %bb.1:
        ldr     x8, [x0]
        ldr     x8, [x8, #8]
        blr     x8
.LBB0_2:
        str     xzr, [x19]
        ldp     x29, x30, [sp, #16]     // 8-byte Folded Reload
        ldr     x19, [sp], #32          // 8-byte Folded Reload
        ret
versus only one load when compiling with GCC trunk:
        ldr     x1, [x0, 8]
        sub     x1, x1, #8
        str     x1, [x0, 8]
        ret
Extended Description
The following code:
#include <vector>
#include <memory>
struct base { virtual ~base() {} };
void f(std::vector<std::unique_ptr<base>>& v)
{
v.back().release();
v.pop_back();
}
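
After inlining, the two calls reduce to roughly the pattern sketched below. The vec_repr struct and its member names are hypothetical, standing in for a typical three-pointer std::vector layout, and a std::unique_ptr<base> element is modeled as a plain base*; this is only an illustration of where the loads in the AArch64 listing come from, not any particular standard library's implementation.

// Hypothetical stand-in for a typical three-pointer std::vector layout;
// each element (a std::unique_ptr<base>) is modeled as a plain base*.
struct vec_repr {
    base** begin_;
    base** end_;
    base** cap_;
};

// Roughly what f() becomes once release() and pop_back() are inlined.
void f_reduced(vec_repr* v) {
    base** slot = v->end_ - 1;     // back(): first load of end_
    *slot = nullptr;               // release(): store nullptr into the element
    base** last = v->end_ - 1;     // pop_back(): end_ is loaded again after the store
    v->end_ = last;                // shrink the vector
    base* old = *last;             // read the element back: it is the nullptr just stored
    if (old)                       // so this test and the virtual-destructor call are dead
        delete old;
}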
GCC 6.2 is able to optimize this down to just two instructions:
The code generated by clang 3.9.0 is less efficient:
.LBB0_2:
        mov     qword ptr [rbx], 0
        pop     rbx
        ret
Since [rax - 8] is reloaded after the subtraction, I tend to believe this is an alias-analysis issue.
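
As a sketch of the expected outcome: if the optimizer could prove that the store through the element slot cannot clobber the vector's end pointer (the slot holds a base*, while end_ is a member of the vector object itself), the function should collapse to roughly the form below. This reuses the hypothetical vec_repr layout from the sketch above. GCC's AArch64 output additionally drops the null store itself, presumably because it is dead once the element's lifetime ends in pop_back().

// Expected form once the reload of end_ and the dead delete are removed
// (hypothetical vec_repr layout from the sketch above).
void f_expected(vec_repr* v) {
    base** e = v->end_;   // single load of end_
    e[-1] = nullptr;      // release(): null out the element slot
    v->end_ = e - 1;      // pop_back(): just shrink; no reload, no destructor call
}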