llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

ARM Cortex-A9 optimization improvement (maybe) #7829

Open llvmbot opened 14 years ago

llvmbot commented 14 years ago
| Field | Value |
|---|---|
| Bugzilla Link | 7457 |
| Version | trunk |
| OS | Windows XP |
| Attachments | minimal C++ program that demonstrates the case |
| Reporter | LLVM Bugzilla Contributor |
| CC | @asl, @efriedma-quic, @rengolin |

Extended Description

This is a minuscule improvement, but since I spent some effort looking at it, here it is:

ARM Cortex-A9 Thumb2

    lsrs    r2, r0, #8
    lsrs    r3, r1, #8
    uxtb    r2, r2
    uxtb    r3, r3

might be better coded as:

    ubfx    r2, r0, #8, #8
    ubfx    r3, r1, #8, #8

The code size is the same: lsrs and uxtb are 16-bit instructions and ubfx is a 32-bit instruction, so both sequences occupy 8 bytes. However, I think the ubfx pair will execute in one clock, whereas the first sequence will take two clocks (assuming a dual-issue CPU).
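As a rough illustration of why the two sequences are interchangeable, the sketch below models both in portable C++ and checks them at compile time. The helper names `shift_then_uxtb` and `ubfx` are hypothetical stand-ins for the instructions, not compiler intrinsics, and nothing here says anything about real Cortex-A9 timing.

```cpp
#include <cstdint>

// Models `lsrs rd, rn, #8` followed by `uxtb rd, rd`:
// logical shift right by 8, then zero-extend the low byte.
constexpr std::uint32_t shift_then_uxtb(std::uint32_t x) {
    return (x >> 8) & 0xFFu;
}

// Models `ubfx rd, rn, #lsb, #width`: extract `width` bits starting at
// bit `lsb`, zero-extended. Assumes width >= 1 and lsb + width <= 32.
constexpr std::uint32_t ubfx(std::uint32_t x, unsigned lsb, unsigned width) {
    return (x >> lsb) & (width >= 32 ? 0xFFFFFFFFu : ((1u << width) - 1u));
}

// Encoding sizes in bytes, as stated in the report.
constexpr unsigned kNarrowInsnBytes = 2;  // lsrs, uxtb (16-bit encodings)
constexpr unsigned kWideInsnBytes = 4;    // ubfx (32-bit encoding)

// Both forms extract bits [15:8] of the input.
static_assert(shift_then_uxtb(0x12345678u) == 0x56u, "shift + uxtb");
static_assert(ubfx(0x12345678u, 8, 8) == 0x56u, "ubfx");

// Four narrow instructions take the same space as two wide ones.
static_assert(4 * kNarrowInsnBytes == 2 * kWideInsnBytes, "same code size");
```

Since the results and the code size match, the only possible gain is in issue slots: two instructions instead of four.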

(But in a universe that contains 10^80 electrons, it really might not matter that much.)

efriedma-quic commented 13 years ago

Reduced IR:

    target triple = "thumbv7-apple-darwin11"

    define arm_aapcscc void @_Z7checkrxv(i32 %tmp15, i32 %tmp18, i32 %tmp42, i32 %tmp45) nounwind {
      %cmp22 = icmp eq i32 %tmp18, %tmp15
      %tmp117 = lshr i32 %tmp45, 8
      %tmp118 = trunc i32 %tmp117 to i8
      %tmp109 = lshr i32 %tmp42, 8
      %tmp110 = trunc i32 %tmp109 to i8
      %cmp51 = icmp eq i8 %tmp118, %tmp110
      %and5687 = and i1 %cmp22, %cmp51
      br i1 %and5687, label %if.then81, label %if.end85

    if.then81:
      tail call arm_aapcscc void @_Z5myLogj(i32 252) nounwind
      br label %if.end85

    if.end85:
      ret void
    }

    declare arm_aapcscc void @_Z5myLogj(i32)

Not sure why the lsrs+uxtb isn't getting matched into ubfx.
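For readers who prefer source to IR, here is one plausible C++ shape that mirrors the reduced IR. This is not the attached program: the function name `check`, the parameter names, and the stub body of `myLog` are assumptions made for illustration, and whether a given compiler version emits `lsrs` + `uxtb` or `ubfx` for it is not verified here.

```cpp
#include <cstdint>

// Hypothetical stand-in for the external `_Z5myLogj` (myLog(unsigned int)).
// It only records the last code so the behavior can be observed.
static unsigned g_last_logged = 0;
static void myLog(unsigned code) { g_last_logged = code; }

// Mirrors the reduced IR: a full 32-bit compare of a and b, plus a
// byte compare of bits [15:8] of c and d (the lshr + trunc-to-i8 pattern).
static void check(std::uint32_t a, std::uint32_t b,
                  std::uint32_t c, std::uint32_t d) {
    const bool words_equal = (b == a);
    const bool bytes_equal =
        static_cast<std::uint8_t>(d >> 8) == static_cast<std::uint8_t>(c >> 8);
    if (words_equal && bytes_equal) {
        myLog(252);
    }
}
```

Each `static_cast<std::uint8_t>(x >> 8)` is the operation the backend could select as a single `ubfx x, #8, #8`.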