llvm / llvm-project


[AArch64][PeepholeOptimizer] Look through PHIs to find additional register sources #24879

Open llvmbot opened 9 years ago

llvmbot commented 9 years ago
Bugzilla Link: 24505
Version: trunk
OS: Windows NT
Reporter: LLVM Bugzilla Contributor
CC: @bcardosolopes, @kbeyls, @qcolombet

Extended Description

Bruno recently committed a change to improve the peephole optimizer. He was targeting x86 specifically, but the change can be extended to other architectures by marking target-specific instructions of the form "one source + one destination bitcast" with "isBitcast".

The specific commit is r245442 http://llvm.org/viewvc/llvm-project?view=revision&revision=245442

[PeepholeOptimizer] Look through PHIs to find additional register sources

Reapply r243486.

With these changes we can find more register sources and rewrite more copies to allow coalescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows:

A:
  psllq %mm1, %mm0
  movd %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd %mm0, %r9
  jmp C

C:
  movd %r9, %mm0
  pshufw $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw $238, %mm0, %mm0
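The transformation above can be pictured with a small standalone model. This is only a sketch under stated assumptions, not the PeepholeOptimizer code: `RegClass`, `Kind`, `Inst`, `Func`, `findSources` and `buildExample` are hypothetical names invented for illustration, the depth bound of 8 is arbitrary, and the example assumes C++17. It walks backwards from the value merged in block C, through the PHI and the per-predecessor bitcasts, and collects the original VR64 values.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <vector>

// Toy register classes standing in for x86 VR64 (MMX) and GR64 (integer).
enum class RegClass { VR64, GR64 };

// Def: an ordinary instruction; Bitcast: a one-source, one-destination
// cross-class move (the movd above); Phi: one incoming value per predecessor.
enum class Kind { Def, Bitcast, Phi };

// Toy SSA form: every virtual register has exactly one defining instruction.
struct Inst {
  Kind kind;
  RegClass cls;               // register class of the defined value
  std::vector<int> operands;  // source virtual registers
};

using Func = std::map<int, Inst>;  // virtual register -> its definition

// Walk backwards from `reg` through bitcasts and PHIs. Returns one source
// register of class `want` per path, or nullopt if any path ends at an
// ordinary definition in a different class.
std::optional<std::vector<int>> findSources(const Func &f, int reg,
                                            RegClass want, int depth = 0) {
  assert(f.count(reg) && "every virtual register needs a definition");
  if (depth > 8) return std::nullopt;  // arbitrary bound; also stops PHI cycles
  const Inst &def = f.at(reg);
  if (def.cls == want) return std::vector<int>{reg};
  if (def.kind == Kind::Def) return std::nullopt;
  std::vector<int> sources;
  for (int op : def.operands) {
    auto sub = findSources(f, op, want, depth + 1);
    if (!sub) return std::nullopt;
    sources.insert(sources.end(), sub->begin(), sub->end());
  }
  return sources;
}

// The x86 example: blocks A and B each produce a VR64 value and bitcast it to
// GR64; block C merges the two GR64 values and bitcasts the result back.
Func buildExample() {
  return {
      {1, {Kind::Def, RegClass::VR64, {}}},       // A: psllq
      {2, {Kind::Bitcast, RegClass::GR64, {1}}},  // A: movd %mm0, %r9
      {3, {Kind::Def, RegClass::VR64, {}}},       // B: por
      {4, {Kind::Bitcast, RegClass::GR64, {3}}},  // B: movd %mm0, %r9
      {5, {Kind::Phi, RegClass::GR64, {2, 4}}},   // C: merged %r9
      {6, {Kind::Bitcast, RegClass::VR64, {5}}},  // C: movd %r9, %mm0
  };
}
```

In this model, `findSources(buildExample(), 5, RegClass::VR64)` reaches registers 1 and 3, so the bitcast that defines register 6 could be replaced by a VR64 PHI of those two values, which leaves all three movd instructions dead. If either path ended at an ordinary GR64 definition, the search would give up and nothing would be rewritten.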

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

Bruno later reverted the commit in r245446. Regardless, once the final patch lands, we should investigate extending it to AArch64.

llvmbot commented 9 years ago

Thank you, Quentin.

I tried to add isBitcast in td files before and it did not work.

The reason is that on AArch64 the real move instruction, fmov, is generated after the peephole optimization. During the peephole optimization, the move is represented by the pseudo COPY instruction.

qcolombet commented 9 years ago

> I am porting this optimization to AArch64 and found the following:
>
> In x86, copies between different data types use different instructions such as MMX_MOVD64grr, so we can add flags to individual copy instructions like Bruno did.
>
> In AArch64, the peephole optimization pass uses the standard pseudo COPY instruction to represent all data movement operations between scalar and vector registers. This COPY instruction is lowered to fmov in a later pass called “Post-RA pseudo instruction expansion pass” by calling AArch64InstrInfo::copyPhysReg.
>
> Is there a good way to label the fmov instruction of AArch64 earlier so that the peephole optimization can recognize it?

You should add isBitcast in the td file.

llvmbot commented 9 years ago

I am porting this optimization to AArch64 and found the following:

In x86, copies between different data types use different instructions such as MMX_MOVD64grr, so we can add flags to individual copy instructions like Bruno did.

In AArch64, the peephole optimization pass uses the standard pseudo COPY instruction to represent all data movement operations between scalar and vector registers. This COPY instruction is lowered to fmov in a later pass called “Post-RA pseudo instruction expansion pass” by calling AArch64InstrInfo::copyPhysReg.

Is there a good way to label the fmov instruction of AArch64 earlier so that the peephole optimization can recognize it?

bcardosolopes commented 9 years ago

Reintroduced in r245479.

Some useful notes:

1) This commit enables PHI lookup only for uncoalescable copy-like instructions, such as cross-register-class bitcasts. "One source + one destination bitcast" is one form of uncoalescable copy, but other forms are currently handled as well; see PeepholeOptimizer::isUncoalescableCopy().

2) Although this commit introduced PHI lookup only for uncoalescable copy-like instructions, everything needed to support PHI lookup for coalescable copies is there as well. I didn't enable it only because I didn't have time to test it. This is also worth investigating (and may be even more profitable than the uncoalescable case).
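To make the distinction in notes 1) and 2) concrete, here is a rough standalone model of the two categories. It is a sketch based on my reading of the notes, not the real PeepholeOptimizer interface: `ToyInst`, its fields, and every function below are hypothetical names, the list of "other forms" is an assumption that should be checked against isUncoalescableCopy() in the source, and the coalescable side is simplified to the plain COPY pseudo.

```cpp
#include <cassert>

// Toy stand-in for the properties the optimizer queries on an instruction.
// Hypothetical fields; the "Like" forms are assumed, not confirmed.
struct ToyInst {
  bool isCopy = false;               // generic COPY pseudo
  bool isBitcast = false;            // target instruction flagged isBitcast
  bool isRegSequenceLike = false;    // assumed other uncoalescable form
  bool isInsertSubregLike = false;   // assumed other uncoalescable form
  bool isExtractSubregLike = false;  // assumed other uncoalescable form
};

// Note 1): bitcasts are one uncoalescable form among several.
bool isUncoalescableCopyLike(const ToyInst &mi) {
  return mi.isBitcast || mi.isRegSequenceLike || mi.isInsertSubregLike ||
         mi.isExtractSubregLike;
}

// Simplified: only the plain COPY pseudo counts as coalescable here.
bool isCoalescableCopyLike(const ToyInst &mi) { return mi.isCopy; }

// Note 2): PHI lookup is currently wired up for the uncoalescable case only.
bool phiLookupEnabled(const ToyInst &mi) { return isUncoalescableCopyLike(mi); }

// x86: the MMX move is a real target instruction carrying the isBitcast flag.
ToyInst x86MmxMovd() {
  ToyInst mi;
  mi.isBitcast = true;
  return mi;
}

// AArch64: at this point the cross-class move is still a generic COPY.
ToyInst aarch64CrossClassCopy() {
  ToyInst mi;
  mi.isCopy = true;
  return mi;
}
```

Under these assumptions, `phiLookupEnabled(x86MmxMovd())` is true while `phiLookupEnabled(aarch64CrossClassCopy())` is false, because the AArch64 move falls into the coalescable category. If that reading is right, enabling PHI lookup for coalescable copies looks like the more direct route for AArch64 than flagging fmov with isBitcast.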