llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Fast register allocator produces code that uses a lot of stack #10447

Open llvmbot opened 13 years ago

llvmbot commented 13 years ago
Bugzilla Link: 10075
Version: trunk
OS: All
Attachments: bitcode testcase
Reporter: LLVM Bugzilla Contributor
CC: @stoklund

Extended Description

I know that the main objective of the fast register allocator is speed, but I am getting test failures at -O0 because the tests run out of stack space :-(

The results I got for the largest function by recompiling the .ii with gcc and clang:

Level  clang       gcc
O0     0x00004690  0x00001560
O1     0x00000868  0x00000b98
O2     0x00000aa8  0x00000bd8
O3     0x00000ab8  0x00000bd8
Os     0x000010b8  0x00000698
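To make the numbers above easier to compare, they can be converted to decimal and expressed as a clang/gcc ratio. This is a minimal sketch that only restates the reported values, which appear to be stack sizes in bytes; the names `RESULTS` and `summarize` are made up for illustration.

```python
# Values copied from the results above: level -> (clang, gcc).
RESULTS = {
    "O0": (0x4690, 0x1560),
    "O1": (0x0868, 0x0B98),
    "O2": (0x0AA8, 0x0BD8),
    "O3": (0x0AB8, 0x0BD8),
    "Os": (0x10B8, 0x0698),
}

def summarize(results):
    """Return (level, clang, gcc, clang/gcc ratio) rows in decimal."""
    return [(lvl, c, g, c / g) for lvl, (c, g) in results.items()]

rows = summarize(RESULTS)
for lvl, c, g, ratio in rows:
    print(f"{lvl}: clang={c} gcc={g} ratio={ratio:.2f}")
```

At -O0 this gives 18064 for clang against 5472 for gcc, roughly 3.3 times as much, while at -O1 through -O3 clang comes out smaller than gcc.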

I then found that most of the -O0 to -O1 difference was because of the register allocator:

$ llc jsinterp.bc -o jsinterp.o -filetype=obj -regalloc=greedy -O0
$ otool -t -v jsinterp.o | grep -A 8 __ZN2js9InterpretEP9JSContextPNS_10StackFrameEjNS_10InterpModeE | grep sub.*rsp

000000000000001a subq $0x00001c78,%rsp

$ llc jsinterp.bc -o jsinterp.o -filetype=obj -regalloc=fast -O0
$ otool -t -v jsinterp.o | grep -A 8 __ZN2js9InterpretEP9JSContextPNS_10StackFrameEjNS_10InterpModeE | grep sub.*rsp

0000000000000010 subq $0x000045d8,%rsp
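The two `subq` lines can be turned into byte counts mechanically. The sketch below is one way to do it, assuming AT&T-syntax disassembly like the otool output shown here; the helper name `frame_size` is hypothetical and not part of any LLVM tool.

```python
import re

# Matches stack adjustments such as "subq $0x00001c78,%rsp".
# The immediate may be hex (with a 0x prefix) or plain decimal.
_SUB_RSP = re.compile(r"subq\s+\$(0x[0-9a-fA-F]+|\d+),\s*%rsp")

def frame_size(disasm_line):
    """Return the stack adjustment in bytes, or None if the line has none."""
    m = _SUB_RSP.search(disasm_line)
    if m is None:
        return None
    imm = m.group(1)
    base = 16 if imm.lower().startswith("0x") else 10
    return int(imm, base)

greedy = frame_size("000000000000001a subq $0x00001c78,%rsp")
fast = frame_size("0000000000000010 subq $0x000045d8,%rsp")
print(greedy, fast, round(fast / greedy, 2))
```

For the two lines above this yields 7288 bytes with the greedy allocator and 17880 bytes with the fast one, so the fast allocator's frame is about 2.45 times larger.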

1ba3d143-a64b-4671-82b2-0b31cfb91709 commented 13 years ago

r132900 helps a bit:

__ZN2js9InterpretEP9JSContextPNS_10StackFrameEjNS_10InterpModeE:
        pushq   %rbx
        subq    $14112, %rsp            ## imm = 0x3720

1ba3d143-a64b-4671-82b2-0b31cfb91709 commented 13 years ago

Another problem is PHI elimination. It creates extra join registers that are also spilled. This could be improved in the common case where there are no critical edges.
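For reference, an edge is critical when its source block has more than one successor and its destination block has more than one predecessor, so a copy for a PHI cannot be placed in either block without affecting another path. The sketch below finds such edges in a CFG given as a plain successor map. The representation and the name `critical_edges` are illustrative assumptions, not LLVM's API.

```python
from collections import defaultdict

def critical_edges(succs):
    """Return the critical edges of a CFG given as {block: [successors]}."""
    # Build the predecessor sets from the successor map.
    preds = defaultdict(set)
    for src, dsts in succs.items():
        for dst in dsts:
            preds[dst].add(src)
    # Critical: source has several successors AND destination has several predecessors.
    return [
        (src, dst)
        for src, dsts in succs.items()
        for dst in dsts
        if len(set(dsts)) > 1 and len(preds[dst]) > 1
    ]

# A full if/else diamond has no critical edges.
diamond = {"entry": ["then", "else"], "then": ["join"], "else": ["join"], "join": []}
# An if without an else: the edge entry -> join is critical.
if_only = {"entry": ["then", "join"], "then": ["join"], "join": []}

print(critical_edges(diamond))
print(critical_edges(if_only))
```

In the diamond case every copy feeding the join can live at the end of `then` or `else`, which is the common case the comment refers to.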

1ba3d143-a64b-4671-82b2-0b31cfb91709 commented 13 years ago

Clang is trying to be clever and sometimes passes temporaries between blocks in registers, even creating some phis.

RAFast spills all global live ranges, so it isn't really helping.
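As a rough illustration of why this costs stack space, the toy model below counts virtual registers that are defined in one block and used in another, and charges one stack slot for each. It is a deliberately simplified sketch, not RAFast's actual algorithm: it assumes each register has a single definition, and the 8-byte slot size is an assumption for 64-bit values. The names `global_vregs` and `estimated_spill_bytes` are hypothetical.

```python
def global_vregs(blocks):
    """Return vregs used in a block other than the one that defines them.

    `blocks` maps a block name to a list of (defs, uses) instruction tuples.
    """
    def_block = {}
    for name, insts in blocks.items():
        for defs, _ in insts:
            for v in defs:
                def_block.setdefault(v, name)  # assumes one definition per vreg
    crossing = set()
    for name, insts in blocks.items():
        for _, uses in insts:
            for v in uses:
                if v in def_block and def_block[v] != name:
                    crossing.add(v)
    return crossing

def estimated_spill_bytes(blocks, slot_size=8):
    """One stack slot per cross-block vreg (slot_size is an assumption)."""
    return len(global_vregs(blocks)) * slot_size

# t0 stays inside "entry"; t1 is passed to "next" and so needs a slot.
example = {
    "entry": [(["t0"], []), (["t1"], ["t0"])],
    "next": [(["t2"], ["t1"])],
}
print(global_vregs(example), estimated_spill_bytes(example))
```

Under this model every temporary that clang passes between blocks adds a slot, which matches the observation that doing so does not help when RAFast is the allocator.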