Open Quuxplusone opened 15 years ago
Bugzilla Link | PR4357 |
Status | NEW |
Importance | P normal |
Reported by | Evan Cheng (evan.cheng@apple.com) |
Reported on | 2009-06-10 12:43:58 -0700 |
Last modified on | 2014-10-11 00:04:10 -0700 |
Version | trunk |
Hardware | PC All |
CC | anton@korobeynikov.info, devang.patel@gmail.com, freik@fb.com, llvm-bugs@lists.llvm.org, llvm@sunfishcode.online, nicholas@mxc.ca, quickslyver@free.fr |
Another test case. This is even worse.
int hcf(int a, int b)
{
    if (a == 0) return b;
    else if (a < b) return hcf(a, b-a);
    else return hcf(a-b, b);
}
I don't understand why you use the term "even more horrible" to mean that the
llvm-gcc code is more horrible than the icc code.
I don't see any difference in complexity between the icc code and the llvm-gcc code.
It seems that the "-simplifycfg" pass near the end of the optimization passes
generates the bad code with clang.
Without this pass, clang outputs the following code, which is very close to the icc
code: both have 2 mov, 4 jmp/jcc, 1 test, 1 cmp, 2 sub, 1 ret.
hcf:
.LBB1_0: # entry
movl 8(%esp), %eax
movl 4(%esp), %ecx
.align 16
.LBB1_1: # while.cond.outer
testl %ecx, %ecx
je .LBB1_5 # while.end.split
jmp .LBB1_3 # while.cond
.align 16
.LBB1_2: # if.then
subl %ecx, %eax
.LBB1_3: # while.cond
cmpl %eax, %ecx
jl .LBB1_2 # if.then
.LBB1_4: # if.else.split
subl %eax, %ecx
jmp .LBB1_1 # while.cond.outer
.LBB1_5: # while.end.split
ret
Is this one any better lately?
No, there is still a problem with clang.
I've got a fix for this (it's not SimplifyCFG, it's an overly simplistic heuristic in GVN), but I need to see if the solution is worse than the problem in other perf situations. The inner loop top alignment is probably also a poor decision, but that's a different issue entirely.
Also: the code coming out of ARMv7 codegen without the fix is even worse than the x86 output.