Performance regression when LTO is enabled #20002

Open Quuxplusone opened 10 years ago

Quuxplusone commented 10 years ago
Bugzilla Link: PR20004
Status: NEW
Importance: P normal
Reported by: learnopengles@gmail.com
Reported on: 2014-06-11 07:47:59 -0700
Last modified on: 2016-03-10 07:19:30 -0800
Version: 3.4
Hardware: Macintosh MacOS X
CC: llvm-bugs@lists.llvm.org, rafael@espindo.la, shlomif@shlomifish.org, spatel+llvm@rotateright.com
Without LTO:

clang PerformanceTest.cpp dsp.cpp -std=c++11 -ffast-math -O3 -o PerformanceTest

Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.2.0
Thread model: posix

Iterations: 955
C results: 100,043,846 shorts per second.

With LTO:

clang PerformanceTest.cpp dsp.cpp -std=c++11 -ffast-math -flto -O3 -o PerformanceTest

Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.2.0
Thread model: posix

Iterations: 586
C results: 61,349,033 shorts per second.
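
To put the two runs side by side, here is a minimal sketch of the arithmetic on the throughput figures reported above. The helper name `relative_change` is made up for illustration and is not part of the test program.

```python
def relative_change(baseline: float, measured: float) -> float:
    """Return the fractional change of `measured` relative to `baseline`."""
    return (measured - baseline) / baseline


# Throughput figures copied from the two runs above (shorts per second).
without_lto = 100_043_846
with_lto = 61_349_033

change = relative_change(without_lto, with_lto)
print(f"Throughput change with LTO: {change:.1%}")  # prints -38.7%
```

So the LTO build processes roughly 39% fewer shorts per second, which is consistent with the drop in iterations (955 to 586).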

The code can be downloaded from here:
https://gist.github.com/learnopengles/004ff4eee75057ca006c

Quuxplusone commented 9 years ago

This LTO regression also exists on 3.6.

Quuxplusone commented 8 years ago

I'm running into a similar problem with Freecell Solver’s git master on Mageia Linux x86-64 Cauldron with clang-3.7.1-4.mga6. With clang, the runtime is 9.00627183914185 seconds without -flto and 9.06325387954712 seconds with it (and with -fuse-ld=gold). According to my earlier post at https://groups.yahoo.com/neo/groups/fc-solve-discuss/conversations/messages/1442, the difference used to be much more dramatic.