Open Quuxplusone opened 12 years ago
Bugzilla Link | PR12869 |
Status | NEW |
Importance | P enhancement |
Reported by | Kostya Serebryany (kcc@google.com) |
Reported on | 2012-05-18 09:03:43 -0700 |
Last modified on | 2012-05-20 13:17:19 -0700 |
Version | trunk |
Hardware | PC Linux |
CC | anton@korobeynikov.info, baldrick@free.fr, evan.cheng@apple.com, llvm-bugs@lists.llvm.org, nicholas@mxc.ca, nlewycky@google.com, resistor@mac.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
Kostya, what if you pass the -combiner-aa argument to llc?
(In reply to comment #1)
> Kostya, what if you pass the -combiner-aa argument to llc?
Actually, the code for "foo" is longer and the code for "bar" is shorter. "llc -combiner-alias-analysis -combiner-global-alias-analysis" makes the output identical.
Sorting blocks into loads+ops+stores is one of my old todo-list wishlist items. It makes a lot of things easier to analyze, and lets backends do trivial load-fusion and store-fusion. We should do this as an IR pass, and it should turn @bar into @foo.
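As a rough illustration of the "loads+ops+stores" ordering described above, here is a minimal C sketch. The struct and function names are hypothetical and are not the @foo/@bar from this report; an actual pass would reorder instructions inside LLVM IR basic blocks rather than C source. The two fields are distinct members of one object, so moving the load of `b` above the store to `a` cannot change the result.

```c
/* Hypothetical example; names are illustrative, not taken from the report. */
struct pair { int a; int b; };

/* Interleaved: load, op, store for each field in turn. */
void update_interleaved(struct pair *p) {
    p->a = p->a + 1;
    p->b = p->b * 2;
}

/* Grouped: all loads, then all ops, then all stores. */
void update_grouped(struct pair *p) {
    int a = p->a;      /* loads */
    int b = p->b;
    a = a + 1;         /* ops */
    b = b * 2;
    p->a = a;          /* stores */
    p->b = b;
}

/* Returns 1 when both orderings leave the struct in the same state. */
int forms_agree(int a, int b) {
    struct pair x = { a, b };
    struct pair y = { a, b };
    update_interleaved(&x);
    update_grouped(&y);
    return x.a == y.a && x.b == y.b;
}
```

If the two values were reached through unrelated pointers instead of fields of one struct, the grouped form would only be valid when alias analysis can prove the locations do not overlap.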
>> Actually, the code for "foo" is longer and for "bar" - shorter.
Sure. Meant to say "for one function", not "for function one".
>> "llc -combiner-alias-analysis -combiner-global-alias-analysis" makes the output identical.
Coolness! Any plans to enable this by default?
OTOH Nick's suggestion to implement this at the IR level makes sense too.
(In reply to comment #4)
> Coolness! Any plans to enable this by default?
> OTOH Nick's suggestion to implement this at the IR level makes sense too.
Well, it has been "experimental" for something like 2 or 3 years already. Maybe Evan or Owen can comment on why it's not turned on yet.
It's poorly tested, expensive, and showed little benefit on most test suites when we tried it. Beyond that, it's rapidly being superseded by Andy's new scheduler work.
(In reply to comment #6)
> It's poorly tested, expensive, and showed little benefit on most test suites
> when we tried it. Beyond that, it's rapidly being superseded by Andy's new
> scheduler work.
Well, I know of at least 2 cases where it gave much better results:
1. EEMBC on ARM
2. It's impossible to match mem-mem instructions without these options. Right now the only target in the tree that has mem-mem instructions is msp430 :)
(In reply to comment #6)
> Beyond that, it's rapidly being superseded by Andy's new scheduler work.
How will the scheduler help to fold memory operands, btw?