Bugzilla Link | PR24817 |
Status | CONFIRMED |
Importance | P normal |
Reported by | Carrot (carrot@google.com) |
Reported on | 2015-09-14 17:08:17 -0700 |
Last modified on | 2017-05-24 09:56:40 -0700 |
Version | trunk |
Hardware | PC Linux |
CC | hfinkel@anl.gov, kit.barton@gmail.com, llvm-bugs@lists.llvm.org, nemanja.i.ibm@gmail.com, wschmidt@linux.vnet.ibm.com |
The reason this happens is that we have the AddedComplexity set on VSX
instructions to favour them and the VSX indexed loads/stores are defined within
that block.
Of course, the exact same thing happens with double-precision values, so this
isn't P8-specific: indexed double-precision VSX loads have been available since
P7.
The easy solution for this problem is to extract the VSX indexed loads from the
AddedComplexity block. However, we would then never load such values into the
upper VSX registers.
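For illustration only, here is a hypothetical reproducer of the kind of source this affects. It is not taken from the report; the function name and the offset are made up. The load address is a base pointer plus a small constant, so a D-form load such as lfs could encode the displacement directly, whereas an indexed VSX load such as lxsspx first needs the constant materialized in a register.

```c
/* Hypothetical reproducer: name and offset are illustrative, not from the PR. */
float load_third(const float *p) {
    /* The address is p + 8 bytes, i.e. base plus a constant displacement.
     * A D-form load can carry that 8 as an immediate; an indexed (X-form)
     * load needs it placed in a general-purpose register first. */
    return p[2];
}
```

Comparing the assembly for a P7 or P8 target with and without VSX enabled should show whether the indexed form is selected; the exact output depends on the compiler revision.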
(In reply to comment #1)
> The reason this happens is that we have the AddedComplexity set on VSX
> instructions to favour them and the VSX indexed loads/stores are defined
> within that block.
> Of course, the exact same thing happens with double-precision values, so
> this isn't P8-specific: indexed double-precision VSX loads have been
> available since P7.
>
> The easy solution for this problem is to extract the VSX indexed loads from
> the AddedComplexity block. However, we would then never load such values
> into the upper VSX registers.
It might be better to peephole this after register allocation. You want the
larger register classes to be available during RA for high-register-pressure
situations. If that turns out to be unnecessary, we could relax after the fact.
This approach is not optimal, but it seems better than the other relatively
easy alternatives.
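A minimal sketch of what such a post-RA relaxation might check, written as a toy model in C. The types, field names, and the function relax_to_dform are all hypothetical and are not LLVM's MachineInstr API; the only facts assumed are that VSX registers 0 to 31 overlap the FPRs and that D-form displacements are signed 16-bit immediates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical types, not LLVM data structures. */
enum opcode { OP_LXSDX, OP_LFD };

struct load {
    enum opcode op;
    int dst_vsr;          /* allocated VSX register number, 0..63 */
    bool index_is_const;  /* index register holds a known constant */
    int64_t index_value;  /* that constant; valid only if index_is_const */
};

/* VSX registers 0..31 overlap the FPRs, so a D-form FP load can target them. */
static bool in_fpr_half(int vsr) { return vsr >= 0 && vsr < 32; }

/* D-form displacements are signed 16-bit immediates. */
static bool fits_dform(int64_t v) { return v >= INT16_MIN && v <= INT16_MAX; }

/* Rewrite an indexed load to D-form when the allocator happened to pick a
 * low register and the index is a small constant. Returns true on rewrite. */
bool relax_to_dform(struct load *ld) {
    if (ld->op != OP_LXSDX) return false;
    if (!in_fpr_half(ld->dst_vsr)) return false;
    if (!ld->index_is_const || !fits_dform(ld->index_value)) return false;
    ld->op = OP_LFD;
    return true;
}
```

Because the check runs after allocation, the full 64-register class stays available during RA, and only loads that ended up in the lower half are rewritten. A real pass would also have to confirm that the register holding the constant has no other uses before deleting the instruction that materializes it.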
Since Power9 has scalar D-Form loads and stores that can operate on the full
VSX register set, this is not an issue on Power9. The code produced for this on
Power9 is the same as the code on Power7.
I think that given the limitation of Power8 in terms of not having D-Form loads
for all the VSX registers along with the intent to favour VSX due to the larger
register set, we should leave the Power8 code as is and close this PR. Please
let me know what you think.
Carrot, are you OK with closing this PR as a limitation on P8?
It's OK with me.