Open Quuxplusone opened 8 years ago
Hi Wei,
Sorry about that. Is it possible to get the IR being input into the SimplifyCFG pass, and how it differs?
If CFG sinking is disabled, I'd presume this testcase would regress significantly, because the selects would not even be produced. Is this correct, or does CFG sinking actually regress performance?
James
Attached r282452.ll
(1690 bytes, application/octet-stream): IR generated by r282452 before cfgsimplify
Attached r282453.ll
(1690 bytes, application/octet-stream): IR generated by r282453 before cfgsimplify
Hi James,
> Hi Wei,
>
> Sorry about that. Is it possible to get the IR being input into the
> SimplifyCFG pass, and how it differs?
I attached the IR.
>
> If CFG sinking is disabled, I'd presume this testcase would regress
> significantly, because the selects would not even be produced. Is this
> correct, or does CFG sinking actually regress performance?
>
> James
I tried opt -O2 -simplifycfg-sink-common=false on r282452.ll and r282453.ll.
In both cases the select was still generated and the final code was what we
expect. It is earlycse that eliminates the redundant load.
Wei.
Hi Wei,
Thanks for this. What's happening seems to be that, in the r282453 case, the
load of %maxarray.addr is first in both conditional blocks. This allows
simplifycfg to *hoist* it, which makes sinking see that the GEPs are of the
same base pointer.
We have heuristic bailouts so we don't produce nasty GEPs, but a GEP with only
a variable (PHI) final operand is allowed.
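
To make the shape of that sinking concrete, here is a rough source-level
analogue in C++. It is only a sketch: the function names, the parameter names
and the use of an int array are assumptions for illustration and are not taken
from the attached IR.

```cpp
#include <cassert>

// Hypothetical analogue of the IR *before* sinking: each conditional block
// computes its own address (GEP) and performs its own load.
int load_in_each_branch(const int *maxarray, bool cond, int i, int j) {
    int r;
    if (cond) {
        r = maxarray[i];  // GEP(maxarray, i) followed by a load
    } else {
        r = maxarray[j];  // GEP(maxarray, j) followed by a load
    }
    return r;
}

// Hypothetical analogue *after* sinking: the branches only choose the index
// (a PHI, which can later become a select), and a single GEP with a variable
// final operand plus a single load remain in the join block.
int load_after_sinking(const int *maxarray, bool cond, int i, int j) {
    int idx = cond ? i : j;  // PHI/select of the final GEP operand
    return maxarray[idx];    // one GEP, one load
}
```

Both functions return the same value; the difference is only whether the load
sits in each branch or after the join.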
I'm really not sure what to do here. GEPs with a single variable final operand
were even allowed to be sunk before I rewrote this code, so adding a bailout
for that could cause all sorts of problems...
The issue with this optimization is that some testcases *require* it to fire
before SROA, and others *require* it not to fire before SROA. It's very
difficult to please all consumers.
I'd like to get around to committing the GVN-sinking pass, which would allow us
to cripple simplifycfg to only handle the most trivial cases (without which
SROA can't handle some cases) and to do the heavy lifting *after SROA* in
GVN-sink.
In the meantime I'm open to suggestions about what to do for your testcase :(
Cheers,
James
(In reply to comment #5)
> Hi Wei,
>
> Thanks for this. What's happening seems to be that, in the r282453 case, the
> load of %maxarray.addr is first in both conditional blocks. This allows
> simplifycfg to *hoist* it, which makes sinking see that the GEPs are of the
> same base pointer.
>
> We have heuristic bailouts so we don't produce nasty GEPs, but a GEP with
> only a variable (PHI) final operand is allowed.
>
> I'm really not sure what to do here. GEPs with a single variable final
> operand were even allowed to be sunk before I rewrote this code, so adding a
> bailout for that could cause all sorts of problems...
>
Thanks for looking at the problem.
Could you be more specific, or even better, provide a testcase showing what
kind of problem we would see if we bail out for all GEPs?
>
> The issue with this optimization is that some testcases *require* it to fire
> before SROA, and others *require* it not to fire before SROA. It's very
> difficult to please all consumers.
>
> I'd like to get around to committing the GVN-sinking pass, which would allow
> us to cripple simplifycfg to only handle the most trivial cases (without
> which SROA can't handle some cases) and to do the heavy lifting *after SROA*
> in GVN-sink.
>
> In the meantime I'm open to suggestions about what to do for your testcase :(
I was thinking about whether extending instcombine may be easier. The testcase
can be fixed by supporting the transformation of
load(gep(array, select(idx1, idx2))) ==> select(load(gep(array, idx1)),
load(gep(array, idx2))).
But we would need to apply isSafeToLoadUnconditionally to a new GEP node
without actually inserting the new node into the IR. That seems to be a
problem. In addition, we may have load(gep(gep...(select)))... It will be nasty.
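
As a rough illustration of the proposed rewrite, the two forms can be written
as C++ functions. This is a sketch under assumptions: the names are
hypothetical, and the explicit `cond` parameter stands for the select
condition that the shorthand above leaves out.

```cpp
#include <cassert>

// load(gep(array, select(cond, idx1, idx2))):
// the index is selected first, then a single load is performed.
int load_of_selected_index(const int *array, bool cond, int idx1, int idx2) {
    return array[cond ? idx1 : idx2];
}

// select(cond, load(gep(array, idx1)), load(gep(array, idx2))):
// both loads execute unconditionally and the result is selected afterwards.
// This is only valid when BOTH addresses are safe to load from, which is the
// property the isSafeToLoadUnconditionally check would have to establish.
int select_of_loads(const int *array, bool cond, int idx1, int idx2) {
    int v1 = array[idx1];
    int v2 = array[idx2];
    return cond ? v1 : v2;
}
```

The second form gives the same result only if the load on the not-taken side
is also in bounds, which is why the safety check has to be answered for an
address that does not yet exist in the IR.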
Thanks,
Wei.