Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

after fast regalloc: *** Bad machine code: Using an undefined physical register *** #32649

Closed Quuxplusone closed 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR33677
Status RESOLVED FIXED
Importance P enhancement
Reported by Jonas Paulsson (paulsson@linux.vnet.ibm.com)
Reported on 2017-07-03 00:54:24 -0700
Last modified on 2017-07-07 12:27:54 -0700
Version trunk
Hardware PC Linux
CC ditaliano@apple.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, paulsson@linux.vnet.ibm.com, quentin.colombet@gmail.com
Fixed by commit(s)
Attachments tc_rafast.ll (3004 bytes, text/plain)
Blocks
Blocked by
See also
Created attachment 18744
reduced testcase

bin/llc -verify-machineinstrs -regalloc=fast

BB#6:
    Predecessors according to CFG: BB#3
        %vreg9:subreg_hl32<def> = COPY %vreg9:subreg_l32; GR128Bit:%vreg9
        %vreg17<def> = COPY %vreg9:subreg_l64; GR64Bit:%vreg17 GR128Bit:%vreg9
        J <BB#5>
    Successors according to CFG: BB#5(?%)

->

# After Fast Register Allocator
BB#6:
    Predecessors according to CFG: BB#3
        %R0Q<def> = L128 <fi#0>, 0, %noreg; mem:LD16[FixedStack0](align=8)
        %R0L<def> = COPY %R1L, %R0Q<imp-use,kill>
        %R2D<def> = COPY %R1D
        ST128 %R0Q<kill>, <fi#0>, 0, %noreg; mem:ST16[FixedStack0](align=8)
        STG %R2D<kill>, <fi#1>, 0, %noreg; mem:ST8[FixedStack1]
        J <BB#5>
    Successors according to CFG: BB#5(?%)

*** Bad machine code: Using an undefined physical register ***
- function:    main
- basic block: BB#6  (0x432b5cb0)
- instruction: %R2D<def> = COPY
- operand 1:   %R1D
Quuxplusone commented 7 years ago

Attached tc_rafast.ll (3004 bytes, text/plain): reduced testcase

Quuxplusone commented 7 years ago

Please note this was found on SystemZ (-mtriple=s390x-linux-gnu), and I haven't checked if it is triggered on any other target.

Quuxplusone commented 7 years ago

I am not saying we shouldn't fix the problem, but you're using a configuration that is not well tested. You run the fast regalloc in optimized mode, i.e., with potentially complicated live-ranges. In particular, it is not unlikely that the fast regalloc does not set the proper undef flags when sub registers are used.

In other words, if you want to workaround the problem use the fast regalloc only in O0 (or at least with coalescing disabled -join-liveintervals=false).

Quuxplusone commented 7 years ago
I'd argue the problem is that we wrongly add a kill flag on the implicit use on
that instruction:
%R0L<def> = COPY %R1L, %R0Q<imp-use,kill>

Looking.
Quuxplusone commented 7 years ago
I talked with Matthias and he said that with subreg liveness we usually put
both the implicit-use and implicit-def.
So instead of not setting the kill flag, the proper fix would be to add an
implicit-def of the full register.
Quuxplusone commented 7 years ago
Putting my thoughts here.
Actually, looking closer at the issue, I am back to my original though that we
shouldn't set the kill in that example.
The implicit-use + implicit-def trick makes sense only if they happen on the
same instruction, which is the case in the example, but is wrong if the last-
use since this new definition is far appart.
E.g.,
%R0L<def> = COPY %R1L, %R0Q<imp-use,kill>, %R0Q<imp-def>
Is fine and makes sense, but
... = OP1 %R1L, %R0Q<imp-use,kill>
[...]
%R0L<def> = OP2 ... %R0Q<imp-def>
Is not fine and potentially dangerous because one would be able to reuse R1L
between OP1 and OP2.

Anyhow, going to remove the kill flags when subreg are involved. That's going
to be conservatively correct.
Quuxplusone commented 7 years ago
Remove the wrong kill flag as r307428.
Thanks for the report.