Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

[DebugInfo@O2] SelectionDAGs debug inst scheduling is imperfect and often broken #40553

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR41583
Status NEW
Importance P normal
Reported by Jeremy Morse (jeremy.morse.llvm@gmail.com)
Reported on 2019-04-24 09:27:02 -0700
Last modified on 2019-04-24 09:28:26 -0700
Version trunk
Hardware PC Linux
CC aprantl@apple.com, bjorn.a.pettersson@ericsson.com, chackz0x12@gmail.com, greg.bedwell@sony.com, llvm-bugs@lists.llvm.org, orlando.hyams@sony.com, paul.robinson@am.sony.com, stephen.tozer@sony.com
Fixed by commit(s)
Attachments sched.ll (33744 bytes, text/plain)
ubd.ll (192889 bytes, text/plain)
Blocks
Blocked by
See also PR40010
Hi,

[Vaguer titles are available on request],

tl;dr: some parts of SelectionDAGs debug variable location handling consider
where a VReg will be defined; some parts consider the correct ordering of
DBG_VALUE insts with regard to how instructions have been scheduled; but
nothing considers both at the same time. Values can be placed in both short
and long term VRegs, which mixes badly with stricter VReg liveness rules
for debug info I've been trying to implement.

This is a writeup of some trouble that's been brewing in PR40010 / D56151. To
summarise the story so far:
 * We discovered having a DBG_VALUE refer to a dead VReg presents problems
   for post-SSA pre-regalloc optimisation passes,
 * D56151 [-1] conservatively eliminates some of those DBG_VALUEs,
 * It turns out SelectionDAG is generating dead VReg references where it
   could be avoided, which D56151 causes to be dropped.

In the D56151 comments Bjorn presents some examples of where the
conservative nature of the patch creates avoidable regressions. Up until
now it's not been important for SelectionDAG to ensure that DBG_VALUEs
have live operands, as even debug-use-before-def is tolerated, but that's
now no longer the case. Here's an X86 example of what Bjorn encountered (I
can cough up a full function):

block:
  %Var23 = getelementptr inbounds %"val", %"val"* %Var20, i64 0, i32 1
  %loaded = load i8, i8* %Var23
  %cmp24 = icmp eq i8 %loaded, 17
  %casted = bitcast %"val"* %Var20
  call void @llvm.dbg.value(metadata %"smthn"* %casted, [...])
  br i1 %cmp24, label %looper, label %return

Important points: %Var20 is a live-in VReg during isel, and %casted a
live-out / exported VReg. We get the following DAG:

t0: ch = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %4,
  t12: ch = CopyToReg t0, Register:i64 %5, t2,
    t7: i8,ch = MOV8rm<Mem:(load 1 from %ir.Var23)> t2,
              TargetConstant:i8<1>, Register:i64 $noreg,
              TargetConstant:i32<8>, Register:i32 $noreg, t0,
  t21: i8,i32 = SUB8ri t7, TargetConstant:i8<17>,
t27: ch,glue = CopyToReg t12, Register:i32 $eflags, t21:1
  t24: ch = JCC_1 BasicBlock:ch<lor.lhs.false.i.i.i.i.i 0x835c0e8>,
            TargetConstant:i8<4>, t27, t27:1
t16: ch = JMP_1 BasicBlock:ch<return 0x835c668>, t24

And the following MIR (where VReg %4 == IR %Var20 and
VReg %5 == IR %casted):

    %25:gr8 = MOV8rm %4, 1, $noreg, 8, $noreg :: (load 1 from %ir.Var23)
    %26:gr8 = SUB8ri %25, 17, implicit-def $eflags
    %5:gr64 = COPY %4
    DBG_VALUE %4, $noreg, !9788, !DIExpression()
    JCC_1 %bb.2, 4, implicit $eflags
    JMP_1 %bb.8

Observe: the DBG_VALUE %4 lands after a copy from %4, and due to the
structure of the surrounding program, is dead after that COPY. That is
considered to be invalid by the code in D56151. In terms
of an initial cause: the DBG_VALUE gets %4 instead of %5 because the
SelectionDAG ValueMap (from Values to SDNodes) points %casted => t2, the
CopyFromReg, rather than the CopyToReg. This is because SelectionDAGBuilder
can see the bitcast is a no-op. The situation is repeated throughout
SelectionDAG (not just no-op copies), in that there's a difference between
when the value is _computed_, and when it is then copied to its long-term
(i.e. exported to other blocks) VReg.

Moving on from what actually happens, to what we _could_ do with
SelectionDAG, in our existing implementation there are two important
choices in these situations, using the example above to illustrate:
 * Whether the DBG_VALUE refers to %4 or %5, CopyFromReg or CopyToReg,
   or more generally where the value is computed and where it is assigned
   to its long-term VReg,
 * Which SDNode the SDDbgValue is attached to, if at all, which gives us
   three block locations the DBG_VALUE can be inserted at: after the
   CopyFromReg, after the CopyToReg, or where the IR location numbers say
   it should go [0].

That gives six combinations of operand/location. IMO, all the combinations
are flawed in some way, which I'll enumerate:
  %4@CopyFromReg: DBG_VALUE potentially emitted much earlier than it should
     be, at whatever the earliest use of the VReg is, likely out of order,
  %5@CopyFromReg: Guaranteed debug use-before-def,
  %4@CopyToReg: DBG_VALUE potentially refers to a dead VReg,
  %5@CopyToReg: VReg is definitely live, but potentially scheduled much
    later than it should be. See the attached [1], where a CopyToReg is
    scheduled far beyond where its source SDNode is defined,
  %4@IrLocationNo: DBG_VALUE potentially refers to a dead VReg,
  %5@IrLocationNo: Both potentially too late *or* use-before-def, depending
    on the scheduling. This use-before-def scenario happens already it
    turns out, see the attached [2].

Right now, SelectionDAG picks %4@CopyFromReg, and falls back to
%4@IrLocationNo if it can't emit the DBG_VALUE there. The dead debug VReg
use in the example above comes from the %4@IrLocationNo fallback.

To repeat the tl;dr at the top of this PR, and considering those
combinations, nothing in the current LLVM implementation considers both the
location and the VReg that DBG_VALUEs can refer to at the same time.
Making non-live DBG_VALUE operands invalid just highlights the problem,
and that there's no overall safe option to pick. Some kind of new and
automagical implementation would be needed to fix this (IMHO, YMMV).

[I'll propose some kind of compromise for D56151 as I really need to draw a
line under the patches currently in flight].

[-1] https://reviews.llvm.org/D56151
[0] https://github.com/llvm/llvm-
project/blob/master/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp#L916
[1] See attached "sched.ll", compile with "llc input.ll -o -
-stop-before=expand-isel-pseudos" (I have a slightly old r358013) and
examine bb.16, labelled "_ZN4llvm7APFloatC2ERKNS_12fltSemanticsE.exit" in
the source IR. There, The result of the first call instruction is copied
("%40:gr32 = COPY %163") to an exported VReg much much later than the
defining instruction (the call) occurs.
[2] See attached "ubd.ll", compiled as with [1], and observe in bb.1 / the
"if.end" block, the DBG_VALUE for %add / !1771 lands before the
instruction that computes it. This happens because the icmp instruction
gets folded into the subtract, forcing earlier dbg.value intrinsics to be
placed ahead of the subtract. Scanning a clang-3.4 build for these
scenarios reveals 36265 debug use-before-defs where the def is in the
same block; I believe those are all due to SelectionDAG as placeDbgValues
fixes up any IR use-before-defs right now. Bonus points: this example is
fixed/erased by D56151, and would present illegal variable values to the
developer otherwise.
Quuxplusone commented 5 years ago

Attached sched.ll (33744 bytes, text/plain): Example of a CopyToReg being scheduled very late in a block

Quuxplusone commented 5 years ago

Attached ubd.ll (192889 bytes, text/plain): Example of SelectionDAG generating a use-before-def after unfortunate IR inst scheduling