Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Compile-time explosion in Live Debug Variables #32702

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR33730
Status NEW
Importance P normal
Reported by Robert Lougher (rob.lougher@gmail.com)
Reported on 2017-07-10 10:42:27 -0700
Last modified on 2017-07-13 23:42:09 -0700
Version trunk
Hardware All All
CC aprantl@apple.com, dblaikie@gmail.com, ditaliano@apple.com, greg.bedwell@sony.com, llvm-bugs@lists.llvm.org, paul_robinson@playstation.sony.com, quentin.colombet@gmail.com, warren_ristow@playstation.sony.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
The following program takes over 40 minutes to compile with -g (3.5 GHz Core-
i7):

=============================== foo.c ======================================
extern int foobar(int, int, int, int, int);

int F(int i1, int i2, int i3, int i4, int i5) {
  return foobar(i1, i2, i3, i4, i5);
}

#define L3(a, b, c, d, e) \
  F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + \
  F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + \
  F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e)

#define L2(a, b, c, d, e) \
  L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + \
  L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e)

#define L1(a, b, c, d, e) \
  L2(a,b,c,d,e) + L2(a,b,c,d,e) + L2(a,b,c,d,e) + L2(a,b,c,d,e) + L2(a,b,c,d,e) + \
  L2(a,b,c,d,e) + L2(a,b,c,d,e) + L2(a,b,c,d,e) + L2(a,b,c,d,e) + L2(a,b,c,d,e)

int foo(int a, int b, int c, int d, int e) {
  return L1(a,b,c,d,e);
}
================================================================================

$ time clang foo.c -O2 -c -g -mllvm -stats 2>&1 | grep livedebug

 6384827 livedebugvars         - Number of DBG_VALUEs inserted

real    42m55.033s
user    42m39.912s
sys 0m12.339s

From the stats, we can see that Live Debug Variables has inserted over 6
million DBG_VALUEs.  Compiling without -g it takes less than a minute:

$ time clang foo.c -O2 -c

real    0m47.132s
user    0m47.067s
sys 0m0.012s

The program creates 1500 identical inlined calls to foobar().  Each inlined
call generates 5 DBG_VALUEs for i1, i2, etc.  All of the inlined calls are
contained within the same basic-block.

Live Debug Variables is split into 3 parts.  The first part is an initial
analysis pass which removes the DBG_VALUEs (those that refer to virtual
registers), and replaces them with data structures that track the live user
variables.  This pass occurs before register allocation.

As mentioned above, each variable i1, i2, etc. has 1500 DBG_VALUEs.  Each of
these produces a separate entry in the data structure as they have different
debug locations (so there are now 1500 user variables for i1, 1500 for i2,
etc.).

Each set of 1500 user variables have the same location (virtual register
containing a copy of the incoming parameter a, b, c, etc.).  When computing the
live intervals, each user variable ends up with an interval corresponding
mostly to the live interval of the virtual register.  This covers almost the
entire basic-block, spanning all of the inlined calls.  I believe this is the
fundamental cause of the problem.  Each of the original DBG_VALUEs are
associated to only a small part of the entire live interval, corresponding to a
single inlined call.  However, by giving all the user variables the live range
of the virtual register they become associated to all the inlined calls.

The next part of Live Debug Variables occurs during register allocation.  When
splitting a virtual register Live Debug Variables is called to split the
corresponding user values.  In the above program, each virtual register is the
location for 1500 user values.  This means when the register allocator splits a
virtual register we end up splitting 1500 user values in Live Debug Variables.

When splitting a user value the interval corresponding to the location is
replaced by two new intervals (and two new locations).  If one of the new
locations is subsequently split again the number of intervals will grow to
record all the live intervals.  In the above test program, each debug value is
split over a thousand times (roughly once per inlined function), resulting in
over a thousand intervals per value (each set of 1500 values are split
identically).

Finally, after register allocation, the Debug Variables pass uses the updated
data structures to emit new DBG_VALUEs with physical registers (the call to
emitDebugValues() actually occurs during the Virtual Register Rewriter).

The emitDebugValues() function iterates over the user values and emits a new
DBG_VALUE for each interval/location after mapping the  virtual registers to
physical (it also attempts to coalesce locations).  As we have approx.
1500*1500*5 intervals it is easy to see why over 6 million DBG_VALUEs are
inserted (after coalescing).

The large number of DBG_VALUEs are a consequence of the initial live interval
given to the user values.  As discussed above, each of the original virtual
registers are split repeatedly, until we end up with intervals corresponding
roughly to the inlined calls.  As the user value intervals cover most of the
basic-block, all these splits are recorded within each user value.  However,
the scope of each original DBG_VALUE only corresponded to a single inlined
call.  Most of the intervals within each user value are outside this scope, and
hence most of the inserted DEBUG_VALUEs are bogus.

To illustrate the above discussion we can see what happens with 4 inlined calls
to foobar (no splitting occurs with 2 calls, and even 4 only gives limited
splitting).

=======================================================
extern int foobar(int, int, int, int, int);

int F(int i1, int i2, int i3, int i4, int i5) {
  return foobar(i1, i2, i3, i4, i5);
}

int foo(int a, int b, int c, int d, int e) {
  return F(a,b,c,d,e) +
         F(a,b,c,d,e) +
         F(a,b,c,d,e) +
         F(a,b,c,d,e);
}
=======================================================

The IR on input to Live Debug Variables (ADJCALLSTACK removed for clarity):

0B  BB#0: derived from LLVM BB %entry
        Live Ins: %EDI %ESI %EDX %ECX %R8D
        DBG_VALUE %EDI, %noreg, !"a", <!17>; line no:7
        DBG_VALUE %ESI, %noreg, !"b", <!17>; line no:7
        DBG_VALUE %EDX, %noreg, !"c", <!17>; line no:7
        DBG_VALUE %ECX, %noreg, !"d", <!17>; line no:7
        DBG_VALUE %R8D, %noreg, !"e", <!17>; line no:7
16B     %vreg4<def> = COPY %R8D; GR32:%vreg4
        DBG_VALUE %vreg4, %noreg, !"e", <!17>; GR32:%vreg4 line no:7
32B     %vreg3<def> = COPY %ECX; GR32:%vreg3
        DBG_VALUE %vreg3, %noreg, !"d", <!17>; GR32:%vreg3 line no:7
48B     %vreg2<def> = COPY %EDX; GR32:%vreg2
        DBG_VALUE %vreg2, %noreg, !"c", <!17>; GR32:%vreg2 line no:7
64B     %vreg1<def> = COPY %ESI; GR32:%vreg1
        DBG_VALUE %vreg1, %noreg, !"b", <!17>; GR32:%vreg1 line no:7
80B     %vreg0<def> = COPY %EDI; GR32:%vreg0
        DBG_VALUE %vreg0, %noreg, !"a", <!17>; GR32:%vreg0 line no:7
        DBG_VALUE %vreg1, %noreg, !"i2", <!17>; GR32:%vreg1 line no:3 inlined @[
f10.c:8:10 ]
        DBG_VALUE %vreg1, %noreg, !"i2", <!17>; GR32:%vreg1 line no:3 inlined @[
f10.c:9:10 ]
        DBG_VALUE %vreg1, %noreg, !"i2", <!17>; GR32:%vreg1 line no:3 inlined @[
f10.c:10:10 ]
        DBG_VALUE %vreg1, %noreg, !"i2", <!17>; GR32:%vreg1 line no:3 inlined @[
f10.c:11:10 ]
        DBG_VALUE %vreg2, %noreg, !"i3", <!17>; GR32:%vreg2 line no:3 inlined @[
f10.c:8:10 ]
        DBG_VALUE %vreg0, %noreg, !"i1", <!17>; GR32:%vreg0 line no:3 inlined @[
f10.c:8:10 ]
        DBG_VALUE %vreg2, %noreg, !"i3", <!17>; GR32:%vreg2 line no:3 inlined @[
f10.c:9:10 ]
        DBG_VALUE %vreg2, %noreg, !"i3", <!17>; GR32:%vreg2 line no:3 inlined @[
f10.c:10:10 ]
        DBG_VALUE %vreg2, %noreg, !"i3", <!17>; GR32:%vreg2 line no:3 inlined @[
f10.c:11:10 ]
        DBG_VALUE %vreg3, %noreg, !"i4", <!17>; GR32:%vreg3 line no:3 inlined @[
f10.c:8:10 ]
        DBG_VALUE %vreg3, %noreg, !"i4", <!17>; GR32:%vreg3 line no:3 inlined @[
f10.c:9:10 ]
        DBG_VALUE %vreg3, %noreg, !"i4", <!17>; GR32:%vreg3 line no:3 inlined @[
f10.c:10:10 ]
        DBG_VALUE %vreg3, %noreg, !"i4", <!17>; GR32:%vreg3 line no:3 inlined @[
f10.c:11:10 ]
        DBG_VALUE %vreg4, %noreg, !"i5", <!17>; GR32:%vreg4 line no:3 inlined @[
f10.c:8:10 ]
        DBG_VALUE %vreg4, %noreg, !"i5", <!17>; GR32:%vreg4 line no:3 inlined @[
f10.c:9:10 ]
        DBG_VALUE %vreg4, %noreg, !"i5", <!17>; GR32:%vreg4 line no:3 inlined @[
f10.c:10:10 ]
        DBG_VALUE %vreg4, %noreg, !"i5", <!17>; GR32:%vreg4 line no:3 inlined @[
f10.c:11:10 ]
112B        %EDI<def> = COPY %vreg0; GR32:%vreg0 dbg:f10.c:4:10 @[ f10.c:8:10 ]
128B        %ESI<def> = COPY %vreg1; GR32:%vreg1 dbg:f10.c:4:10 @[ f10.c:8:10 ]
144B        %EDX<def> = COPY %vreg2; GR32:%vreg2 dbg:f10.c:4:10 @[ f10.c:8:10 ]
160B        %ECX<def> = COPY %vreg3; GR32:%vreg3 dbg:f10.c:4:10 @[ f10.c:8:10 ]
176B        %R8D<def> = COPY %vreg4; GR32:%vreg4 dbg:f10.c:4:10 @[ f10.c:8:10 ]
192B        CALL64pcrel32 <ga:@foobar>, ...
224B        %vreg5<def> = COPY %EAX<kill>; GR32:%vreg5 dbg:f10.c:4:10 @[ f10.c:8:10 ]
        DBG_VALUE %vreg0, %noreg, !"i1", <!17>; GR32:%vreg0 line no:3 inlined @[
f10.c:9:10 ]
256B        %EDI<def> = COPY %vreg0; GR32:%vreg0 dbg:f10.c:4:10 @[ f10.c:9:10 ]
272B        %ESI<def> = COPY %vreg1; GR32:%vreg1 dbg:f10.c:4:10 @[ f10.c:9:10 ]
288B        %EDX<def> = COPY %vreg2; GR32:%vreg2 dbg:f10.c:4:10 @[ f10.c:9:10 ]
304B        %ECX<def> = COPY %vreg3; GR32:%vreg3 dbg:f10.c:4:10 @[ f10.c:9:10 ]
320B        %R8D<def> = COPY %vreg4; GR32:%vreg4 dbg:f10.c:4:10 @[ f10.c:9:10 ]
336B        CALL64pcrel32 <ga:@foobar>, ...
368B        %vreg7<def> = COPY %EAX<kill>; GR32:%vreg7 dbg:f10.c:4:10 @[ f10.c:9:10 ]
400B        %vreg7<def,tied1> = ADD32rr %vreg7<tied0>, %vreg5, %EFLAGS<imp-def,dead>;
...
        DBG_VALUE %vreg0, %noreg, !"i1", <!17>; GR32:%vreg0 line no:3 inlined @[
f10.c:10:10 ]
432B        %EDI<def> = COPY %vreg0; GR32:%vreg0 dbg:f10.c:4:10 @[ f10.c:10:10 ]
448B        %ESI<def> = COPY %vreg1; GR32:%vreg1 dbg:f10.c:4:10 @[ f10.c:10:10 ]
464B        %EDX<def> = COPY %vreg2; GR32:%vreg2 dbg:f10.c:4:10 @[ f10.c:10:10 ]
480B        %ECX<def> = COPY %vreg3; GR32:%vreg3 dbg:f10.c:4:10 @[ f10.c:10:10 ]
496B        %R8D<def> = COPY %vreg4; GR32:%vreg4 dbg:f10.c:4:10 @[ f10.c:10:10 ]
512B        CALL64pcrel32 <ga:@foobar>, ...
544B        %vreg9<def> = COPY %EAX<kill>; GR32:%vreg9 dbg:f10.c:4:10 @[ f10.c:10:10 ]
576B        %vreg9<def,tied1> = ADD32rr %vreg9<tied0>, %vreg7, %EFLAGS<imp-def,dead>;
...
        DBG_VALUE %vreg0, %noreg, !"i1", <!17>; GR32:%vreg0 line no:3 inlined @[
f10.c:11:10 ]
608B        %EDI<def> = COPY %vreg0; GR32:%vreg0 dbg:f10.c:4:10 @[ f10.c:11:10 ]
624B        %ESI<def> = COPY %vreg1; GR32:%vreg1 dbg:f10.c:4:10 @[ f10.c:11:10 ]
640B        %EDX<def> = COPY %vreg2; GR32:%vreg2 dbg:f10.c:4:10 @[ f10.c:11:10 ]
656B        %ECX<def> = COPY %vreg3; GR32:%vreg3 dbg:f10.c:4:10 @[ f10.c:11:10 ]
672B        %R8D<def> = COPY %vreg4; GR32:%vreg4 dbg:f10.c:4:10 @[ f10.c:11:10 ]
688B        CALL64pcrel32 <ga:@foobar>, ...
720B        %vreg11<def> = COPY %EAX<kill>; GR32:%vreg11 dbg:f10.c:4:10 @[
f10.c:11:10 ]
752B        %vreg11<def,tied1> = ADD32rr %vreg11<tied0>, %vreg9, %EFLAGS<imp-
def,dead>; ...
768B        %EAX<def> = COPY %vreg11; GR32:%vreg11 dbg:f10.c:8:3
784B        RET 0, %EAX<kill>; dbg:f10.c:8:3

We can see 4 DBG_VALUEs per variable, each one associated with the same virtual
register (e.g.  i2 is held in %vreg1).  The user values computed by Live Debug
Variables:

!"a,7"   [0B;80r):0 [80r;608r):1 Loc0=%EDI Loc1=%vreg0
!"b,7"   [0B;64r):0 [64r;624r):1 Loc0=%ESI Loc1=%vreg1
!"c,7"   [0B;48r):0 [48r;640r):1 Loc0=%EDX Loc1=%vreg2
!"d,7"   [0B;32r):0 [32r;656r):1 Loc0=%ECX Loc1=%vreg3
!"e,7"   [0B;16r):0 [16r;672r):1 Loc0=%R8D Loc1=%vreg4
!"i2,3 @[f10.c:8:10]"    [80r;624r):0 Loc0=%vreg1
!"i2,3 @[f10.c:9:10]"    [80r;624r):0 Loc0=%vreg1
!"i2,3 @[f10.c:10:10]"   [80r;624r):0 Loc0=%vreg1
!"i2,3 @[f10.c:11:10]"   [80r;624r):0 Loc0=%vreg1
!"i3,3 @[f10.c:8:10]"    [80r;640r):0 Loc0=%vreg2
!"i1,3 @[f10.c:8:10]"    [80r;608r):0 Loc0=%vreg0
!"i3,3 @[f10.c:9:10]"    [80r;640r):0 Loc0=%vreg2
!"i3,3 @[f10.c:10:10]"   [80r;640r):0 Loc0=%vreg2
!"i3,3 @[f10.c:11:10]"   [80r;640r):0 Loc0=%vreg2
!"i4,3 @[f10.c:8:10]"    [80r;656r):0 Loc0=%vreg3
!"i4,3 @[f10.c:9:10]"    [80r;656r):0 Loc0=%vreg3
!"i4,3 @[f10.c:10:10]"   [80r;656r):0 Loc0=%vreg3
!"i4,3 @[f10.c:11:10]"   [80r;656r):0 Loc0=%vreg3
!"i5,3 @[f10.c:8:10]"    [80r;672r):0 Loc0=%vreg4
!"i5,3 @[f10.c:9:10]"    [80r;672r):0 Loc0=%vreg4
!"i5,3 @[f10.c:10:10]"   [80r;672r):0 Loc0=%vreg4
!"i5,3 @[f10.c:11:10]"   [80r;672r):0 Loc0=%vreg4
!"i1,3 @[f10.c:9:10]"    [224r;608r):0 Loc0=%vreg0
!"i1,3 @[f10.c:10:10]"   [400r;608r):0 Loc0=%vreg0
!"i1,3 @[f10.c:11:10]"   [576r;608r):0 Loc0=%vreg0

We can see most of the user values have an interval corresponding to the live
range of the location (e.g. i2, starts at offset 80 after the copies and ends
at the last use of %vreg1 at offset 624).

An example of splitting during register allocation:

selectOrSplit GR32:%vreg4 [16r,672r:0)  0@16r w=4.782197e-03
tryLocalSplit:  16r 176r 320r 496r 672r
Best local split range: 16r-320r, 5.444336e-03, 3 instrs

Splitting of the corresponding user values within Live Debug Variables:

Splitting Loc0  !"i5,3 @[f10.c:8:10]"    [80r;672r):0 Loc0=%vreg4
Split result:   !"i5,3 @[f10.c:8:10]"    [80r;328r):1 [328r;672r):0 Loc0=%vreg12
Loc1=%vreg13
Splitting Loc0  !"i5,3 @[f10.c:11:10]"   [80r;672r):0 Loc0=%vreg4
Split result:   !"i5,3 @[f10.c:11:10]"   [80r;328r):1 [328r;672r):0 Loc0=%vreg12
Loc1=%vreg13
Splitting Loc0  !"i5,3 @[f10.c:10:10]"   [80r;672r):0 Loc0=%vreg4
Split result:   !"i5,3 @[f10.c:10:10]"   [80r;328r):1 [328r;672r):0 Loc0=%vreg12
Loc1=%vreg13
Splitting Loc0  !"i5,3 @[f10.c:9:10]"    [80r;672r):0 Loc0=%vreg4
Split result:   !"i5,3 @[f10.c:9:10]"    [80r;328r):1 [328r;672r):0 Loc0=%vreg12
Loc1=%vreg13

The user values after register allocation (with virtual registers mapped to
physical):

!"a,7"   [0B;80r):0 [80r;608r):1 Loc0=%EDI Loc1=%R12D
!"b,7"   [0B;64r):0 [64r;624r):1 Loc0=%ESI Loc1=%R15D
!"c,7"   [0B;48r):0 [48r;472r):1 [472r;640r):2 Loc0=%EDX Loc1=%EBP Loc2=<fi#1>
!"d,7"   [0B;32r):0 [32r;312r):1 [312r;476r):2 [476r;656r):3 Loc0=%ECX
Loc1=%EBX Loc2=<fi#2> Loc3=%EBP
!"e,7"   [0B;16r):0 [16r;328r):1 [328r;672r):2 Loc0=%R8D Loc1=%R14D Loc2=%EBX
!"i2,3 @[f10.c:8:10]"    [80r;624r):0 Loc0=%R15D
!"i2,3 @[f10.c:9:10]"    [80r;624r):0 Loc0=%R15D
!"i2,3 @[f10.c:10:10]"   [80r;624r):0 Loc0=%R15D
!"i2,3 @[f10.c:11:10]"   [80r;624r):0 Loc0=%R15D
!"i3,3 @[f10.c:8:10]"    [80r;472r):0 [472r;640r):1 Loc0=%EBP Loc1=<fi#1>
!"i1,3 @[f10.c:8:10]"    [80r;608r):0 Loc0=%R12D
!"i3,3 @[f10.c:9:10]"    [80r;472r):0 [472r;640r):1 Loc0=%EBP Loc1=<fi#1>
!"i3,3 @[f10.c:10:10]"   [80r;472r):0 [472r;640r):1 Loc0=%EBP Loc1=<fi#1>
!"i3,3 @[f10.c:11:10]"   [80r;472r):0 [472r;640r):1 Loc0=%EBP Loc1=<fi#1>
!"i4,3 @[f10.c:8:10]"    [80r;312r):0 [312r;476r):1 [476r;656r):2 Loc0=%EBX
Loc1=<fi#2> Loc2=%EBP
!"i4,3 @[f10.c:9:10]"    [80r;312r):0 [312r;476r):1 [476r;656r):2 Loc0=%EBX
Loc1=<fi#2> Loc2=%EBP
!"i4,3 @[f10.c:10:10]"   [80r;312r):0 [312r;476r):1 [476r;656r):2 Loc0=%EBX
Loc1=<fi#2> Loc2=%EBP
!"i4,3 @[f10.c:11:10]"   [80r;312r):0 [312r;476r):1 [476r;656r):2 Loc0=%EBX
Loc1=<fi#2> Loc2=%EBP
!"i5,3 @[f10.c:8:10]"    [80r;328r):0 [328r;672r):1 Loc0=%R14D Loc1=%EBX
!"i5,3 @[f10.c:9:10]"    [80r;328r):0 [328r;672r):1 Loc0=%R14D Loc1=%EBX
!"i5,3 @[f10.c:10:10]"   [80r;328r):0 [328r;672r):1 Loc0=%R14D Loc1=%EBX
!"i5,3 @[f10.c:11:10]"   [80r;328r):0 [328r;672r):1 Loc0=%R14D Loc1=%EBX
!"i1,3 @[f10.c:9:10]"    [224r;608r):0 Loc0=%R12D
!"i1,3 @[f10.c:10:10]"   [400r;608r):0 Loc0=%R12D
!"i1,3 @[f10.c:11:10]"   [576r;608r):0 Loc0=%R12D

IR dump after Live Debug Variables has inserted new DBG_VALUEs (notice the
bogus values):

0B  BB#0: derived from LLVM BB %entry
        Live Ins: %ECX %EDI %EDX %ESI %R8D
        DBG_VALUE %EDI, %noreg, !"a", <!17>; line no:7
        DBG_VALUE %ESI, %noreg, !"b", <!17>; line no:7
        DBG_VALUE %EDX, %noreg, !"c", <!17>; line no:7
        DBG_VALUE %ECX, %noreg, !"d", <!17>; line no:7
        DBG_VALUE %R8D, %noreg, !"e", <!17>; line no:7
16B     %R14D<def> = COPY %R8D
        DBG_VALUE %R14D, %noreg, !"e", <!17>; line no:7
32B     %EBX<def> = COPY %ECX
        DBG_VALUE %EBX, %noreg, !"d", <!17>; line no:7
40B     MOV32mr <fi#2>, 1, %noreg, 0, %noreg, %EBX; mem:ST4[FixedStack2]
48B     %EBP<def> = COPY %EDX
        DBG_VALUE %EBP, %noreg, !"c", <!17>; line no:7
64B     %R15D<def> = COPY %ESI
        DBG_VALUE %R15D, %noreg, !"b", <!17>; line no:7
80B     %R12D<def> = COPY %EDI
        DBG_VALUE %R14D, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE %R14D, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE %R14D, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE %R14D, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE %EBX, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %EBP, %noreg, !"i3", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE %EBP, %noreg, !"i3", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE %EBP, %noreg, !"i3", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE %R12D, %noreg, !"i1", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %EBP, %noreg, !"i3", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %R15D, %noreg, !"i2", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE %R15D, %noreg, !"i2", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE %R15D, %noreg, !"i2", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE %R15D, %noreg, !"i2", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %R12D, %noreg, !"a", <!17>; line no:7
112B        %EDI<def> = COPY %R12D; dbg:f10.c:4:10 @[ f10.c:8:10 ]
128B        %ESI<def> = COPY %R15D; dbg:f10.c:4:10 @[ f10.c:8:10 ]
144B        %EDX<def> = COPY %EBP; dbg:f10.c:4:10 @[ f10.c:8:10 ]
160B        %ECX<def> = COPY %EBX; dbg:f10.c:4:10 @[ f10.c:8:10 ]
176B        %R8D<def> = COPY %R14D; dbg:f10.c:4:10 @[ f10.c:8:10 ]
192B        CALL64pcrel32 <ga:@foobar>, …
224B        MOV32mr <fi#0>, 1, %noreg, 0, %noreg, %EAX; mem:ST4[FixedStack0]
dbg:f10.c:4:10 @[ f10.c:8:10 ]
        DBG_VALUE %R12D, %noreg, !"i1", <!17>; line no:3 inlined @[ f10.c:9:10 ]
256B        %EDI<def> = COPY %R12D; dbg:f10.c:4:10 @[ f10.c:9:10 ]
272B        %ESI<def> = COPY %R15D; dbg:f10.c:4:10 @[ f10.c:9:10 ]
288B        %EDX<def> = COPY %EBP; dbg:f10.c:4:10 @[ f10.c:9:10 ]
300B        MOV32mr <fi#1>, 1, %noreg, 0, %noreg, %EBP; mem:ST4[FixedStack1]
dbg:f10.c:4:10 @[ f10.c:9:10 ]
304B        %ECX<def> = COPY %EBX<kill>; dbg:f10.c:4:10 @[ f10.c:9:10 ]
        DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE <fi#2>, 0, !"i4", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE <fi#2>, 0, !"d", <!17>; line no:7
320B        %R8D<def> = COPY %R14D; dbg:f10.c:4:10 @[ f10.c:9:10 ]
328B        %EBX<def> = COPY %R14D<kill>
        DBG_VALUE %EBX, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE %EBX, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE %EBX, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE %EBX, %noreg, !"i5", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %EBX, %noreg, !"e", <!17>; line no:7
336B        CALL64pcrel32 <ga:@foobar>, …
368B        %R13D<def> = COPY %EAX; dbg:f10.c:4:10 @[ f10.c:9:10 ]
400B        %R13D<def,tied1> = ADD32rm %R13D<kill,tied0>, <fi#0>, 1, %noreg, 0,
%noreg, %EFLAGS<imp-def,dead>
        DBG_VALUE %R12D, %noreg, !"i1", <!17>; line no:3 inlined @[ f10.c:10:10 ]
432B        %EDI<def> = COPY %R12D; dbg:f10.c:4:10 @[ f10.c:10:10 ]
448B        %ESI<def> = COPY %R15D; dbg:f10.c:4:10 @[ f10.c:10:10 ]
464B        %EDX<def> = COPY %EBP<kill>; dbg:f10.c:4:10 @[ f10.c:10:10 ]
        DBG_VALUE <fi#1>, 0, !"i3", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE <fi#1>, 0, !"i3", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE <fi#1>, 0, !"i3", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE <fi#1>, 0, !"i3", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE <fi#1>, 0, !"c", <!17>; line no:7
476B        %EBP<def> = MOV32rm <fi#2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack2]
        DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:11:10 ]
        DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:10:10 ]
        DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:9:10 ]
        DBG_VALUE %EBP, %noreg, !"i4", <!17>; line no:3 inlined @[ f10.c:8:10 ]
        DBG_VALUE %EBP, %noreg, !"d", <!17>; line no:7
480B        %ECX<def> = COPY %EBP; dbg:f10.c:4:10 @[ f10.c:10:10 ]
496B        %R8D<def> = COPY %EBX; dbg:f10.c:4:10 @[ f10.c:10:10 ]
512B        CALL64pcrel32 <ga:@foobar>, …
544B        %R14D<def> = COPY %EAX; dbg:f10.c:4:10 @[ f10.c:10:10 ]
576B        %R14D<def,tied1> = ADD32rr %R14D<kill,tied0>, %R13D<kill>, %EFLAGS<imp-
def,dead>; dbg:f10.c:9:23
        DBG_VALUE %R12D, %noreg, !"i1", <!17>; line no:3 inlined @[ f10.c:11:10 ]
608B        %EDI<def> = COPY %R12D<kill>; dbg:f10.c:4:10 @[ f10.c:11:10 ]
624B        %ESI<def> = COPY %R15D<kill>; dbg:f10.c:4:10 @[ f10.c:11:10 ]
640B        %EDX<def> = MOV32rm <fi#1>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack1]
dbg:f10.c:4:10 @[ f10.c:11:10 ]
656B        %ECX<def> = COPY %EBP<kill>; dbg:f10.c:4:10 @[ f10.c:11:10 ]
672B        %R8D<def> = COPY %EBX<kill>; dbg:f10.c:4:10 @[ f10.c:11:10 ]
688B        CALL64pcrel32 <ga:@foobar>, …
52B     %EAX<def,tied1> = ADD32rr %EAX<kill,tied0>, %R14D<kill>, %EFLAGS<imp-
def,dead>; dbg:f10.c:10:23
784B        RET 0, %EAX; dbg:f10.c:8:3

The compile time explosion (and the large number of DBG_VALUEs) does not occur
if the user value live intervals are constrained by placing them in different
basic-blocks.  For example the following program puts the 1500 inlined calls to
foobar into 10 basic-blocks (150 x 10):

=============================== foo2.c =======================================
extern int foobar(int, int, int, int, int);

int F(int i1, int i2, int i3, int i4, int i5) {
  return foobar(i1, i2, i3, i4, i5);
}

#define L3(a, b, c, d, e) \
  F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + \
  F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + \
  F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e) + F(a,b,c,d,e)

#define L2(a, b, c, d, e) \
  L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + \
  L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e) + L3(a,b,c,d,e)

#define L1(a, b, c, d, e) ({ \
  int s = L2(a,b,c,d,e); \
  if(s&1) s += L2(a,b,c,d,e); if(s&1) s += L2(a,b,c,d,e); if(s&1) s += L2(a,b,c,d,e); \
  if(s&1) s += L2(a,b,c,d,e); if(s&1) s += L2(a,b,c,d,e); if(s&1) s += L2(a,b,c,d,e); \
  if(s&1) s += L2(a,b,c,d,e); if(s&1) s += L2(a,b,c,d,e); if(s&1) s += L2(a,b,c,d,e); \
  s;})

int foo(int a, int b, int c, int d, int e) {
    return L1(a,b,c,d,e);
}
===============================================================================

$ time clang foo2.c -O2 -c -g -mllvm –stats 2>&1 | grep livedebug

  74433 livedebugvars         - Number of DBG_VALUEs inserted

real    0m21.490s
user    0m21.342s
sys 0m0.036s
Quuxplusone commented 7 years ago

Although the test case in comment 1 is rather contrived, this issue was seen in "real world" code, where many almost identical constructors were inlined into the global init.

Quuxplusone commented 7 years ago
Quuxplusone commented 7 years ago

And Quentin, for the register allocation bits. This seems to be a very nasty regression.

Quuxplusone commented 7 years ago
(In reply to Davide Italiano from comment #3)
> And Quentin, for the register allocation bits. This seems to be a very nasty
> regression.

As far as I can tell, the register allocator does what it had been asked to and
in particular, this does not sound like a regression. It sounds to me that the
expensive part is the update of the live-ranges in Live Debug Variable.
Attaching (and thus updating) 1500 variables to one vreg sounds like a recipe
for trouble, but I am out of my domain of expertise here.
Quuxplusone commented 7 years ago

I have a patch using LexicalScopes that looks promising. Hopefully put up a review in a day or two.