Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

[DebugInfo@O2] MachineSink can unsoundly extend variable location ranges #43087

Open Quuxplusone opened 4 years ago

Quuxplusone commented 4 years ago
Bugzilla Link PR44117
Status NEW
Importance P normal
Reported by Jeremy Morse (jeremy.morse.llvm@gmail.com)
Reported on 2019-11-22 07:25:46 -0800
Last modified on 2019-12-13 07:57:56 -0800
Version trunk
Hardware PC Linux
CC a.v.lapshin@mail.ru, aprantl@apple.com, chackz0x12@gmail.com, greg.bedwell@sony.com, llvm-bugs@lists.llvm.org, orlando.hyams@sony.com, paul.robinson@am.sony.com, stephen.tozer@sony.com, vsk@apple.com
Fixed by commit(s)
Attachments
Blocks PR38768
Blocked by
See also
This is a bug report to document an edge case to do with the machine-sink pass
that I don't think can be easily solved.

Here's a highly contrived reproducer, that when compiled with trunk "-O2 -g -c -
fno-unroll-loops" will sink the computation of "a & 0xFFFF" into the final
block (where there's the assign to global). It also sinks the (salvaged)
DBG_VALUE for the first value of "badgers" too.

--------8<--------
int global, global2;

int
foo(int a, int b)
{
  int floogie = a & 0xFFFF;
  int badgers = floogie + 12;

  if (a == 1234567) {
    badgers = global2; // body uninteresting, but "badgers" reassigned
    badgers ^= a;
    global2 = badgers + 1;
    if (b == 12)
      return global;
  }

  global = floogie;
  return global;
}
-------->8--------

Normally, in the end block, we would not be able to compute a location for
"badgers", because we don't know which side of the "a == 1234567" condition was
taken. The location would be empty / optimised out.

However, because the DBG_VALUE for "badgers" sinks into that end block, it
specifies the variable location as being "floogie+12", regardless of which side
of the condition was taken, which is not a true representation of the original
program.

This is actually really hard to solve with our current model. If there were no
further assignments to "badgers" on any path from the source to destination
block, then the DBG_VALUE sinking would be absolutely fine and desirable.
However, discovering whether this is true or not involves examining every block
that _might_ be on a path from the source to the destination position, which
AFAIUI is expensive. Machine sinking doesn't currently do this level of
analysis, so I haven't tried to fix it yet.

This technically applies to any pass that does any kind of sinking. Instcombine
will only sink where there isn't any control flow present though, so this isn't
a problem inscombine currently demonstrates, I think.

Time for Jeremy's pet peeve: in a more ideal world, one where the machine-
location and the instruction-location were separate, we could record an
assignment / location-change in the first block of the program, and the machine-
location in the last block, and leave it to a debug post-processor to work
these things out, when we actually do a full dataflow analysis.
Quuxplusone commented 4 years ago
Hi Jeremy, I have a question about that example.

Current behavior:

# *** IR Dump After Machine Common Subexpression Elimination ***:

  %3:gr32 = COPY $edi
  DBG_VALUE %3:gr32, $noreg, !"a", !DIExpression(), debug-location !21; test.c:0 line no:5
  %5:gr16 = COPY %3.sub_16bit:gr32, debug-location !22; test.c:7:19
  %0:gr32 = MOVZX32rr16 killed %5:gr16, debug-location !22; test.c:7:19
  DBG_VALUE %0:gr32, $noreg, !"floogie", !DIExpression(), debug-location !21; test.c:0 line no:7
  DBG_VALUE %0:gr32, $noreg, !"badgers", !DIExpression(DW_OP_plus_uconst, 12, DW_OP_stack_value), debug-locatio

# *** IR Dump After Machine code sinking ***:

bb.0.entry:
  successors: %bb.1(0x40000000), %bb.3(0x40000000); %bb.1(50.00%), %bb.3(50.00%)
  liveins: $edi, $esi
  %3:gr32 = COPY $edi
  DBG_VALUE %3:gr32, $noreg, !"a", !DIExpression(), debug-location !21; test.c:0 line no:5
  ^^^^^ there is no DBG_VALUE for badgers

bb.3.if.end4:
; predecessors: %bb.0, %bb.1
  successors: %bb.4(0x80000000); %bb.4(100.00%)

  %5:gr16 = COPY %3.sub_16bit:gr32, debug-location !21; test.c:0
  %0:gr32 = MOVZX32rr16 %5:gr16, debug-location !21; test.c:0
  DBG_VALUE %0:gr32, $noreg, !"floogie", !DIExpression(), debug-location !21; test.c:0 line no:7
  DBG_VALUE %0:gr32, $noreg, !"badgers", !DIExpression(DW_OP_plus_uconst, 12, DW_OP_stack_value), debug-location !21; test.c:0 line no:8
  ^^^^ incorrect DBG_VALUE for badgers

In this example Machine sinking pass moved all DBG_VALUEs related to %0:gr32
value together with the real instructions COPY, MOVZX32rr16.

What if Machine sinking pass would move only single DBG_VALUE which directly
relates to moved instructions ?
And all other DBG_VALUEs would be left in their original places though with
salvaged values:

# *** IR Dump After Machine code sinking ***:

bb.0.entry:
  successors: %bb.1(0x40000000), %bb.3(0x40000000); %bb.1(50.00%), %bb.3(50.00%)
  liveins: $edi, $esi
  %3:gr32 = COPY $edi
  DBG_VALUE %3:gr32, $noreg, !"a", !DIExpression(), debug-location !21; test.c:0 line no:5
  DBG_VALUE %3:gr32, $noreg, !"badgers", !DIExpression(DW_OP_plus_uconst, 12, DW_OP_convert, DW_ATE_unsigned_32, DW_OP_convert, DW_ATE_unsigned_16, DW_OP_stack_value), debug-location !21; test.c:0 line no:8

bb.3.if.end4:
; predecessors: %bb.0, %bb.1
  successors: %bb.4(0x80000000); %bb.4(100.00%)

  %5:gr16 = COPY %3.sub_16bit:gr32, debug-location !21; test.c:0
  %0:gr32 = MOVZX32rr16 %5:gr16, debug-location !21; test.c:0
  DBG_VALUE %0:gr32, $noreg, !"floogie", !DIExpression(), debug-location !21; test.c:0 line no:7

In that case there would be proper debug value at the place where "badgers"
defined in original code.
And also there would not be incorrect value in bb.3.if.end4

what do you think ?
Quuxplusone commented 4 years ago
Hi Alexey,

Alexey wrote:
>   ^^^^^ there is no DBG_VALUE for badgers

Good spot -- this is actually what D58238 [0] was about, however it got
reverted due to a performance regression tracked in [1]. You're absolutely
right that there should be some kind of location there. The fix of [0,1] is to
add an undef/$noreg to terminate any earlier location.

> What if Machine sinking pass would move only single DBG_VALUE which directly
> relates to moved instructions ? And all other DBG_VALUEs would be left in
> their original places though with salvaged values:
[...]
> !DIExpression(DW_OP_plus_uconst, 12, DW_OP_convert, DW_ATE_unsigned_32,
>  DW_OP_convert, DW_ATE_unsigned_16, DW_OP_stack_value)
[...]
> In that case there would be proper debug value at the place where "badgers"
> defined in original code. And also there would not be incorrect value in
> bb.3.if.end4

Just to confirm I understand what you're saying: you've added DW_OP_convert to
the expression there, meaning we should recover the effect of the sunk
MOVZX32rr16 and encode it in the DIExpression, yes?

That would definitely be desirable, and that's what happens when something gets
moved / deleted in LLVM-IR. However, after instruction selection this becomes
much more difficult as there are literally thousands of machine instructions,
many of which have effects that can't be modelled, and which have many
different forms generated from templates. It's too much of a burden (on
software engineers) to encode all of this information to happen after isel.

IMO, there are other ways we could recover this information, such as re-
analysing the LLVM-IR when a location goes missing in the MachineFunction, but
that's out of scope for this ticket :)

[0] https://reviews.llvm.org/D58238
[1] https://bugs.llvm.org/show_bug.cgi?id=43855
Quuxplusone commented 4 years ago
thank you,

>Just to confirm I understand what you're saying:
>you've added DW_OP_convert to the expression there,
>meaning we should recover the effect of the sunk
>MOVZX32rr16 and encode it in the DIExpression, yes?

correct. Though this is the second point. I also agree with following :

>That would definitely be desirable, and that's what happens
>when something gets moved / deleted in LLVM-IR. However, after
>instruction selection this becomes much more difficult as
>there are literally thousands of machine instructions,
>many of which have effects that can't be modelled, and which
>have many different forms generated from templates. It's too much
>of a burden (on software engineers) to encode all of this
>information to happen after isel.

But, my suggestion is not about creating SalvageValue function based on MIR.
That would be nice to have, as well as other heuristics which would allow to
calculate proper value(like re-analysing the LLVM-IR when a location goes
missing in the MachineFunction).

My main point is that we probably should not sink any DBG_VALUE except
very first one. As far as I understand [0] and [1] would move/clone not only
first DBG_VALUE but in some cases other DBG_VALUE related to sunk value.
I think, when MachineSinking pass see following code :

  DBG_VALUE %3:gr32, $noreg, !"a", !DIExpression(), debug-location !21; test.c:0 line no:5
  %5:gr16 = COPY %3.sub_16bit:gr32, debug-location !22; test.c:7:19
  %0:gr32 = MOVZX32rr16 killed %5:gr16, debug-location !22; test.c:7:19
  DBG_VALUE %0:gr32, $noreg, !"floogie", !DIExpression(), debug-location !21; test.c:0 line no:7
  DBG_VALUE %0:gr32, $noreg, !"badgers", !DIExpression(DW_OP_plus_uconst, 12, DW_OP_stack_value), debug-locatio

It should move only first DBG_VALUE("floogie") and leave second
value("badgers") in place. Additionally, if it can salvage "badgers" then it
would change its expression accordingly. Otherwise, it would set "badgers" to
undef.

The reason for this is that only the first DBG_VALUE relates to moved
value(%0:gr32). Other DBG_VALUEs relate to different values that were optimized
out earlier and salvaged on the sunk value(%0:gr32) basis. I think it would be
generally incorrect to move such connected values. Machine sinking pass knows
that it is safe to move value for "floogie". But whether "badgers" could be
sunk or not - is unknown. The problem described in current PR[2] exactly such a
case - there was sunk DBG_VALUE for "badgers" while it could not be sunk
because of changes done on another control flow path.

Do you think that solution could work: "When sinking instruction, move only
first corresponding DBG_VALUE, leave all others in place. For all left
DBG_BALUES - either salvage value either set it to undef"?

I think that there should not be performance regression in that case since
DBG_VALUE would not be cloned. And there would not be incorrectly reported
DBG_VALUE.

[0] https://reviews.llvm.org/D58238
[1] https://reviews.llvm.org/D70672
[2] https://bugs.llvm.org/show_bug.cgi?id=44117
Quuxplusone commented 4 years ago
> My main point is that we probably should not sink any DBG_VALUE except
> very first one.

Ah, I missed that sorry. I don't think it's a direction we should pursue: we
don't currently track whether or not a DBG_VALUE refers to something that got
optimised or not, there are code sequences that can be salvaged / common-
subexpression-eliminated without changing the DIExpression, and would look like
an un-touched DBG_VALUE.

Plus, if there weren't the other interfering assignment to "badgers", I think
we _would_ want to sink all the DBG_VALUEs. In that case, the sinking
optimisation is only shortening the range that the location is defined over.

> I think it would be generally incorrect to move such connected values.
[...]
> Do you think that solution could work: "When sinking instruction, move
> only first corresponding DBG_VALUE, leave all others in place. For all
> left DBG_BALUES - either salvage value either set it to undef"?

I think this hinges on being able to distinguish between the "original"
assignment and "connected values" -- unfortunately I don't think we can safely
recover that information this far into compilation, variable locations can have
been hoisted / CSEd / sunk to such an extent that _no_ dbg.value reflects the
"original" assignment.

~

In addition, I think my example might be slightly misleading, because the
salvaging of the first assignment to "badgers" isn't really necessary for the
fault I'm describing, it's just part of the test I had to hand. The key
requirement is that there are two "dead" variable assignments in different
blocks. Because they're dead, they don't receive a PHI instruction that would
get its own dbg.value. We then enter this situation where sinking a DBG_VALUE
would change the instructions that it dominates, and where it's expensive to
identify when this happens.

(Note that the original source doesn't necessarily need to have "dead"
assignments -- various optimisations like dead store elimination might make a
use dead along the way).
Quuxplusone commented 4 years ago
>Ah, I missed that sorry. I don't think it's a direction we should pursue: we
>don't currently track whether or not a DBG_VALUE refers to something that got
> optimised or not, there are code sequences that can be salvaged / common-
>subexpression-eliminated without changing the DIExpression, and would look
>like an un-touched DBG_VALUE.

Agreed that using DIExpresion as the marker, whether DBG_VALUE refers to
something that got optimized or not, is not suitable. Though this probably does
not make the idea of separation DBG_VALUE useless. When sinking instruction, we
already _know_ that it is safe to sink value created by this instruction, and
we _do_not_know_ whether it is safe to sink all other connected values. Thus it
would probably be useful to understand which value directly defined by
instruction and which is not so that it would be possible to handle them
differently.

Instead of analyzing DIExpresion, there could probably be enforced the rule
that the first DBG_VALUE is that which describes value defined by the
instruction. That seems logical since if other DBG_VALUES use the value created
by that instruction, they could not appear before the value is created.

Probably we could use another way of separating values if this one does not
work(adding link to DBG_VALUE into the instruction?)...

>Plus, if there weren't the other interfering assignment to "badgers", I think
> we _would_ want to sink all the DBG_VALUEs. In that case, the sinking
> optimisation is only shortening the range that the location is defined over.

Right, but it is not known whether the other interfering assignment exists.
Also, I think that if we could salvage value then we do not wish to sink all
DBG_VALUES. for example :

x = 10
...
x++
printf("hello");
x--;

mov r1, 10                          // x = 10
dbg.value r1, "x", DIExpression()
mov r2, r1
dbg.value r2, "x", DIExpression()
....
dbg.value r2, "x", DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value)  // x++
....
printf("hello")
dbg.value r2, "x", DIExpression() // x--

let`s assume that "mov r2, r1" should be sinked:

-----
mov r1, 10          x = 10
dbg.value r1, "x", DIExpression()
....
dbg.value r2, "x", DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value)  // x++
^^^^^^^^^ it is better to not sink it down
^^^^^^^^^ but make it "dbg.value r1, x, DIExpression(DW_OP_plus_uconst, 1,
DW_OP_stack_value)"
....
printf("hello")
dbg.value r2, "x", DIExpression() // x--
^^^^^^^^^ it is better to not sink it down
^^^^^^^^^ but make it "dbg.value r1, x, DIExpression()"
....
mov r2, r1
dbg.value r2, "x", DIExpression()

Though we sank copying x value from r1 to r2, it is better to see ++ and -- at
their original places still.
So that, when we stop debugger at "printf" we see x++ value; (for the example
from this PR it would mean that we would see the value of "badgers" closer to
its source code location)

thus we want to sink DBG_VALUE only if :

1. we could not salvage it(in above example it is when we could not replace
dbg.value r2, "x" with dbg.value r1, "x")

2. it is last value for that variable. i.e. instead of sinking all three
DBG_VALUE:

dbg.value r2, "x", DIExpression()
dbg.value r2, "x", DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value)  // x++
dbg.value r2, "x", DIExpression() // x--

sink only DBG_VALUE related to last effective value for variable "x":
dbg.value r2, "x", DIExpression()

3. we prove that after sinking order of assignments was not changed.

   That thing would generally require to perform full dataflow analysis for debug values. Which could be expensive. Instead of full dataflow analysis we could probably try to use chipper alternatives:
   a) do not do that analysis for original assignment value(safeness is already proved by optimization).
   b) use simple fast heuristic for connected values, like instead of checking data flow just check that there is no other DBG_VALUEs for concrete variable.(that would work for example from this PR)
   c) if not #a and #b - then do not sink value and put undef into it.

>I think this hinges on being able to distinguish between the "original"
> assignment and "connected values" -- unfortunately I don't think we can safely
>recover that information this far into compilation, variable locations can have
>been hoisted / CSEd / sunk to such an extent that _no_ dbg.value reflects the
>"original" assignment.

if we would take the rule that first DBG_VALUE shows original assignment then
we could probably require DBG_VALUE undef to be first in that case.

>In addition, I think my example might be slightly misleading, because the
>salvaging of the first assignment to "badgers" isn't really necessary for the
>fault I'm describing, it's just part of the test I had to hand. The key
>requirement is that there are two "dead" variable assignments in different
>blocks. Because they're dead, they don't receive a PHI instruction that would
>get its own dbg.value. We then enter this situation where sinking a DBG_VALUE
>would change the instructions that it dominates, and where it's expensive to
>identify when this happens.

>(Note that the original source doesn't necessarily need to have "dead"
>assignments -- various optimisations like dead store elimination might make a
>use dead along the way).

yeah. it is the example when we need to have expensive analysis to understand
how DBG_VALUE should be handled. It seems that separating values to "original
assinment" and "connected values" could probably help here.
if "dead variable assignment" is "original assignment" - then we could safely
sink DBG_VALUE with the instruction.
if "dead variable assignment" is a "connected value" - then we leave it at
original place and either salvage it either set it to undef.
In the result there would not be wrong value propagated.