llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.86k stars 11.91k forks source link

DW_AT_call_value location may be clobbered by callee #42688

Open vedantk opened 5 years ago

vedantk commented 5 years ago
Bugzilla Link 43343
Version trunk
OS All
CC @adrian-prantl,@dstenb,@dwblaikie,@fredriss,@JDevlieghere,@jmorse,@walkerkd,@pogo59,@vedantk

Extended Description

In https://reviews.llvm.org/D67556, David Stenberg gives an example of how the location of a parameter in a caller can escape and be clobbered by the callee. In this case, the debugger may print a misleading description for the parameter when stopped in the callee.

The setup is roughly:

void caller() {
  int local = 0;
  escape(&local);
  callee(local);
}

void callee(int p) {
  cause_arg1_to_be_optimized_out();
  *escaped_ptr_to_local = 1;
  // print 'p' in the debugger: will show '1' instead of '0'.
}
vedantk commented 4 years ago

Thanks! I've addressed that in ba71ca37.

009b8ea0-6754-41bc-becf-d830f82bdd65 commented 4 years ago

When standing on bfbbf0aba81a84da8b53d4d159d080e77ad8ee70 I encountered a case where this produces incorrect call site values.

$ cat foo.c
#include <stdint.h>

uint32_t a;

extern uint32_t b(uint64_t);
extern uint32_t f(void);
extern uint32_t i(uint32_t *);
extern void h(uint32_t, uint32_t);
extern void j(uint32_t, uint32_t);

int main() {
  uint32_t d;
  uint32_t e = 0, k = 0, l = 0;
  d = f();
  uint32_t g[1];
  for (;; e = b(e), k = b(k), l = b(l))
    for (; a;) {
      h(a + 1, a & i(g));
      j(1 ^ d, d);
    }
  return 0;
}

$ cat helpers.c
#include <stdint.h>

uint64_t side_effect;
uint32_t b(uint64_t val) {
  side_effect = val;
  __asm volatile("" : : : "%rdi");
  return 123;
}

uint32_t f(void) { return 123; }
uint32_t i(uint32_t *p) { return 456; }

void h(uint32_t a, uint32_t b) {}
void j(uint32_t a, uint32_t b) {}

Compiled using:

clang -fno-inline -g -Os -g -Xclang -femit-debug-entry-values foo.c helpers.c

gives call site entries for two of the three calls to b() where the parameter is described using a memory location:

0x00000171:     DW_TAG_GNU_call_site
                DW_AT_abstract_origin   (0x00000076 "b")
                DW_AT_low_pc    (0x0000000000000069)

0x0000017e:       DW_TAG_GNU_call_site_parameter
                  DW_AT_location    (DW_OP_reg5 RDI)
                  DW_AT_GNU_call_site_value (DW_OP_breg13 R13+0)

0x00000184:       NULL

0x00000185:     DW_TAG_GNU_call_site
                DW_AT_abstract_origin   (0x00000076 "b")
                DW_AT_low_pc    (0x0000000000000075)

0x00000192:       DW_TAG_GNU_call_site_parameter
                  DW_AT_location    (DW_OP_reg5 RDI)
                  DW_AT_GNU_call_site_value (DW_OP_fbreg +16, DW_OP_deref)

0x00000199:       NULL

0x0000019a:     DW_TAG_GNU_call_site
                DW_AT_abstract_origin   (0x00000076 "b")
                DW_AT_low_pc    (0x0000000000000081)

0x000001a7:       DW_TAG_GNU_call_site_parameter
                  DW_AT_location    (DW_OP_reg5 RDI)
                  DW_AT_GNU_call_site_value (DW_OP_fbreg +12, DW_OP_deref)

When printing the parameter using the entry value, it is printed incorrectly in the calls that are described using the memory location:

$ echo -e 'b helpers.c:7\nrun\nc\nc\nc\nc\nc' | gdb a.out | grep Breakpoint
(gdb) Breakpoint 1 at 0x20116f: file helpers.c, line 7.
Breakpoint 1, b (val=val@entry=0) at helpers.c:7
Breakpoint 1, b (val=val@entry=140733193388032) at helpers.c:7
Breakpoint 1, b (val=val@entry=0) at helpers.c:7
Breakpoint 1, b (val=val@entry=123) at helpers.c:7
Breakpoint 1, b (val=val@entry=140733193388155) at helpers.c:7
Breakpoint 1, b (val=val@entry=528280977531) at helpers.c:7

I think that is due to 32 bits being read in the program:

      movl    16(%rsp), %edi          # 4-byte Reload
      .loc    1 16 25                 # foo.c:16:25
      callq   b

but a DW_OP_deref is used for the call site, which reads 64 bits on x86-64 (the size of a pointer), so the higher 32 bits are filled with garbage stack data. Should perhaps a DW_OP_deref_size be emitted in this case?

vedantk commented 4 years ago

I’d like to see how far we can get without introducing a new tag. After we try a few things out, if there’s enough buy-in for a dwarf extension I think it’d make sense to try and standardize it.

llvmbot commented 4 years ago

Hm. Instead of doing a conservative capture tracking analysis, we could just invent a new call site parameter tag to represent "best guesses". Say, DW_TAG_guessed_call_site_parameter.

Interesting idea. Eventually, if we decide to go that way, do you plan to propose such symbol to DWARF committee or we will use it only in LLVM/LLDB ecosystem?

vedantk commented 4 years ago

OTOH (as Adrian pointed out offline) guessed values may not really be all that useful, e.g. a guessed bool may be pretty confounding.

vedantk commented 4 years ago

Hm. Instead of doing a conservative capture tracking analysis, we could just invent a new call site parameter tag to represent "best guesses". Say, DW_TAG_guessed_call_site_parameter.

This would give us a large increase in available call site parameter DIEs. We can teach the debugger to print guessed$varname, instead of $varname, when a guessed location is used to evaluate an entry value.

If a debugger doesn't support the new tag, that's fine: it will not regress or mistakenly print the wrong value.

vedantk commented 4 years ago

Here's an initial patch that just deals with the simple case -- spill slots: https://reviews.llvm.org/D70254

I've reworked the escape analysis to use CaptureTracking. However, with the analysis in place, there is only a 0.5% increase in the number of call site parameter DIEs in an LTO x86_64 build of the xnu kernel. This is in start contrast to the 5% increase we get by just describing spill slots.

The explanation for this is not that most mem-operand arguments live in spill slots. Actually, most (73% in the xnu build) do not. What I see is that the escape analysis just doesn't have enough information to do a better job.

The vast majority of the mem-operand arguments we fail to describe are escaped by calls. Because llvm cannot deduce that these calls are nocapture, the CaptureTracking analysis gives up and says that the memory escapes. If there were a way to make nocapture attribution more aggressive, that would help.

Here are my raw stats about mem-operand arguments handled by describeLoadedValue:

$ grep describeLoadedValue ~/tmp/xnu-entryval-build-withLTO.log  | histo
Cumul %         Total %         Count
0.55%           0.55%            236 describeLoadedValue/psv-escaped
3.37%           2.82%           1214 describeLoadedValue/non-escaping
27.08%          23.71%          10206 describeLoadedValue/spilled
100.00%         72.92%          31390 describeLoadedValue/mem-escaped

psv-escaped: # of PseudoSourceValues which may-alias a high-level IR Value spilled: # of PseudoSourceValues llvm can describe (these are simple spills)

mem-escaped: # of Values the CaptureTracking analysis cannot prove do not escape non-escaping: # of Values the CaptureTracking analysis can prove do not escape

vedantk commented 4 years ago

Rough proof-of-concept Just a heads-up, I've attached a rough prototype of an analysis that should allow llvm to emit call site info for arguments that live on the stack. I plan to clean it up and start a real review soon.

vedantk commented 5 years ago

In https://reviews.llvm.org/D67717, Nikola Prica provides a simple and conservative solution. This seems like a sensible fix for now, but I'm concerned this might be too restrictive in the long run.