Quuxplusone / LLVMBugzillaTest

Debug info generated for arrays is not what GDB expects (not as good as GCC's) #29526

Open Quuxplusone opened 8 years ago

Quuxplusone commented 8 years ago
Bugzilla Link PR30553
Status REOPENED
Importance P normal
Reported by Caroline Tice (cmtice@google.com)
Reported on 2016-09-28 12:16:13 -0700
Last modified on 2020-12-10 08:05:25 -0800
Version trunk
Hardware PC Linux
CC aprantl@apple.com, ditaliano@apple.com, fedor.v.sergeev@gmail.com, florian_hahn@apple.com, friss@apple.com, grimar@accesssoftek.com, hfinkel@anl.gov, international.phantom@gmail.com, llvm-bugs@lists.llvm.org, paul_robinson@playstation.sony.com, pichet2000@gmail.com, sander.desmalen@arm.com, vsk@apple.com
Fixed by commit(s) rL323952
Attachments
Blocks PR24345
Blocked by
See also
There is a very annoying difference in the way GCC & LLVM generate debug
information for arrays.  The symptom is that when GDB is asked to print an
array-type variable that was compiled with GCC, it shows the array and its
contents, while when asked to print the same variable for the same program
compiled by LLVM, all you get is a pointer:

GCC version:
(gdb) print vla
$1 = {5, 7, 9}
(gdb) print vlaref
$2 = (int (&)[3]) @0x7fffffffdc30: {5, 7, 9}
(gdb) print vlaref2
$3 = (const vlareftypedef) @0x7fffffffdc30: {5, 7, 9}

LLVM version:
(gdb) print vla
$1 = 0x7fffffffdc20
(gdb) print vlaref
$2 = (int (&)[]) @0x7fffffffdc20: 0x7fffffffdc20
(gdb) print vlaref2
$3 = (vlareftypedef) @0x7fffffffdc20: 0x7fffffffdc20

LLVM can't even tell gdb the length of the array, much less its contents!

In discussing this with Eric Christopher, he said:

A simple testcase is:

int foo(int a) {
  int vla[a];
  int sum = 0;

  for (int i = 0; i < a; ++i)
    vla[i] = i;
  for (int j = 0; j < a; ++j)
    sum += vla[j];

  return sum;
}

int main (void) {
  return foo(4);
}

What's happening is that we're not adding a DW_AT_upper_bound of type
DW_FORM_expr/exprloc with the upper bound of the array.
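
As a rough illustration of why the missing bound matters: a debugger can only work out how many elements to print once it knows the subrange bounds. The helper below is a hypothetical sketch (its name is not taken from GDB or LLVM), assuming inclusive bounds and the C default lower bound of 0.

```c
#include <stdint.h>

/* Hypothetical helper, not GDB or LLVM code: number of elements described
   by a subrange with inclusive lower and upper bounds. */
int64_t subrange_element_count(int64_t lower_bound, int64_t upper_bound) {
    if (upper_bound < lower_bound)
        return 0; /* treat an inverted range as empty */
    return upper_bound - lower_bound + 1;
}
```

For the three-element array printed by the GCC build above, the upper bound would be 2; for foo(4) in the testcase it would be 3, giving four elements.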
Quuxplusone commented 6 years ago

Sander is working on this and has put up a set of patches https://reviews.llvm.org/D41698

Quuxplusone commented 6 years ago

This was fixed in https://reviews.llvm.org/rL323952

Thanks Sander!

Quuxplusone commented 6 years ago
Reopened on behalf of Carlos Enciso, who made this comment on Phabricator
post-commit:

Hi @sdesmalen!

First of all my apologies for commenting after the issue has been closed, but I
do not have an account to add a comment to the associated bugzilla.

I have found what seems to be an issue with the current implementation.

For the given test case

  int main() {
    int size = 2;

    int vla_expr[size];
    vla_expr[1] = 1;

    return 0;
  }

and while debugging with LLDB, the following error is generated:

  (lldb) n
  Process 21014 stopped
  * thread #1, name = 'bad.out', stop reason = step over
      frame #0: 0x0000000000400502 bad.out`main at vla_2.cpp:7
     4        int vla_expr[size];
     5        vla_expr[1] = 1;
     6
  -> 7        return 0;
     8      }

  (lldb) p vla_expr
  (unsigned long) $0 = 2

  (lldb) p vla_expr[1]
  error: subscripted value is not an array, pointer, or vector

  (lldb)

Looking at the DWARF generated, there are 2 variables with the same name at the
same scope

  DW_TAG_subprogram "main"
    ...
    DW_TAG_variable "size"
    DW_TAG_variable "vla_expr"
    DW_TAG_variable "vla_expr"

I think there are 2 issues:

The compiler-generated variable 'vla_expr'

- should be flagged as artificial (DW_AT_artificial)
- its name should start with a double underscore to avoid conflicting with
user-defined names.

Thanks,
Carlos
Quuxplusone commented 6 years ago
I really can't reproduce this on ToT, but I hit a different issue.

(lldb) frame var
(int) size = 2
(unsigned long) __vla_expr = 2
(int [81]) vla_expr = {
  [0] = 12872
  [1] = 1
  [2] = 0
  [3] = 0
  [4] = 0
  [5] = 0
  [6] = 0
  [7] = 0
  [8] = 2
  [9] = 0
  [10] = -272631232
  [11] = 32766
  [12] = 2
  [13] = 0
  [14] = 1738604614
  [15] = 1882609363
  [16] = -272631160
  [17] = 32766
  [18] = 1970217237
  [19] = 32767
  [20] = 1970217237
  [21] = 32767
  [22] = 0
  [23] = 0
  [24] = 1
  [25] = 0
  [26] = -272630848
  [27] = 32766
  [28] = 0
  [29] = 0
  [30] = -272630841
[...]

So, the 81-element array is definitely off.
Quuxplusone commented 6 years ago
And, FWIW, we already synthesize the variable as artificial and put two
underscores in front of vla_expr

0x00000051:         TAG_variable [4]
                     AT_location( fbreg -32 )
                     AT_name( "__vla_expr" )
                     AT_type( {0x00000074} ( long unsigned int ) )
                     AT_artificial( true )

0x0000005d:         TAG_variable [5]
                     AT_location( 0x00000000
                        0x0000000100000f49 - 0x0000000100000f6f: rsi+0 )
                     AT_name( "vla_expr" )
                     AT_decl_file( "/Users/davide/work/llvm-monorepo/build/bin/blah.c" )
                     AT_decl_line( 4 )
                     AT_type( {0x0000007b} ( int[] ) )
Quuxplusone commented 6 years ago
Looking at the DWARF more closely, this is still a debug info generation bug.
For some reason, we emit an array with 0x51 elements

0x0000007b:     TAG_array_type [7] *
                 AT_type( {0x0000006d} ( int ) )

0x00000080:         TAG_subrange_type [8]
                     AT_type( {0x0000008a} ( __ARRAY_SIZE_TYPE__ ) )
                     AT_count( {0x00000051} )

0x00000089:         NULL
Quuxplusone commented 6 years ago

What does llvm-dwarfdump --debug-info=0x51 say?

Quuxplusone commented 6 years ago
davide@Davidinos-Mac-Pro ~/w/l/b/bin> ./llvm-dwarfdump --debug-info=0x51
./blah.dSYM
blah.dSYM/Contents/Resources/DWARF/blah:    file format Mach-O 64-bit x86-64

.debug_info contents:

0x00000051: DW_TAG_variable
              DW_AT_location    (DW_OP_fbreg -32)
              DW_AT_name    ("__vla_expr")
              DW_AT_type    (0x00000074 "long unsigned int")
              DW_AT_artificial  (true)
Quuxplusone commented 6 years ago

It looks like LLDB may be misinterpreting the DIE reference in the DW_AT_count attribute for a constant.

Quuxplusone commented 6 years ago
I completely agree. I tried GDB and this is what I got

(gdb) r
Starting program: /home/davide/llvm-work/build/bin/blah

Breakpoint 1, main () at blah.c:2
2               int size = 2;
(gdb) n
3               int blah[size];
(gdb) n
4               blah[1] = 2;
(gdb) n
5               return 0;
(gdb) p blah
$1 = {0, 2}

So, yes, the support is missing in lldb.
Quuxplusone commented 6 years ago
This is the culprit.

Process 62787 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x00000001091cdbd4 liblldb.7.0.0.dylib`DWARFASTParserClang::ParseChildArrayInfo(this=0x00007facf23009d0, sc=0x00007ffee94cd4d8, parent_die=0x00007ffee94cd808, first_index=0x00007ffee94c95f0, element_orders=size=1, byte_stride=0x00007ffee94c95ec, bit_stride=0x00007ffee94c95e8) at DWARFASTParserClang.cpp:3702
   3699               break;
   3700
   3701             case DW_AT_count:
-> 3702               num_elements = form_value.Unsigned();
   3703               break;
   3704
   3705             case DW_AT_bit_stride:
Target 0: (lldb) stopped.

I wonder if this has ever worked :)
Quuxplusone commented 6 years ago
The DWARF standard says (thanks to Adrian for pointing this out!):

The subrange entry may have the attributes DW_AT_lower_bound and
DW_AT_upper_bound to specify, respectively, the lower and upper bound values of
the subrange. The DW_AT_upper_bound attribute may be replaced by a DW_AT_count
attribute, whose value describes the number of elements in the subrange rather
than the value of the last element. The value of each of these attributes is
determined as described in Section 2.19 on page 55.

2.19 Static and Dynamic Values of Attributes
[...]
The value of these attributes is determined based on the class as follows:
* For a constant, the value of the constant is the value of the attribute.
* For a reference, the value is a reference to another debugging information
entry. This entry may:
– describe a constant which is the attribute value,
– describe a variable which contains the attribute value, or
– contain a DW_AT_location attribute whose value is a DWARF expression which
computes the attribute value (for example, a DW_TAG_dwarf_procedure entry).
* For an exprloc, the value is interpreted as a DWARF expression; evaluation of
the expression yields the value of the attribute.

lldb currently handles only the first case correctly.
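
To make the distinction concrete, here is a minimal sketch of class-aware resolution for DW_AT_count. Every type and function name in it is hypothetical (nothing is taken from LLDB or LLVM), and it assumes the referenced entry has already been looked up. Evaluating a DWARF expression is deliberately left out, so the exprloc case and the DW_AT_location flavour of the reference case are only represented by a "value lives at this address" pointer or reported as unsupported.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; none of these names come from LLDB or LLVM. */
enum attr_class { ATTR_CONSTANT, ATTR_REFERENCE, ATTR_EXPRLOC };
enum die_kind { DIE_CONSTANT, DIE_VARIABLE };

/* A referenced debugging information entry, reduced to what the sketch needs. */
struct die {
    uint64_t offset;         /* offset of the entry in .debug_info */
    enum die_kind kind;
    uint64_t const_value;    /* used when kind == DIE_CONSTANT */
    const uint64_t *storage; /* used when kind == DIE_VARIABLE: where the value lives */
};

struct attr_value {
    enum attr_class cls;
    uint64_t raw;             /* the constant, or the offset of the referenced entry */
    const struct die *target; /* already-resolved entry when cls == ATTR_REFERENCE */
};

/* Naive reading: whatever number is stored in the form is taken as the count. */
uint64_t count_naive(const struct attr_value *v) {
    return v->raw;
}

/* Class-aware reading. Returns 0 on success, -1 when the case is unsupported. */
int count_resolved(const struct attr_value *v, uint64_t *out) {
    switch (v->cls) {
    case ATTR_CONSTANT:
        *out = v->raw;
        return 0;
    case ATTR_REFERENCE:
        if (v->target == NULL)
            return -1;
        if (v->target->kind == DIE_CONSTANT) {
            *out = v->target->const_value;
            return 0;
        }
        if (v->target->kind == DIE_VARIABLE && v->target->storage != NULL) {
            *out = *v->target->storage; /* read the variable's current value */
            return 0;
        }
        return -1;
    case ATTR_EXPRLOC:
        return -1; /* expression evaluation is out of scope for this sketch */
    }
    return -1;
}
```

With a reference to the entry at offset 0x51 whose variable currently holds 2, count_naive returns 0x51 (81 decimal), which matches the int [81] that lldb printed, while count_resolved yields 2.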
Quuxplusone commented 6 years ago
dwarfdump -F makes this more clear (the fact that this is a reference):

0x0000007b:   DW_TAG_array_type
                DW_AT_type [DW_FORM_ref4]       (0x0000006d "int")

0x00000080:     DW_TAG_subrange_type
                  DW_AT_type [DW_FORM_ref4]     (0x0000008a "__ARRAY_SIZE_TYPE__")
                  DW_AT_count [DW_FORM_ref4]    (0x00000051)
Quuxplusone commented 5 years ago

Adrian, you fixed this one, didn't you?

Quuxplusone commented 3 years ago
I investigated some debug info problems with VLAs for an out-of-tree target and
noticed:

- DEBUG_VALUEs associated with a VLA are not always propagated correctly across
MBBs at -O0.
- When the VLA is a parameter (i.e. int sumAll(int n, int A[n])), the debug info
for A is just a regular pointer.
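
A possible reproducer for the parameter case, as a sketch: only the signature comes from the comment above, and the body is an assumed example. Note that in C a parameter declared with array type is adjusted to a pointer type, so inside the function A really is an int *. That may be part of why the debug info describes it as a plain pointer, although the size expression n is still available in the source.

```c
/* Only the signature comes from the report; the body is an assumed example. */
int sumAll(int n, int A[n]) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += A[i];
    return sum;
}
```

Calling sumAll(3, data) on a three-element array and then asking the debugger to print A inside the function should be enough to see how the parameter is described.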