Quuxplusone / LLVMBugzillaTest

libclang (via Python bindings), is_definition() is different for structs on x86_64 and aarch64 #19758

Open Quuxplusone opened 10 years ago

Quuxplusone commented 10 years ago
Bugzilla Link PR19759
Status NEW
Importance P normal
Reported by David Abdurachmanov (david.abdurachmanov@gmail.com)
Reported on 2014-05-15 16:01:27 -0700
Last modified on 2014-05-17 00:13:04 -0700
Version 3.4
Hardware Other other
CC llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
We have a Python script that uses libclang to walk the AST and collect
information about structures. While porting the code to aarch64 (ARMv8), I
found that the script produces no output. The cause is that is_definition()
returns different values on x86_64 and aarch64.

Example:

/usr/include/time.h

### Fedora 19, AArch64 (ARMv8) ###

[2014-05-15 16:03:42,980] DEBUG: Child: <clang.cindex.Cursor object at
0x17e993b0> | timespec | timespec, kind: CursorKind.STRUCT_DECL, is_definition:
False, location: <SourceLocation file '/usr/include/time.h', line 120, column 8>

    118 /* POSIX.1b structure for a time value.  This is like a `struct timeval' but
    119    has nanoseconds instead of microseconds.  */
    120 struct timespec
    121   {
    122     __time_t tv_sec;    /* Seconds.  */
    123     __syscall_slong_t tv_nsec;  /* Nanoseconds.  */
    124   };

### RHEL6, x86_64 ###

[2014-05-15 21:58:46,767] DEBUG: Child: <clang.cindex.Cursor object at
0x2468560> | timespec | timespec, kind: CursorKind.STRUCT_DECL, is_definition:
True, location: <SourceLocation file '/usr/include/time.h', line 120, column 8>
[2014-05-15 21:58:46,767] DEBUG: Found struct/class/template definition:
timespec
[2014-05-15 21:58:46,767] DEBUG: Skipping since it is an external of this
package: timespec

118 /* POSIX.1b structure for a time value.  This is like a `struct timeval' but
119    has nanoseconds instead of microseconds.  */
120 struct timespec
121   {
122     __time_t tv_sec;            /* Seconds.  */
123     long int tv_nsec;           /* Nanoseconds.  */
124   };
125

But with clang -Xclang -ast-dump -fsyntax-only (AArch64 first, then x86_64):

|-RecordDecl 0x85fb030 </usr/include/time.h:120:1, line:124:3> struct timespec
definition
| |-FieldDecl 0x85fb100 <line:122:5, col:14> tv_sec '__time_t':'long'
| `-FieldDecl 0x85fb180 <line:123:5, col:23> tv_nsec '__syscall_slong_t':'long'

|-RecordDecl 0x39e1e70 </usr/include/time.h:120:1, line:124:3> struct timespec
definition
| |-FieldDecl 0x39e1f40 <line:122:5, col:14> tv_sec '__time_t':'long'
| `-FieldDecl 0x39e1fa0 <line:123:5, col:14> tv_nsec 'long'

On both machines it says "struct timespec definition".

Something like this is enough to reproduce it:

$ cat check.py
import sys
import clang.cindex

def find_typerefs(node):
    # Recursively print every cursor below `node`.
    for child in node.get_children():
        print("{0} {1} {2} {3}".format(child.displayname, child.kind, child.is_definition(), child.location))
        find_typerefs(child)

index = clang.cindex.Index.create()
tu = index.parse(sys.argv[1])
print("Translation unit: {0}".format(tu.spelling))
find_typerefs(tu.cursor)

$ python check.py /usr/include/time.h

# x86_64 #

timespec CursorKind.STRUCT_DECL True <SourceLocation file
'/usr/include/time.h', line 120, column 8>
tv_sec CursorKind.FIELD_DECL True <SourceLocation file '/usr/include/time.h',
line 122, column 14>
__time_t CursorKind.TYPE_REF False <SourceLocation file '/usr/include/time.h',
line 122, column 5>
tv_nsec CursorKind.FIELD_DECL True <SourceLocation file '/usr/include/time.h',
line 123, column 14>

# aarch64 #

timespec CursorKind.STRUCT_DECL False <SourceLocation file
'/usr/include/time.h', line 120, column 8>
tv_sec CursorKind.FIELD_DECL True <SourceLocation file '/usr/include/time.h',
line 122, column 14>
__time_t CursorKind.TYPE_REF False <SourceLocation file '/usr/include/time.h',
line 122, column 5>
tv_nsec CursorKind.FIELD_DECL True <SourceLocation file '/usr/include/time.h',
line 123, column 23>
__syscall_slong_t CursorKind.TYPE_REF False <SourceLocation file
'/usr/include/time.h', line 123, column 5>

Are there any issues with libclang and/or the Python bindings on aarch64?
Quuxplusone commented 10 years ago
It currently looks like the following is failing:

4708    unsigned clang_isCursorDefinition(CXCursor C) {
4709      if (!clang_isDeclaration(C.kind))
4710        return 0;
4711
4712      return clang_getCursorDefinition(C) == C;
4713    }

Line 4712.

Breakpoint 1, clang_isCursorDefinition (C=...)
    at /home/david/new-arch/test/BUILD/fc19_aarch64_gcc490/external/llvm/3.4-cms2/llvm-3.4-6800b6d2afc/tools/clang/tools/libclang/CIndex.cpp:4709
4709      if (!clang_isDeclaration(C.kind))
(gdb) p C
$1 = {kind = CXCursor_ClassDecl, xdata = 0, data = {0x7fb39e60e0, 0x0,
0x7fb000cfb0}}
(gdb) p clang_getCString(clang_getCursorDisplayName(C))
$2 = 0x9a5cd0 "RunNumber"
(gdb) p C.
data   kind   xdata
(gdb) set $foo = clang_getCursorDefinition(C)
(gdb) p $foo
$3 = {kind = CXCursor_ClassDecl, xdata = 0, data = {0x7fb39e60e0, 0x1,
0x7fb000cfb0}}

But printing the location of both gives the same file, line and column.

They are not equal because: C.data[1] != clang_getCursorDefinition(C).data[1]
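
To make the mismatch concrete, here is a minimal Python model of the comparison. It is not libclang code. It assumes the == on line 4712 compares the cursor kind and all three data slots one by one; the thread does not show that operator, so treat it as an assumption. CXCursorModel and cursors_equal are hypothetical names, and the values are copied from the gdb session above.

```python
from collections import namedtuple

# Simplified stand-in for libclang's CXCursor (kind, xdata, data[3]).
# Hypothetical model for illustration only.
CXCursorModel = namedtuple("CXCursorModel", ["kind", "xdata", "data"])


def cursors_equal(x, y):
    """Assumed comparison: same kind and identical data slots."""
    return x.kind == y.kind and x.data == y.data


# Values printed by gdb for C and clang_getCursorDefinition(C).
c = CXCursorModel("CXCursor_ClassDecl", 0, (0x7fb39e60e0, 0x0, 0x7fb000cfb0))
d = CXCursorModel("CXCursor_ClassDecl", 0, (0x7fb39e60e0, 0x1, 0x7fb000cfb0))

# Indices of the data slots that differ between the two cursors.
mismatched = [i for i in range(3) if c.data[i] != d.data[i]]

print(cursors_equal(c, d))  # False
print(mismatched)           # [1]
```

Under that assumption a single differing slot is enough to make is_definition() return False, even though data[0] (which I take to be the declaration pointer) and the source location are identical.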
Quuxplusone commented 10 years ago
I moved to LLVM and Clang trunk; the behavior is still the same. A smaller example is below.

$ cat my.h
struct timespec
{
  int tv_sec;
  int tv_nsec;
};

$ cat check.py
import sys
import clang.cindex

def find_all(node):
    # Recursively print every cursor and, where available, its definition.
    for child in node.get_children():
        print("displayname: {0}, kind: {1}, is_definition: {2},  location:{3}".format(child.displayname, child.kind, child.is_definition(), child.location))
        definition = child.get_definition()
        if definition is not None:
            print(">> get_definition().location: {0}".format(definition.location))
        find_all(child)

index = clang.cindex.Index.create()
tu = index.parse(sys.argv[1])
print("Translation unit: {0}".format(tu.spelling))
find_all(tu.cursor)

## Fedora 20 / x86_64

$ python check.py my.h
Translation unit: my.h
displayname: __int128_t, kind: CursorKind.TYPEDEF_DECL, is_definition: True,
location:<SourceLocation file None, line 0, column 0>
>> get_definition().location: <SourceLocation file None, line 0, column 0>
displayname: __uint128_t, kind: CursorKind.TYPEDEF_DECL, is_definition: True,
location:<SourceLocation file None, line 0, column 0>
>> get_definition().location: <SourceLocation file None, line 0, column 0>
displayname: __builtin_va_list, kind: CursorKind.TYPEDEF_DECL, is_definition:
True,  location:<SourceLocation file None, line 0, column 0>
>> get_definition().location: <SourceLocation file None, line 0, column 0>
displayname: __va_list_tag, kind: CursorKind.TYPE_REF, is_definition: False,
location:<SourceLocation file None, line 0, column 0>
>> get_definition().location: <SourceLocation file None, line 0, column 0>
displayname: timespec, kind: CursorKind.STRUCT_DECL, is_definition: True,
location:<SourceLocation file 'my.h', line 1, column 8>
>> get_definition().location: <SourceLocation file 'my.h', line 1, column 8>
displayname: tv_sec, kind: CursorKind.FIELD_DECL, is_definition: True,
location:<SourceLocation file 'my.h', line 3, column 7>
>> get_definition().location: <SourceLocation file 'my.h', line 3, column 7>
displayname: tv_nsec, kind: CursorKind.FIELD_DECL, is_definition: True,
location:<SourceLocation file 'my.h', line 4, column 7>
>> get_definition().location: <SourceLocation file 'my.h', line 4, column 7>

## Fedora 19 / AArch64

$ python check.py my.h
Translation unit: my.h
displayname: timespec, kind: CursorKind.STRUCT_DECL, is_definition: False,
location:<SourceLocation file 'my.h', line 1, column 8>
>> get_definition().location: <SourceLocation file 'my.h', line 1, column 8>
displayname: tv_sec, kind: CursorKind.FIELD_DECL, is_definition: True,
location:<SourceLocation file 'my.h', line 3, column 7>
>> get_definition().location: <SourceLocation file 'my.h', line 3, column 7>
displayname: tv_nsec, kind: CursorKind.FIELD_DECL, is_definition: True,
location:<SourceLocation file 'my.h', line 4, column 7>
>> get_definition().location: <SourceLocation file 'my.h', line 4, column 7>
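
Since get_definition().location matches the cursor location on both machines, a script could fall back to comparing locations until the underlying problem is understood. The sketch below is a possible workaround, not a fix for libclang. location_key and is_definition_by_location are hypothetical helper names; the code only relies on the get_definition(), kind.is_declaration(), location.file, location.line and location.column members of the clang.cindex cursor objects, and I am assuming that equal locations imply the same declaration.

```python
def location_key(location):
    """Reduce a SourceLocation-like object to a comparable tuple."""
    file_name = location.file.name if location.file is not None else None
    return (file_name, location.line, location.column)


def is_definition_by_location(cursor):
    """Approximate is_definition() by comparing source locations.

    Assumption: a declaration cursor whose definition sits at the same
    file, line and column is itself the definition.
    """
    # Mirror clang_isCursorDefinition: only declarations can be definitions.
    if not cursor.kind.is_declaration():
        return False
    definition = cursor.get_definition()
    if definition is None:
        return False
    return location_key(definition.location) == location_key(cursor.location)
```

In check.py this would replace child.is_definition() with is_definition_by_location(child). Built-in declarations that have no file all share line 0, column 0, so they cannot be told apart by location alone.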