sighingnow / libclang

(Unofficial) Release libclang (clang.cindex) on pypi.
https://pypi.org/project/libclang

Segmentation fault when calling clang.cindex.Type.get_size() on cursor kind CursorKind.DECL_REF_EXPR #35

Open plmwd opened 1 year ago

plmwd commented 1 year ago

Description

I was testing out this library with some dummy code you can find here, and I came across a segmentation fault when calling clang.cindex.Type.get_size() on a cursor node of kind CursorKind.DECL_REF_EXPR. It doesn't really make sense to get the size, but I think it should just return 0 or None.
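Until the crash itself is fixed, a caller can avoid asking for the size of cursor kinds where the value is not meaningful. The sketch below is only an illustration: `safe_get_size` and `SKIP_KINDS` are hypothetical names, not part of clang.cindex, and the choice of kinds to skip is an assumption based on this report. It relies only on a cursor exposing `.kind.name` and `.type.get_size()`, as clang.cindex cursors do.

```python
# Hypothetical guard; these names are illustrative, not part of clang.cindex.
# The kinds to skip are an assumption drawn from this report.
SKIP_KINDS = frozenset({"DECL_REF_EXPR", "OVERLOADED_DECL_REF"})


def safe_get_size(cursor, skip_kinds=SKIP_KINDS):
    """Return the type size in bytes, or None when skipped or invalid."""
    if cursor.kind.name in skip_kinds:
        # Do not call into libclang at all for kinds suspected of crashing.
        return None
    size = cursor.type.get_size()
    # libclang reports layout errors as negative values.
    return size if size >= 0 else None
```

With a parsed translation unit `tu`, this could be used as `for c in tu.cursor.walk_preorder(): print(c.displayname, safe_get_size(c))`.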

Clang Version

Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: arm64-apple-darwin21.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

To Reproduce

git clone https://github.com/plmwd/libclang-test.git
cd libclang-test
git checkout 65d236ee200eff80c3094e76e93b0358a1490636

python3.10 -m venv .venv
pip install -r requirements.txt
python libclang-test
sighingnow commented 1 year ago

I cannot reproduce this with libclang-14.0.6 on x86_64 macOS. I will try to reproduce the issue on arm64 macOS next Monday.

/Users/linzhu.ht/tmp/libclang-test
{'args': ['/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++',
          '--driver-mode=g++',
          '-I/Users/linzhu.ht/tmp/libclang-test/include',
          '-isysroot',
          '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.0.sdk',
          '-o',
          'CMakeFiles/test.dir/src/test.cpp.o',
          '-c',
          '/Users/linzhu.ht/tmp/libclang-test/src/test.cpp'],
 'dir': '/Users/linzhu.ht/tmp/libclang-test/build',
 'filename': '/Users/linzhu.ht/tmp/libclang-test/src/test.cpp'}
Found classes or structs
[<File: src/test.h>, <File: /Users/linzhu.ht/tmp/libclang-test/include/some.hpp>]
Foo 20 bytes
   name=a, type=int, size=4, kind=CursorKind.FIELD_DECL
   name=b, type=int, size=4, kind=CursorKind.FIELD_DECL
   name=, type=, size=-1, kind=CursorKind.CXX_ACCESS_SPEC_DECL
   name=c, type=char[10], size=10, kind=CursorKind.FIELD_DECL
Bazz 32 bytes
   name=p, type=int, size=4, kind=CursorKind.FIELD_DECL
   name=e, type=char, size=1, kind=CursorKind.FIELD_DECL
   name=n, type=void *, size=8, kind=CursorKind.FIELD_DECL
   name=i, type=long, size=8, kind=CursorKind.FIELD_DECL
   name=s, type=unsigned long long, size=8, kind=CursorKind.FIELD_DECL
Bar 80 bytes
   name=, type=, size=-1, kind=CursorKind.CXX_ACCESS_SPEC_DECL
   name=f1, type=Foo, size=20, kind=CursorKind.FIELD_DECL
   name=f2, type=Foo, size=20, kind=CursorKind.FIELD_DECL
   name=f, type=Bazz, size=32, kind=CursorKind.FIELD_DECL
   name=bb, type=int, size=4, kind=CursorKind.FIELD_DECL
JhnW commented 1 year ago

I checked on Ubuntu 20 + libclang 14.0.1 - it works fine. I remember similar bugs while working on Devana (https://github.com/JhnW/devana - due to its strong dependence on libclang, when I test Devana, I test libclang quite thoroughly). That was back in version 12. I didn't report any bugs then because I didn't know how well sighingnow is running the project yet :) In any case, these types of bugs, if not already fixed, tended to show up in completely useless scenarios.

sighingnow commented 1 year ago

I checked on Ubuntu 20 + libclang 14.0.1 - it works fine.

Have you tried Ubuntu 20 + libclang 14.0.6 (the latest released version on Pypi)?

I didn't know how well sighingnow is running the project yet :)

I keep uploading prebuilt wheels to PyPI for LLVM releases.

In any case, these types of bugs, if not already fixed, tended to show up in completely useless scenarios.

In any case, it shouldn't segfault. I will investigate the issue on M1 macOS. Thanks for reporting.

JhnW commented 1 year ago

Have you tried Ubuntu 20 + libclang 14.0.6 (the latest released version on Pypi)?

I checked. Everything works. I can also check the Windows 10 version in my free time if you want.

I keep uploading prebuilt wheels to PyPI for LLVM releases.

Now I know :) When I started using this module, I just wasn't sure about the support. Anyway, you're doing a good job.

plmwd commented 1 year ago

I've been messing around on the latest commits on main. I had the segmentation fault on commit 65d236ee200eff80c3094e76e93b0358a1490636 when I was calling get_size() on every type of cursor.

I believe the output you posted above (the "Found classes or structs" listing) is from the latest commit, not 65d236ee200eff80c3094e76e93b0358a1490636.

This is what I get:

❯ python libclang_test/
/Users/paulwood/workspace/libclang-test
 displayname=src/test.cpp, kind=CursorKind.TRANSLATION_UNIT, type=, size=-1
   displayname=Foo, kind=CursorKind.STRUCT_DECL, type=Foo, size=20
     displayname=a, kind=CursorKind.FIELD_DECL, type=int, size=4
     displayname=b, kind=CursorKind.FIELD_DECL, type=int, size=4
     displayname=c, kind=CursorKind.FIELD_DECL, type=char[10], size=10
       displayname=, kind=CursorKind.INTEGER_LITERAL, type=int, size=4
   displayname=Bar, kind=CursorKind.STRUCT_DECL, type=Bar, size=44
     displayname=f1, kind=CursorKind.FIELD_DECL, type=Foo, size=20
       displayname=struct Foo, kind=CursorKind.TYPE_REF, type=Foo, size=20
     displayname=f2, kind=CursorKind.FIELD_DECL, type=Foo, size=20
       displayname=struct Foo, kind=CursorKind.TYPE_REF, type=Foo, size=20
     displayname=bb, kind=CursorKind.FIELD_DECL, type=int, size=4
   displayname=main(int, char **), kind=CursorKind.FUNCTION_DECL, type=int (int, char **), size=1
     displayname=argc, kind=CursorKind.PARM_DECL, type=int, size=4
     displayname=argv, kind=CursorKind.PARM_DECL, type=char *[], size=-2
     displayname=, kind=CursorKind.COMPOUND_STMT, type=, size=-1
       displayname=, kind=CursorKind.UNEXPOSED_EXPR, type=<dependent type>, size=-3
fish: Job 1, 'python libclang_test/' terminated by signal SIGSEGV (Address boundary error)
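The negative sizes in this output are not byte counts. They match the values of libclang's CXTypeLayoutError enum, which the size query returns when a type has no usable layout. A small sketch for making such output readable follows; `LAYOUT_ERRORS` and `describe_size` are hypothetical names introduced here, and the descriptions are paraphrases rather than official messages.

```python
# Values of libclang's CXTypeLayoutError enum (clang-c/Index.h).
# The descriptions are paraphrased; the helper names are illustrative only.
LAYOUT_ERRORS = {
    -1: "invalid type",
    -2: "incomplete type",
    -3: "dependent type",
    -4: "not a constant size",
    -5: "invalid field name",
}


def describe_size(size):
    """Turn a get_size() result into a readable string."""
    if size >= 0:
        return f"{size} bytes"
    return LAYOUT_ERRORS.get(size, f"unknown layout error {size}")
```

Read this way, `argv` (type `char *[]`, size=-2) is an incomplete type, and the UNEXPOSED_EXPR with `<dependent type>` (size=-3) is a dependent type.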

After removing all get_size() calls from the same commit:

❯ python libclang_test/
/Users/paulwood/workspace/libclang-test
 displayname=src/test.cpp, kind=CursorKind.TRANSLATION_UNIT, type=
   displayname=Foo, kind=CursorKind.STRUCT_DECL, type=Foo
     displayname=a, kind=CursorKind.FIELD_DECL, type=int
     displayname=b, kind=CursorKind.FIELD_DECL, type=int
     displayname=c, kind=CursorKind.FIELD_DECL, type=char[10]
       displayname=, kind=CursorKind.INTEGER_LITERAL, type=int
   displayname=Bar, kind=CursorKind.STRUCT_DECL, type=Bar
     displayname=f1, kind=CursorKind.FIELD_DECL, type=Foo
       displayname=struct Foo, kind=CursorKind.TYPE_REF, type=Foo
     displayname=f2, kind=CursorKind.FIELD_DECL, type=Foo
       displayname=struct Foo, kind=CursorKind.TYPE_REF, type=Foo
     displayname=bb, kind=CursorKind.FIELD_DECL, type=int
   displayname=main(int, char **), kind=CursorKind.FUNCTION_DECL, type=int (int, char **)
     displayname=argc, kind=CursorKind.PARM_DECL, type=int
     displayname=argv, kind=CursorKind.PARM_DECL, type=char *[]
     displayname=, kind=CursorKind.COMPOUND_STMT, type=
       displayname=, kind=CursorKind.UNEXPOSED_EXPR, type=<dependent type>
         displayname=, kind=CursorKind.DECL_REF_EXPR, type=<overloaded function type>
           displayname=printf, kind=CursorKind.OVERLOADED_DECL_REF, type=
         displayname="hello world!\n", kind=CursorKind.STRING_LITERAL, type=const char[14]
       displayname=, kind=CursorKind.RETURN_STMT, type=
         displayname=, kind=CursorKind.INTEGER_LITERAL, type=int
sighingnow commented 1 year ago

I had the segmentation fault on commit 65d236ee200eff80c3094e76e93b0358a1490636 when I was calling get_size() on every type of cursor.

Thanks for the information. I can reproduce the failure now.

sighingnow commented 1 year ago

I can confirm that the segmentation fault is a bug in the prebuilt wheel itself. The code in 65d236ee200eff80c3094e76e93b0358a1490636 works quite well with brew-installed LLVM.

As a workaround, you could use

export LIBCLANG_LIBRARY_PATH=$(brew --prefix llvm)/lib

to load libclang from the brew-installed LLVM rather than the copy bundled in the PyPI package in your macOS environment.
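The same choice can also be made from inside a script instead of the shell. The following is a sketch under assumptions: `libclang_dir` and `configure_libclang` are hypothetical helper names, and the only clang.cindex calls used are `Config.set_library_path` and the `Config.loaded` flag. The path must be set before the library is first loaded, for example before any `Index.create()` call.

```python
import os


def libclang_dir(environ=os.environ, default=None):
    """Pick the directory to load libclang from (hypothetical helper)."""
    return environ.get("LIBCLANG_LIBRARY_PATH") or default


def configure_libclang(directory):
    """Point clang.cindex at `directory`; call before creating any Index."""
    # Imported lazily so that libclang_dir() has no dependency on clang.
    from clang.cindex import Config

    # set_library_path raises once the library has already been loaded.
    if directory and not Config.loaded:
        Config.set_library_path(directory)
```

A script would call `configure_libclang(libclang_dir())` once at startup, so the bundled library is still used when the variable is unset.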