GMLC-TDC / cheader2json

libclang parser to dump c header info to json

libclang returns type "int" instead of "int32_t" #54

Open nightlark opened 9 months ago

nightlark commented 9 months ago

libclang is returning a type of "int" (or "Int" depending on the function/attribute used), instead of what it had been returning before -- "int32_t" or "int64_t". It is also adding typedefs for each enum, which wasn't happening before.

This may be a macOS-specific bug in libclang (or in its Python bindings); a Google search turned up a similar issue on Ubuntu where libclang reported size_t as being of type int.

Note: running clang -Xclang --ast-dump=json -fsyntax-only <header_file> prints the expected type names, but the output is extremely verbose and includes information on included header files (C/C++ standard library). It may be worth looking into alternatives to libclang, ideally ones that are a bit more user friendly. Python options are likely the easiest/most direct to port to, but other languages would also be fine if the available library is easy enough to use (for example, whatever underpins rust bindgen and similar tooling).
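
One way to tame that verbosity, offered as a rough sketch rather than a tested replacement for libclang: load the JSON dump and keep only the top-level declarations located in the header of interest. The function names (dump_ast, decls_from_header) are hypothetical, and the command uses the -Xclang -ast-dump=json spelling. The filter assumes that in clang's JSON output a node's "loc" carries a "file" key only when the file differs from the previously printed location, so the last seen file is carried forward. It also assumes the file string matches the path passed on the command line. File changes that happen inside nested nodes are not tracked.

```python
import json
import subprocess


def dump_ast(header, clang="clang"):
    """Run clang and return the parsed JSON AST (requires clang on PATH)."""
    result = subprocess.run(
        [clang, "-Xclang", "-ast-dump=json", "-fsyntax-only", header],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def decls_from_header(ast, header):
    """Keep top-level declarations located in `header`.

    Assumption: "file" appears in a node's "loc" only when it differs from
    the previous location, so the last seen file is carried forward.
    """
    current_file = None
    kept = []
    for node in ast.get("inner", []):
        loc = node.get("loc", {})
        # Macro-related locations nest the position under "expansionLoc".
        loc = loc.get("expansionLoc", loc)
        if "file" in loc:
            current_file = loc["file"]
        if node.get("isImplicit"):
            continue  # compiler-generated builtins, not written in any header
        if current_file == header:
            kept.append(node)
    return kept
```

Usage would look like decls_from_header(dump_ast(path), path), with the same path string passed to both calls.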

nightlark commented 9 months ago

This appears to be macOS-specific behavior -- a CI workflow run on Ubuntu (using HELICS 3.5.0 headers) gave the expected types. Attached output: helics.ast.json, helics.types.json

nightlark commented 9 months ago

Some alternatives to libclang for parsing C/C++ code and getting an AST that have come up recently:

afisher1 commented 9 months ago

This behavior would definitely break our conversion code across the board if we were trying to generate bindings on macOS. Is this specific to the latest version of libclang? Clang 14 is what I have been using for MATLAB and Octave binding generation. We might need to lock down clang to a specific version as a stopgap in our other binding generator repos.

nightlark commented 9 months ago

I noticed it with clang 16 — I haven’t tried older versions of clang (or newer), but it seems like the kind of issue that could affect most versions.

As part of releases, we could automate the header-to-JSON part of binding generation in a CI job, which would guarantee that the bindings are generated on an OS where libclang gives the correct types (…maybe).
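
A CI job like that could also guard against this regression. The sketch below is schema-agnostic, since the layout of the generated JSON is not described in this thread: it walks any parsed JSON value and collects the fixed-width type names it finds in strings. A run where every "int32_t"/"int64_t" has collapsed to "int" would then show up as an empty result. The function names and the set of names searched for are assumptions for illustration.

```python
import json
import re

# Assumed set of type names to look for; extend as needed.
FIXED_WIDTH = re.compile(r"\bu?int(?:8|16|32|64)_t\b")


def fixed_width_names(value):
    """Return the set of fixed-width integer type names found in any string
    (dict keys included) of a parsed JSON value."""
    found = set()
    if isinstance(value, str):
        found.update(FIXED_WIDTH.findall(value))
    elif isinstance(value, dict):
        for key, item in value.items():
            found |= fixed_width_names(key)
            found |= fixed_width_names(item)
    elif isinstance(value, list):
        for item in value:
            found |= fixed_width_names(item)
    return found


def check_file(path):
    """Load a generated JSON file and return the fixed-width names in it."""
    with open(path) as handle:
        return fixed_width_names(json.load(handle))
```

A CI step could fail the build when check_file returns an empty set for a file such as helics.types.json, on the assumption that the headers are known to use fixed-width types.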