haxscramper / hcparse

High-level nim bindings for parsing C/C++ code
https://haxscramper.github.io/hcparse-doc/src/hcparse/libclang.html
Apache License 2.0

Wrapping macro and conditional compilation semantics #21

Open haxscramper opened 2 years ago

haxscramper commented 2 years ago

C macros can do anything: create types and procs, conditionally enable parts of the code, or define simple constants. All of this has to be converted to Nim macros where possible. Boost.Wave should be pretty useful for solving this problem.

Many macro definitions are relatively simple, which means there are multiple steps of different complexity:

haxscramper commented 2 years ago

https://code.woboq.org/gtk/include/glib-2.0/gobject/gtype.h.html#_M/G_DECLARE_INTERFACE

haxscramper commented 2 years ago

Algorithm description from the IRC discussion:

I think the only way to properly map conditional compilation and related logic is to get the proper /AST/ of the macro, or of the #if/#ifdef/#ifndef/#elif check. Parsing preprocessor statements is the easier part: I already have a Boost.Wave-based lexer that can report all the tokens encountered, and assembling those elements into a tree structure is not that difficult. The main problem is the macro definitions themselves. If the replacement list does not contain any special metaprogramming tokens (## and # for concatenation and stringification respectively), it could be parsed directly using tree-sitter. This functionality is related to https://github.com/haxscramper/hcparse/issues/12
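To make the "assembling elements to the tree structure" step concrete, here is a minimal sketch in Python. It works on raw source lines rather than on Boost.Wave tokens, and the names `Branch`, `Conditional` and `build_tree` are hypothetical, not part of hcparse.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Branch:
    # One arm of a conditional: "if", "ifdef", "ifndef", "elif" or "else".
    kind: str
    condition: Optional[str]  # raw condition text, None for "else"
    body: List[Union[str, "Conditional"]] = field(default_factory=list)


@dataclass
class Conditional:
    # A whole #if ... #endif group with all of its arms in source order.
    branches: List[Branch] = field(default_factory=list)


OPENERS = ("if", "ifdef", "ifndef")


def build_tree(lines):
    """Nest a flat list of source lines by their conditional directives."""
    root = []
    stack = []      # (open conditional, list that contains it)
    current = root  # list that receives the next item
    for line in lines:
        text = line.strip()
        parts = text[1:].split(None, 1) if text.startswith("#") else []
        name = parts[0] if parts else ""
        arg = parts[1] if len(parts) > 1 else None
        if name in OPENERS:
            cond = Conditional([Branch(name, arg)])
            current.append(cond)
            stack.append((cond, current))
            current = cond.branches[-1].body
        elif name in ("elif", "else"):
            if not stack:
                raise ValueError("#%s without matching #if" % name)
            cond = stack[-1][0]
            cond.branches.append(Branch(name, arg))
            current = cond.branches[-1].body
        elif name == "endif":
            if not stack:
                raise ValueError("#endif without matching #if")
            _, current = stack.pop()
        else:
            # Ordinary code and other directives (#define, #include, ...).
            current.append(line)
    if stack:
        raise ValueError("unterminated conditional")
    return root
```

Conditions are kept as unparsed strings here; turning them into an expression AST would be a separate step.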

If the macro contains concatenation tokens it becomes somewhat more problematic, but those cases can be handled by replacing `<tok> ## <tok>` with specially constructed token names like `TOK_INDEX_1_CONCAT_TOK_INDEX_2`.

But we still have to operate under the assumption that the macro body forms valid C code, which is of course not always the case.
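The `##` replacement described above could look roughly like the following Python sketch. The regex tokenizer is a stand-in for the real lexer, and the `LEFT_CONCAT_RIGHT` naming scheme is an assumption; the exact placeholder format is not specified in this thread.

```python
import re

# Very rough C token splitter: "##", "#", identifiers, integers, and any
# other single non-space character. A stand-in for a real lexer.
TOKEN_RE = re.compile(r"##|#|[A-Za-z_]\w*|\d+|\S")


def tokenize(body):
    return TOKEN_RE.findall(body)


def replace_concat(tokens):
    """Fold every `left ## right` into one placeholder identifier."""
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "##" and out and i + 1 < len(tokens):
            left = out.pop()
            right = tokens[i + 1]
            # Assumed naming scheme; chained a ## b ## c folds left to right.
            out.append(f"{left}_CONCAT_{right}")
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


tokens = replace_concat(tokenize("void prefix ## name(void);"))
print(" ".join(tokens))
```

The result is only a valid identifier when the left operand is one, so cases like `1 ## x` would need separate handling. The placeholder keeps both operand names, so the concatenation can be recovered after parsing.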

haxscramper commented 2 years ago

It is necessary to overlay conditional compilation on top of the API description. That is, CxxEntry should itself contain information about conditional compilation. This would make type graph construction a lot more complex, since various groupings must be considered at once in order to deal with mutually recursive type uses wrapped in conditional compilation logic.
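One possible shape for such an overlay, sketched in Python with hypothetical names (`GuardedEntry` is not the real CxxEntry and does not mirror its fields): each entry records the stack of conditions it was declared under, and entries are then grouped by that stack.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GuardedEntry:
    # Hypothetical stand-in for an API entry with its enclosing conditions.
    name: str
    conditions: Tuple[str, ...] = ()  # outermost condition first


def group_by_condition(entries):
    """Map each distinct condition stack to the entry names declared under it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.conditions].append(entry.name)
    return dict(groups)


entries = [
    GuardedEntry("Node"),
    GuardedEntry("Edge", ("defined(USE_GRAPH)",)),
    GuardedEntry("Graph", ("defined(USE_GRAPH)",)),
]
groups = group_by_condition(entries)
```

Types that refer to each other but sit in different groups are where the type graph gets harder to build, because the wrapper has to decide which guard the combined declaration belongs under.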