jbms / sphinx-immaterial

Adaptation of the popular mkdocs-material material design theme to the sphinx documentation system
https://jbms.github.io/sphinx-immaterial/

C++ API gen with macros #386

Open MarcelKoch opened 1 month ago

MarcelKoch commented 1 month ago

I tried to use the C++ API documentation generation on a codebase that contains some macros, but the Sphinx build failed. In particular, macro usage of the form

PROJECT_ATTRIBUTES int some_function();

could not be handled. I've added a similar macro to the demo here: https://github.com/MarcelKoch/sphinx-immaterial/tree/cpp-api-with-macro.

I think there are two underlying issues here. First, AFAIK, Sphinx doesn't really support macros in function declarations, so the rST syntax

.. cpp:function:: PROJECT_ATTRIBUTES int some_function()

would already be invalid. Second, the internal parsing seems to be a bit messed up, or at least the behavior of libclang is unexpected. For the modified demo I get the following error:

Handler <function _builder_inited at 0x7f1ef3ae59e0> for event 'builder-inited' threw an exception (exception: Error when parsing function declaration.
If the function has no return type:
  Error in declarator or parameters-and-qualifiers
  Invalid C++ declaration: Expected identifier in nested name, got keyword: class [error at 12]
    inline class IndexInterval { public : explicit IndexInterval ( int lower , int upper ) ; friend std :: ostream & operator << ( std :: ostream & os , IndexInterval x ) ; int lower ( ) const ; int upper ( ) const ; } ; INLINE IndexInterval Union ( IndexInterval a , IndexInterval b )
    ------------^
If the function has a return type:
  Error in declarator or parameters-and-qualifiers
  If pointer to member declarator:
    Invalid C++ declaration: Expected identifier in nested name. [error at 27]
      inline class IndexInterval { public : explicit IndexInterval ( int lower , int upper ) ; friend std :: ostream & operator << ( std :: ostream & os , IndexInterval x ) ; int lower ( ) const ; int upper ( ) const ; } ; INLINE IndexInterval Union ( IndexInterval a , IndexInterval b )
      ---------------------------^
  If declarator-id:
    Invalid C++ declaration: Expected identifier in nested name. [error at 27]
      inline class IndexInterval { public : explicit IndexInterval ( int lower , int upper ) ; friend std :: ostream & operator << ( std :: ostream & os , IndexInterval x ) ; int lower ( ) const ; int upper ( ) const ; } ; INLINE IndexInterval Union ( IndexInterval a , IndexInterval b )
      ---------------------------^

After playing around with the debugger, I've noticed that the tokenization of the source location of the function somehow includes the complete source code between the macro definition and the macro usage.

One solution would be to run the preprocessor on the input file; however, this would also remove other macros, which I would like to document.
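
On the first point, Sphinx's C++ domain has a `cpp_id_attributes` configuration value that lists extra identifiers the declaration parser should accept as attributes. A minimal `conf.py` sketch follows, assuming the macro is named `PROJECT_ATTRIBUTES` as in the example above. The second macro name is hypothetical, and whether the API generator in this theme passes such macros through unchanged has not been checked here.

```python
# conf.py (fragment)

# Identifiers that Sphinx's C++ declaration parser should accept as attributes,
# e.g. macros that expand to `inline` or to an export/visibility attribute.
cpp_id_attributes = ["PROJECT_ATTRIBUTES"]

# Function-like attribute macros, e.g. PROJECT_DEPRECATED("message"), belong in
# cpp_paren_attributes instead. PROJECT_DEPRECATED is a hypothetical name.
cpp_paren_attributes = ["PROJECT_DEPRECATED"]
```

With this in place, a hand-written `.. cpp:function:: PROJECT_ATTRIBUTES int some_function()` directive should parse, although it does not address the tokenization problem described above.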

2bndy5 commented 1 month ago

> After playing around with the debugger, I've noticed that the tokenization of the source location of the function somehow includes the complete source code between the macro definition and the macro usage.

Hmm, sounds like a bug in the api_parser. With libclang we had to crawl the entire AST and extract doc comments from lone comment tokens (with special considerations). Libclang's default behavior only identifies a comment as a doc comment if it directly precedes the symbol in question. We might need to strengthen the parser by differentiating between declaration tokens and invocation tokens. I haven't looked at the parser code in a while though.

PS - Sphinx's C and C++ domains leave a lot to be desired (they are rather isolated from each other). The Sphinx devs who contributed those parts are rarely active in development nowadays. For example, this theme applies patches to help automatically cross-reference symbols in the C domain from signatures in the C++ domain.

jbms commented 1 month ago

I didn't think about this issue previously, but the way that it can be made to work is to first preprocess with -E -C -dD, which will preserve macro definitions and comments.

I don't know if libclang provides a way to access the tokens produced from preprocessing rather than just the input tokens.
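
As a rough sketch of that preprocessing step, the flags can be passed to the compiler driver from Python before the result is handed to the API generator. The helper names, the default `clang++` compiler, and the file names are assumptions for illustration and are not part of sphinx-immaterial.

```python
import subprocess
from pathlib import Path


def build_preprocess_command(source, compiler="clang++", extra_args=()):
    """Return the argv that preprocesses `source` but keeps docs and macros."""
    return [
        compiler,
        "-E",   # stop after the preprocessing stage
        "-C",   # keep comments in the output
        "-dD",  # keep #define directives alongside the preprocessed code
        *extra_args,
        str(source),
    ]


def preprocess(source, output, **kwargs):
    """Run the preprocessor on `source` and write its stdout to `output`."""
    cmd = build_preprocess_command(source, **kwargs)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    Path(output).write_text(result.stdout)
    return cmd
```

For example, `preprocess("index_interval.h", "index_interval.pp.h", extra_args=["-std=c++17"])` would write a preprocessed header that can then be used as the input file. It requires the chosen compiler to be on `PATH`.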

MarcelKoch commented 1 month ago

@jbms these options definitely help, and with them the documentation can be extracted. However, there is still one odd thing happening. Although the macro is expanded, its value (inline in this case) does not appear in the rendered documentation. I would guess that this again has something to do with the parsing, which doesn't account for function attributes.

jbms commented 1 month ago

I intentionally excluded inline since I considered it an implementation detail that doesn't belong in the documentation:

https://github.com/jbms/sphinx-immaterial/blob/3c8fe16a499407a9a9b71b7dd2133c559cdccf95/sphinx_immaterial/apidoc/cpp/api_parser.py#L1183

However, I suppose we could add an option to control that behavior.
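
Such an option might look roughly like the sketch below. The function and parameter names are hypothetical and are not taken from `api_parser.py`. The sketch only illustrates making the set of hidden specifiers configurable instead of hard-coding `inline`.

```python
def filter_specifiers(specifiers, hidden=("inline",)):
    """Drop declaration specifiers that should not appear in rendered docs.

    `hidden` defaults to hiding `inline`, matching the current behavior;
    passing an empty tuple keeps every specifier.
    """
    return [spec for spec in specifiers if spec not in hidden]


# Default: `inline` is treated as an implementation detail and removed.
default_view = filter_specifiers(["inline", "constexpr"])

# Opt-in: keep everything, including `inline`.
full_view = filter_specifiers(["inline", "constexpr"], hidden=())
```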