apache / arrow-nanoarrow

Helpers for Arrow C Data & Arrow C Stream interfaces
https://arrow.apache.org/nanoarrow
Apache License 2.0

When generating .pxd, warn for public but unexported functions #563

Open bkietz opened 1 month ago

bkietz commented 1 month ago

It's possible to add a function to inline_array.h, or to another header included by nanoarrow.h, that is nevertheless not included in the generated .pxd. The only symptom is then a "no matching function" error when compiling the .pxd, which is not informative. It'd be better if the .pxd generator could recognize function definitions or declarations that aren't translatable into a cdef, then warn if those aren't on an exclusion list.
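A rough sketch of what such a check could look like, assuming the generator has the concatenated header text and the generated .pxd text available as strings. The regexes, `find_unexported`, and the `ArrowExample*` names are made up for illustration; none of this is taken from bootstrap.py.

```python
import re
import warnings

# Hypothetical, deliberately loose pattern: a return type followed by an
# Arrow-prefixed name and "(" at the start of a line. It matches declarations
# and definitions alike; it is not the regex used in bootstrap.py.
CANDIDATE = re.compile(
    r"^(?:static\s+inline\s+)?(?:const\s+)?(?:struct\s+)?\w+[\s\*]+(Arrow\w+)\s*\(",
    re.MULTILINE,
)

# Any Arrow-prefixed name directly followed by "(" in the generated .pxd.
PXD_NAME = re.compile(r"\b(Arrow\w+)\s*\(")


def find_unexported(header_text, pxd_text, excluded=frozenset()):
    """Warn for functions found in the header but absent from the .pxd.

    Names listed in `excluded` are skipped. Returns the sorted list of
    names that triggered a warning.
    """
    in_header = set(CANDIDATE.findall(header_text))
    in_pxd = set(PXD_NAME.findall(pxd_text))
    missing = sorted(in_header - in_pxd - set(excluded))
    for name in missing:
        warnings.warn(
            f"{name} appears in a public header but not in the generated .pxd"
        )
    return missing


# Made-up inputs for illustration only.
HEADER = """\
int ArrowExampleDeclared(struct ArrowArray* array);
static inline int ArrowExampleInline(struct ArrowArray* array,
                                     int64_t value) {
  return ArrowExampleDeclared(array);
}
static inline void ArrowExamplePrivate(void) {}
"""

PXD = """\
cdef extern from "nanoarrow.h" nogil:
    int ArrowExampleDeclared(ArrowArray* array)
"""
```

Calling `find_unexported(HEADER, PXD, excluded={"ArrowExamplePrivate"})` would warn about `ArrowExampleInline` only: it is defined in the header, missing from the .pxd, and not on the exclusion list.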

paleolimbot commented 1 month ago

That's a great point! The .pxd generator is really helpful but pretty wild:

https://github.com/apache/arrow-nanoarrow/blob/fcf3a809f8b8f8facfb8f29284c006429cc91d49/python/bootstrap.py#L24-L170

The functions that are included in the .pxd match a very narrow regex that only includes declarations (not definitions). I forget exactly why this happened, but I think allowing it to end with something other than ); caused some problems (it's tricky because function definitions can span multiple lines and contain nested parentheses).
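To illustrate the limitation, here is a simplified stand-in for that kind of regex (not the actual pattern from bootstrap.py; the `Example*` functions are invented). It only accepts text that ends in `);` and whose argument list contains no `)`, so inline definitions and signatures with nested parentheses are silently skipped:

```python
import re

# Simplified, hypothetical declaration-only pattern: return type, name,
# an argument list with no ")" inside it, then ");" at the end of the line.
DECLARATION = re.compile(
    r"^(?P<ret>\w[\w \*]*?)\s+(?P<name>\w+)\((?P<args>[^)]*)\);\s*$",
    re.MULTILINE,
)


def declared_names(header_text):
    """Return the function names that the narrow pattern recognizes."""
    return [m.group("name") for m in DECLARATION.finditer(header_text)]


# Made-up header text for illustration only.
SAMPLE = """\
int ExampleDeclared(int x);
static inline int ExampleDefined(int x) {
  return ExampleDeclared(x);
}
int ExampleCallback(int (*fn)(int));
"""
```

With this sample, `declared_names(SAMPLE)` picks up `ExampleDeclared` but neither `ExampleDefined` (a definition ending in `{`) nor `ExampleCallback` (nested parentheses in the argument list), which is the kind of silent omission described above.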

When the .pxd generator was written we had more constraints than we do now: we used to allow downloading a pre-concatenated nanoarrow.h/c to support installing from GitHub when CMake wasn't installed, but now we have nightly builds and the concatenator is written in Python. We could now probably use pycparser to find declarations or definitions (pure Python is preferable because of Windows).