ctypesgen / ctypesgen

Pure-python wrapper generator for ctypes
BSD 2-Clause "Simplified" License

Reliable Clang-based parser #214

Open dmikushin opened 7 months ago

dmikushin commented 7 months ago

Dear all,

I would like to draw your attention to a Python bindings generator powered by a complete C language parser based on Clang (https://github.com/dmikushin/ctypesgen-ng). This work builds on the original version by @osandov. Basically, Clang is used through its Python API to parse the C header and generate Python bindings. The advantage of this approach is that all modern C language constructs are supported out of the box, without the risk of a hand-written parser missing some small detail.
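For readers less familiar with what such a generator emits, here is a minimal hand-written sketch of ctypes bindings of the kind either tool would produce from a C header. It is not output from ctypesgen or ctypesgen-ng; the `Point` struct is hypothetical, and loading the C library with `ctypes.CDLL(None)` assumes a POSIX system.

```python
import ctypes

# Load the symbols already linked into the current process, which includes
# the C standard library. Assumption: POSIX; Windows would need an explicit DLL.
libc = ctypes.CDLL(None)

# C prototype: size_t strlen(const char *s);
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t


# C declaration (hypothetical, for illustration only):
#     struct point { int x; int y; };
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


if __name__ == "__main__":
    print(strlen(b"ctypesgen"))
    print(Point(1, 2).x)
```

Whichever parser is used, its job is to recover the prototypes and struct layouts shown in the comments, so that the `argtypes`, `restype`, and `_fields_` declarations can be written out automatically.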

In the end, I think the two projects could be merged into one that gathers the best ideas of both. What do you think?

mara004 commented 7 months ago

Related: https://github.com/ctypesgen/ctypesgen/issues/213

mara004 commented 7 months ago

It's certainly interesting what alternatives there might be, but there is value in diversity. While I agree delegating parsing to clang may have its advantages, I also don't want to lose ctypesgen's pure-python C parser. Therefore, I'm skeptical of the idea of merging with ctypesgen, apart from doubts about technical and personal feasibility. That would probably mean an independent project.

If you are interested in a joint-effort, clang-based generator, I'd suggest that you first head over to @trolldbois' ctypeslib, which seems to me to be the more mature base.

dmikushin commented 7 months ago

Hi @mara004 , thank you for this explanation. Yes, we've definitely missed ctypeslib, and will look into it. I also prefer your project to the others, maybe "merging" was not a precise word. I rather meant that your ctypesgen should incorporate some features of the others. If you have a plan in this regard, please share what you think, and we can spare the efforts!

mara004 commented 7 months ago

> I rather meant that your ctypesgen should incorporate some features of the others. If you have a plan in this regard, please share what you think, and we can spare the efforts!

Which features precisely? Under merging, I basically understood incorporating the clang parser into ctypesgen, which is what I'm doubtful about. I don't intend to (or can't) work on this in the foreseeable future.

In general, I prefer to improve the higher-level parts of ctypesgen (see here for some ideas). My insights into the parser backend are very limited. Admittedly, I know nothing about the C language on its own, just what I've learned from practical ctypes usage.[^1]

[^1]: Disclaimer: I'm not actually a maintainer here. I can only speak for my forked codebase (see #195 for background).

drsteve commented 7 months ago

I occasionally watch this repo, and while not a developer, maintainer, or even frequent user, I do like having a Python-based parser to generate Python bindings. I don't always have the luxury of installing a new compiler stack or running in a docker container. I would be far less likely to use a bindings generator that was clang-based. I recognize that it's also desirable for some folks.

It's nice to see a broader ecosystem, so having clang-based options coordinate would strengthen that. It's also very useful to maintain ctypesgen (and thanks to @mara004 for being so active, even if not as a "maintainer") with its current philosophy of not caring which compiler or analyzer you have available.

If there are more general "lessons learned" that can feed back into either package then it'd be good, but I'm skeptical about relying on clang for this package. (I will point some folks I know to the clang-based options now I am aware of them, as well as ctypesgen)

mara004 commented 7 months ago

> I do like having a Python-based parser to generate Python bindings. I don't always have the luxury of installing a new compiler stack or running in a docker container.

However, note that ctypesgen as a whole is not strictly pure-python (although the parser is): It does require an external C pre-processor. In practice, that means gcc or clang.

There is also pcpp, which is pure-python, but it has some behavioral differences, and I haven't managed to make it work with ctypesgen yet. In particular, pcpp lacks the default defines (and include paths) provided by gcc/clang. Consequently, it tends to produce incomplete output.

mara004 commented 7 months ago

Making progress on pcpp. This actually seems to work:

```sh
gcc -dM -E - < /dev/null > default_defs.h
ctypesgen --all-headers -l pdfium -i fpdf*.h -o ../out.py --preproc-savepath ../preproc_out.h --cpp "pcpp -I . -I /usr/lib/gcc/x86_64-redhat-linux/12/include -I /usr/local/include -I /usr/include --line-directive '#' --passthru-defines default_defs.h"
```

See also https://github.com/ned14/pcpp/issues/85
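The first command above dumps the compiler's predefined macros, which are what pcpp lacks by default. As a rough sketch of how one might inspect that dump from Python, the helper names `parse_defines` and `dump_default_defines` below are hypothetical and not part of ctypesgen or pcpp. The parser keeps only object-like macros and skips function-like ones.

```python
import re
import subprocess

# Matches "#define NAME" or "#define NAME replacement". Function-like macros
# such as "#define F(x) x" deliberately do not match.
_DEFINE_RE = re.compile(r"^\s*#\s*define\s+(\w+)(?:\s+(.*))?$")


def parse_defines(text):
    """Map object-like macro names to their replacement text."""
    defines = {}
    for line in text.splitlines():
        match = _DEFINE_RE.match(line)
        if match:
            defines[match.group(1)] = (match.group(2) or "").strip()
    return defines


def dump_default_defines(cc="gcc"):
    """Run the equivalent of `gcc -dM -E - < /dev/null` and return its output.

    Assumes the compiler named by `cc` is on PATH.
    """
    result = subprocess.run(
        [cc, "-dM", "-E", "-"],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    defs = parse_defines(dump_default_defines())
    print(len(defs), "predefined object-like macros")
```

Comparing this dictionary between gcc and clang, or across platforms, is one way to see which platform macros a pure-python preprocessor would need to be given explicitly.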