ned14 / pcpp

A C99 preprocessor written in pure Python

Unfound includes passed through despite NO `--passthru-unfound-includes` #85

Open mara004 opened 11 months ago

mara004 commented 11 months ago

I'm getting unfound includes passed through to the output file although I did not specify --passthru-unfound-includes. Am I using the tool wrongly ... ?

ned14 commented 11 months ago

A repro would be nice.

mara004 commented 11 months ago

I doubt if it's specific to anything. Anyway, here's a repro:

`pcpp headers/fpdfview.h --line-directive > preproc.h` reports `headers/fpdfview.h:23 error: Include file 'stddef.h' not found`, but the output file retains the `#include <stddef.h>`.

Input: fpdfview.h.txt (from pdfium)
Output: preproc.h.txt


What I was actually experimenting with is calling pcpp through pypdfium2 ctypesgen:

# in //pypdfium2/data/bindings/, assuming a prior call to `./run emplace`
ctypesgen -i headers/*.h -o out.py -l pdfium -L ../linux_x64/ --cpp "pcpp --line-directive" --save-preprocessed-header preproc.h

but ctypesgen's parser fails at the include retained by pcpp, resulting in an empty output file.

ctypesgen expects a gcc or clang compatible C pre-processor, but I notice pcpp seems to behave somewhat differently. For one thing, it doesn't understand system includes on its own.

(I've been looking for a pure-python solution to relieve the dependence on system packages.)

mara004 commented 11 months ago

FWIW I could get rid of all unfound includes by adding `-I /usr/lib/gcc/x86_64-redhat-linux/12/include -I /usr/local/include -I /usr/include -D __x86_64__ -D __LP64__`. However, ctypesgen's parser still isn't able to work with the output, reporting:

ERROR: <input>:5103: Syntax error at 'int_least8_t'
ERROR: <input>:6241: Syntax error at '*'
ERROR: <input>:6258: Syntax error at '*'
ERROR: <input>:6261: Syntax error at 'FPDF_BOOL'
ERROR: <input>:6407: Syntax error at 'uint8_t'
ERROR: <input>:6409: Syntax error at 'size_t'
ERROR: <input>:6493: Syntax error at 'uint32_t'
ERROR: <input>:6495: Syntax error at 'float'
ERROR: <input>:6506: Syntax error at 'uint32_t'

It seems like pcpp does not actually resolve these names, whereas clang does:

// present with clang, missing with pcpp
typedef __uint8_t uint8_t;

ned14 commented 11 months ago

> `pcpp headers/fpdfview.h --line-directive > preproc.h` reports `headers/fpdfview.h:23 error: Include file 'stddef.h' not found`, but the output file retains the `#include <stddef.h>`.

pcpp also returns a failure exit code. The file output is a "best effort" in that circumstance, and will emit those header files it failed to include as-is. It's up to what calls pcpp to handle things appropriately.
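
For a caller, this means the exit status has to be checked before the output is trusted. A minimal sketch in Python, assuming the preprocessor is invoked as a subprocess; `preprocess_or_fail` is a hypothetical helper name, not part of pcpp:

```python
import subprocess


def preprocess_or_fail(cmd):
    """Run a preprocessor command and return its stdout only on success.

    Hypothetical helper: a non-zero exit status is treated as fatal, so
    "best effort" output never reaches a downstream parser.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with status {result.returncode}:\n{result.stderr}"
        )
    return result.stdout


# Possible use with the command line from this thread:
#   text = preprocess_or_fail(["pcpp", "headers/fpdfview.h", "--line-directive"])
```

Raising instead of returning partial text is a design choice for this sketch; a caller that wants the best-effort output could return both the text and the status.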

I did:

pcpp -o pcpp.h fpdfview.h -I /usr/lib/gcc/x86_64-linux-gnu/12/include/ -D __x86_64__ -D __LP64__ --line-directive=#
gcc pcpp.h
clang pcpp.h

... and got no error from either GCC or clang.

I then did:

gcc -E fpdfview.h > gcc.h

... and compared pcpp's and GCC's output, and they were identical apart from whitespace in places.

Both outputs are attached so you can see for yourself.

pcpp.h.txt gcc.h.txt

mara004 commented 11 months ago

> ... and compared pcpp's and GCC's output, and they were identical apart from whitespace in places.

Can you retry with all pdfium headers (download link) ? Unfortunately the types in question in my previous comment are only included in other headers (e.g. fpdf_edit.h), not fpdfview.h, it seems.

mara004 commented 11 months ago

> The file output is a "best effort" in that circumstance, and will emit those header files it failed to include as-is. It's up to what calls pcpp to handle things appropriately.

@ned14 Then the question is: why does --passthru-unfound-includes exist in the first place? The existence of this flag implies that unfound includes are not passed through by default, only when the caller asks for it explicitly. That's not how it behaves, which is confusing.

ned14 commented 10 months ago

Unless you specify all of the built-in macros which GCC has i.e.

echo | gcc-12 -dM -E - | sort
#define _LP64 1
#define _STDC_PREDEF_H 1
#define __ATOMIC_ACQUIRE 2
#define __ATOMIC_ACQ_REL 4
#define __ATOMIC_CONSUME 1
#define __ATOMIC_HLE_ACQUIRE 65536
#define __ATOMIC_HLE_RELEASE 131072
#define __ATOMIC_RELAXED 0
#define __ATOMIC_RELEASE 3
#define __ATOMIC_SEQ_CST 5
#define __BIGGEST_ALIGNMENT__ 16
#define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
#define __CET__ 3
#define __CHAR16_TYPE__ short unsigned int
#define __CHAR32_TYPE__ unsigned int
#define __CHAR_BIT__ 8
#define __DBL_DECIMAL_DIG__ 17
#define __DBL_DENORM_MIN__ ((double)4.94065645841246544176568792868221372e-324L)
#define __DBL_DIG__ 15
#define __DBL_EPSILON__ ((double)2.22044604925031308084726333618164062e-16L)
#define __DBL_HAS_DENORM__ 1
#define __DBL_HAS_INFINITY__ 1
#define __DBL_HAS_QUIET_NAN__ 1
#define __DBL_IS_IEC_60559__ 2
#define __DBL_MANT_DIG__ 53
#define __DBL_MAX_10_EXP__ 308
#define __DBL_MAX_EXP__ 1024
#define __DBL_MAX__ ((double)1.79769313486231570814527423731704357e+308L)
#define __DBL_MIN_10_EXP__ (-307)
#define __DBL_MIN_EXP__ (-1021)
#define __DBL_MIN__ ((double)2.22507385850720138309023271733240406e-308L)
#define __DBL_NORM_MAX__ ((double)1.79769313486231570814527423731704357e+308L)
#define __DEC128_EPSILON__ 1E-33DL
#define __DEC128_MANT_DIG__ 34
#define __DEC128_MAX_EXP__ 6145
#define __DEC128_MAX__ 9.999999999999999999999999999999999E6144DL
#define __DEC128_MIN_EXP__ (-6142)
#define __DEC128_MIN__ 1E-6143DL
#define __DEC128_SUBNORMAL_MIN__ 0.000000000000000000000000000000001E-6143DL
#define __DEC32_EPSILON__ 1E-6DF
#define __DEC32_MANT_DIG__ 7
#define __DEC32_MAX_EXP__ 97
#define __DEC32_MAX__ 9.999999E96DF
#define __DEC32_MIN_EXP__ (-94)
#define __DEC32_MIN__ 1E-95DF
#define __DEC32_SUBNORMAL_MIN__ 0.000001E-95DF
#define __DEC64_EPSILON__ 1E-15DD
#define __DEC64_MANT_DIG__ 16
#define __DEC64_MAX_EXP__ 385
#define __DEC64_MAX__ 9.999999999999999E384DD
#define __DEC64_MIN_EXP__ (-382)
#define __DEC64_MIN__ 1E-383DD
#define __DEC64_SUBNORMAL_MIN__ 0.000000000000001E-383DD
#define __DECIMAL_BID_FORMAT__ 1
#define __DECIMAL_DIG__ 21
#define __DEC_EVAL_METHOD__ 2
#define __ELF__ 1
#define __FINITE_MATH_ONLY__ 0
#define __FLOAT_WORD_ORDER__ __ORDER_LITTLE_ENDIAN__
#define __FLT128_DECIMAL_DIG__ 36
#define __FLT128_DENORM_MIN__ 6.47517511943802511092443895822764655e-4966F128
#define __FLT128_DIG__ 33
#define __FLT128_EPSILON__ 1.92592994438723585305597794258492732e-34F128
#define __FLT128_HAS_DENORM__ 1
#define __FLT128_HAS_INFINITY__ 1
#define __FLT128_HAS_QUIET_NAN__ 1
#define __FLT128_IS_IEC_60559__ 2
#define __FLT128_MANT_DIG__ 113
#define __FLT128_MAX_10_EXP__ 4932
#define __FLT128_MAX_EXP__ 16384
#define __FLT128_MAX__ 1.18973149535723176508575932662800702e+4932F128
#define __FLT128_MIN_10_EXP__ (-4931)
#define __FLT128_MIN_EXP__ (-16381)
#define __FLT128_MIN__ 3.36210314311209350626267781732175260e-4932F128
#define __FLT128_NORM_MAX__ 1.18973149535723176508575932662800702e+4932F128
#define __FLT16_DECIMAL_DIG__ 5
#define __FLT16_DENORM_MIN__ 5.96046447753906250000000000000000000e-8F16
#define __FLT16_DIG__ 3
#define __FLT16_EPSILON__ 9.76562500000000000000000000000000000e-4F16
#define __FLT16_HAS_DENORM__ 1
#define __FLT16_HAS_INFINITY__ 1
#define __FLT16_HAS_QUIET_NAN__ 1
#define __FLT16_IS_IEC_60559__ 2
#define __FLT16_MANT_DIG__ 11
#define __FLT16_MAX_10_EXP__ 4
#define __FLT16_MAX_EXP__ 16
#define __FLT16_MAX__ 6.55040000000000000000000000000000000e+4F16
#define __FLT16_MIN_10_EXP__ (-4)
#define __FLT16_MIN_EXP__ (-13)
#define __FLT16_MIN__ 6.10351562500000000000000000000000000e-5F16
#define __FLT16_NORM_MAX__ 6.55040000000000000000000000000000000e+4F16
#define __FLT32X_DECIMAL_DIG__ 17
#define __FLT32X_DENORM_MIN__ 4.94065645841246544176568792868221372e-324F32x
#define __FLT32X_DIG__ 15
#define __FLT32X_EPSILON__ 2.22044604925031308084726333618164062e-16F32x
#define __FLT32X_HAS_DENORM__ 1
#define __FLT32X_HAS_INFINITY__ 1
#define __FLT32X_HAS_QUIET_NAN__ 1
#define __FLT32X_IS_IEC_60559__ 2
#define __FLT32X_MANT_DIG__ 53
#define __FLT32X_MAX_10_EXP__ 308
#define __FLT32X_MAX_EXP__ 1024
#define __FLT32X_MAX__ 1.79769313486231570814527423731704357e+308F32x
#define __FLT32X_MIN_10_EXP__ (-307)
#define __FLT32X_MIN_EXP__ (-1021)
#define __FLT32X_MIN__ 2.22507385850720138309023271733240406e-308F32x
#define __FLT32X_NORM_MAX__ 1.79769313486231570814527423731704357e+308F32x
#define __FLT32_DECIMAL_DIG__ 9
#define __FLT32_DENORM_MIN__ 1.40129846432481707092372958328991613e-45F32
#define __FLT32_DIG__ 6
#define __FLT32_EPSILON__ 1.19209289550781250000000000000000000e-7F32
#define __FLT32_HAS_DENORM__ 1
#define __FLT32_HAS_INFINITY__ 1
#define __FLT32_HAS_QUIET_NAN__ 1
#define __FLT32_IS_IEC_60559__ 2
#define __FLT32_MANT_DIG__ 24
#define __FLT32_MAX_10_EXP__ 38
#define __FLT32_MAX_EXP__ 128
#define __FLT32_MAX__ 3.40282346638528859811704183484516925e+38F32
#define __FLT32_MIN_10_EXP__ (-37)
#define __FLT32_MIN_EXP__ (-125)
#define __FLT32_MIN__ 1.17549435082228750796873653722224568e-38F32
#define __FLT32_NORM_MAX__ 3.40282346638528859811704183484516925e+38F32
#define __FLT64X_DECIMAL_DIG__ 21
#define __FLT64X_DENORM_MIN__ 3.64519953188247460252840593361941982e-4951F64x
#define __FLT64X_DIG__ 18
#define __FLT64X_EPSILON__ 1.08420217248550443400745280086994171e-19F64x
#define __FLT64X_HAS_DENORM__ 1
#define __FLT64X_HAS_INFINITY__ 1
#define __FLT64X_HAS_QUIET_NAN__ 1
#define __FLT64X_IS_IEC_60559__ 2
#define __FLT64X_MANT_DIG__ 64
#define __FLT64X_MAX_10_EXP__ 4932
#define __FLT64X_MAX_EXP__ 16384
#define __FLT64X_MAX__ 1.18973149535723176502126385303097021e+4932F64x
#define __FLT64X_MIN_10_EXP__ (-4931)
#define __FLT64X_MIN_EXP__ (-16381)
#define __FLT64X_MIN__ 3.36210314311209350626267781732175260e-4932F64x
#define __FLT64X_NORM_MAX__ 1.18973149535723176502126385303097021e+4932F64x
#define __FLT64_DECIMAL_DIG__ 17
#define __FLT64_DENORM_MIN__ 4.94065645841246544176568792868221372e-324F64
#define __FLT64_DIG__ 15
#define __FLT64_EPSILON__ 2.22044604925031308084726333618164062e-16F64
#define __FLT64_HAS_DENORM__ 1
#define __FLT64_HAS_INFINITY__ 1
#define __FLT64_HAS_QUIET_NAN__ 1
#define __FLT64_IS_IEC_60559__ 2
#define __FLT64_MANT_DIG__ 53
#define __FLT64_MAX_10_EXP__ 308
#define __FLT64_MAX_EXP__ 1024
#define __FLT64_MAX__ 1.79769313486231570814527423731704357e+308F64
#define __FLT64_MIN_10_EXP__ (-307)
#define __FLT64_MIN_EXP__ (-1021)
#define __FLT64_MIN__ 2.22507385850720138309023271733240406e-308F64
#define __FLT64_NORM_MAX__ 1.79769313486231570814527423731704357e+308F64
#define __FLT_DECIMAL_DIG__ 9
#define __FLT_DENORM_MIN__ 1.40129846432481707092372958328991613e-45F
#define __FLT_DIG__ 6
#define __FLT_EPSILON__ 1.19209289550781250000000000000000000e-7F
#define __FLT_EVAL_METHOD_TS_18661_3__ 0
#define __FLT_EVAL_METHOD__ 0
#define __FLT_HAS_DENORM__ 1
#define __FLT_HAS_INFINITY__ 1
#define __FLT_HAS_QUIET_NAN__ 1
#define __FLT_IS_IEC_60559__ 2
#define __FLT_MANT_DIG__ 24
#define __FLT_MAX_10_EXP__ 38
#define __FLT_MAX_EXP__ 128
#define __FLT_MAX__ 3.40282346638528859811704183484516925e+38F
#define __FLT_MIN_10_EXP__ (-37)
#define __FLT_MIN_EXP__ (-125)
#define __FLT_MIN__ 1.17549435082228750796873653722224568e-38F
#define __FLT_NORM_MAX__ 3.40282346638528859811704183484516925e+38F
#define __FLT_RADIX__ 2
#define __FXSR__ 1
#define __GCC_ASM_FLAG_OUTPUTS__ 1
#define __GCC_ATOMIC_BOOL_LOCK_FREE 2
#define __GCC_ATOMIC_CHAR16_T_LOCK_FREE 2
#define __GCC_ATOMIC_CHAR32_T_LOCK_FREE 2
#define __GCC_ATOMIC_CHAR_LOCK_FREE 2
#define __GCC_ATOMIC_INT_LOCK_FREE 2
#define __GCC_ATOMIC_LLONG_LOCK_FREE 2
#define __GCC_ATOMIC_LONG_LOCK_FREE 2
#define __GCC_ATOMIC_POINTER_LOCK_FREE 2
#define __GCC_ATOMIC_SHORT_LOCK_FREE 2
#define __GCC_ATOMIC_TEST_AND_SET_TRUEVAL 1
#define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
#define __GCC_CONSTRUCTIVE_SIZE 64
#define __GCC_DESTRUCTIVE_SIZE 64
#define __GCC_HAVE_DWARF2_CFI_ASM 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1
#define __GCC_IEC_559 2
#define __GCC_IEC_559_COMPLEX 2
#define __GNUC_EXECUTION_CHARSET_NAME "UTF-8"
#define __GNUC_MINOR__ 3
#define __GNUC_PATCHLEVEL__ 0
#define __GNUC_STDC_INLINE__ 1
#define __GNUC_WIDE_EXECUTION_CHARSET_NAME "UTF-32LE"
#define __GNUC__ 12
#define __GXX_ABI_VERSION 1017
#define __HAVE_SPECULATION_SAFE_VALUE 1
#define __INT16_C(c) c
#define __INT16_MAX__ 0x7fff
#define __INT16_TYPE__ short int
#define __INT32_C(c) c
#define __INT32_MAX__ 0x7fffffff
#define __INT32_TYPE__ int
#define __INT64_C(c) c ## L
#define __INT64_MAX__ 0x7fffffffffffffffL
#define __INT64_TYPE__ long int
#define __INT8_C(c) c
#define __INT8_MAX__ 0x7f
#define __INT8_TYPE__ signed char
#define __INTMAX_C(c) c ## L
#define __INTMAX_MAX__ 0x7fffffffffffffffL
#define __INTMAX_TYPE__ long int
#define __INTMAX_WIDTH__ 64
#define __INTPTR_MAX__ 0x7fffffffffffffffL
#define __INTPTR_TYPE__ long int
#define __INTPTR_WIDTH__ 64
#define __INT_FAST16_MAX__ 0x7fffffffffffffffL
#define __INT_FAST16_TYPE__ long int
#define __INT_FAST16_WIDTH__ 64
#define __INT_FAST32_MAX__ 0x7fffffffffffffffL
#define __INT_FAST32_TYPE__ long int
#define __INT_FAST32_WIDTH__ 64
#define __INT_FAST64_MAX__ 0x7fffffffffffffffL
#define __INT_FAST64_TYPE__ long int
#define __INT_FAST64_WIDTH__ 64
#define __INT_FAST8_MAX__ 0x7f
#define __INT_FAST8_TYPE__ signed char
#define __INT_FAST8_WIDTH__ 8
#define __INT_LEAST16_MAX__ 0x7fff
#define __INT_LEAST16_TYPE__ short int
#define __INT_LEAST16_WIDTH__ 16
#define __INT_LEAST32_MAX__ 0x7fffffff
#define __INT_LEAST32_TYPE__ int
#define __INT_LEAST32_WIDTH__ 32
#define __INT_LEAST64_MAX__ 0x7fffffffffffffffL
#define __INT_LEAST64_TYPE__ long int
#define __INT_LEAST64_WIDTH__ 64
#define __INT_LEAST8_MAX__ 0x7f
#define __INT_LEAST8_TYPE__ signed char
#define __INT_LEAST8_WIDTH__ 8
#define __INT_MAX__ 0x7fffffff
#define __INT_WIDTH__ 32
#define __LDBL_DECIMAL_DIG__ 21
#define __LDBL_DENORM_MIN__ 3.64519953188247460252840593361941982e-4951L
#define __LDBL_DIG__ 18
#define __LDBL_EPSILON__ 1.08420217248550443400745280086994171e-19L
#define __LDBL_HAS_DENORM__ 1
#define __LDBL_HAS_INFINITY__ 1
#define __LDBL_HAS_QUIET_NAN__ 1
#define __LDBL_IS_IEC_60559__ 2
#define __LDBL_MANT_DIG__ 64
#define __LDBL_MAX_10_EXP__ 4932
#define __LDBL_MAX_EXP__ 16384
#define __LDBL_MAX__ 1.18973149535723176502126385303097021e+4932L
#define __LDBL_MIN_10_EXP__ (-4931)
#define __LDBL_MIN_EXP__ (-16381)
#define __LDBL_MIN__ 3.36210314311209350626267781732175260e-4932L
#define __LDBL_NORM_MAX__ 1.18973149535723176502126385303097021e+4932L
#define __LONG_LONG_MAX__ 0x7fffffffffffffffLL
#define __LONG_LONG_WIDTH__ 64
#define __LONG_MAX__ 0x7fffffffffffffffL
#define __LONG_WIDTH__ 64
#define __LP64__ 1
#define __MMX_WITH_SSE__ 1
#define __MMX__ 1
#define __NO_INLINE__ 1
#define __ORDER_BIG_ENDIAN__ 4321
#define __ORDER_LITTLE_ENDIAN__ 1234
#define __ORDER_PDP_ENDIAN__ 3412
#define __PIC__ 2
#define __PIE__ 2
#define __PRAGMA_REDEFINE_EXTNAME 1
#define __PTRDIFF_MAX__ 0x7fffffffffffffffL
#define __PTRDIFF_TYPE__ long int
#define __PTRDIFF_WIDTH__ 64
#define __REGISTER_PREFIX__ 
#define __SCHAR_MAX__ 0x7f
#define __SCHAR_WIDTH__ 8
#define __SEG_FS 1
#define __SEG_GS 1
#define __SHRT_MAX__ 0x7fff
#define __SHRT_WIDTH__ 16
#define __SIG_ATOMIC_MAX__ 0x7fffffff
#define __SIG_ATOMIC_MIN__ (-__SIG_ATOMIC_MAX__ - 1)
#define __SIG_ATOMIC_TYPE__ int
#define __SIG_ATOMIC_WIDTH__ 32
#define __SIZEOF_DOUBLE__ 8
#define __SIZEOF_FLOAT128__ 16
#define __SIZEOF_FLOAT80__ 16
#define __SIZEOF_FLOAT__ 4
#define __SIZEOF_INT128__ 16
#define __SIZEOF_INT__ 4
#define __SIZEOF_LONG_DOUBLE__ 16
#define __SIZEOF_LONG_LONG__ 8
#define __SIZEOF_LONG__ 8
#define __SIZEOF_POINTER__ 8
#define __SIZEOF_PTRDIFF_T__ 8
#define __SIZEOF_SHORT__ 2
#define __SIZEOF_SIZE_T__ 8
#define __SIZEOF_WCHAR_T__ 4
#define __SIZEOF_WINT_T__ 4
#define __SIZE_MAX__ 0xffffffffffffffffUL
#define __SIZE_TYPE__ long unsigned int
#define __SIZE_WIDTH__ 64
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSP_STRONG__ 3
#define __STDC_HOSTED__ 1
#define __STDC_IEC_559_COMPLEX__ 1
#define __STDC_IEC_559__ 1
#define __STDC_IEC_60559_BFP__ 201404L
#define __STDC_IEC_60559_COMPLEX__ 201404L
#define __STDC_ISO_10646__ 201706L
#define __STDC_UTF_16__ 1
#define __STDC_UTF_32__ 1
#define __STDC_VERSION__ 201710L
#define __STDC__ 1
#define __UINT16_C(c) c
#define __UINT16_MAX__ 0xffff
#define __UINT16_TYPE__ short unsigned int
#define __UINT32_C(c) c ## U
#define __UINT32_MAX__ 0xffffffffU
#define __UINT32_TYPE__ unsigned int
#define __UINT64_C(c) c ## UL
#define __UINT64_MAX__ 0xffffffffffffffffUL
#define __UINT64_TYPE__ long unsigned int
#define __UINT8_C(c) c
#define __UINT8_MAX__ 0xff
#define __UINT8_TYPE__ unsigned char
#define __UINTMAX_C(c) c ## UL
#define __UINTMAX_MAX__ 0xffffffffffffffffUL
#define __UINTMAX_TYPE__ long unsigned int
#define __UINTPTR_MAX__ 0xffffffffffffffffUL
#define __UINTPTR_TYPE__ long unsigned int
#define __UINT_FAST16_MAX__ 0xffffffffffffffffUL
#define __UINT_FAST16_TYPE__ long unsigned int
#define __UINT_FAST32_MAX__ 0xffffffffffffffffUL
#define __UINT_FAST32_TYPE__ long unsigned int
#define __UINT_FAST64_MAX__ 0xffffffffffffffffUL
#define __UINT_FAST64_TYPE__ long unsigned int
#define __UINT_FAST8_MAX__ 0xff
#define __UINT_FAST8_TYPE__ unsigned char
#define __UINT_LEAST16_MAX__ 0xffff
#define __UINT_LEAST16_TYPE__ short unsigned int
#define __UINT_LEAST32_MAX__ 0xffffffffU
#define __UINT_LEAST32_TYPE__ unsigned int
#define __UINT_LEAST64_MAX__ 0xffffffffffffffffUL
#define __UINT_LEAST64_TYPE__ long unsigned int
#define __UINT_LEAST8_MAX__ 0xff
#define __UINT_LEAST8_TYPE__ unsigned char
#define __USER_LABEL_PREFIX__ 
#define __VERSION__ "12.3.0"
#define __WCHAR_MAX__ 0x7fffffff
#define __WCHAR_MIN__ (-__WCHAR_MAX__ - 1)
#define __WCHAR_TYPE__ int
#define __WCHAR_WIDTH__ 32
#define __WINT_MAX__ 0xffffffffU
#define __WINT_MIN__ 0U
#define __WINT_TYPE__ unsigned int
#define __WINT_WIDTH__ 32
#define __amd64 1
#define __amd64__ 1
#define __code_model_small__ 1
#define __gnu_linux__ 1
#define __k8 1
#define __k8__ 1
#define __linux 1
#define __linux__ 1
#define __pic__ 2
#define __pie__ 2
#define __unix 1
#define __unix__ 1
#define __x86_64 1
#define __x86_64__ 1
#define linux 1
#define unix 1

then pcpp will generate output like:

typedef __INT_LEAST8_TYPE__ int_least8_t;
typedef __INT_LEAST16_TYPE__ int_least16_t;
typedef __INT_LEAST32_TYPE__ int_least32_t;
typedef __INT_LEAST64_TYPE__ int_least64_t;
typedef __UINT_LEAST8_TYPE__ uint_least8_t;

because it hasn't been told to do otherwise.
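
To illustrate the point, here is a toy model in Python of object-like macro substitution. It is not pcpp's implementation, and `expand_object_macros` is a hypothetical name; it only shows that an identifier with no definition is copied through unchanged:

```python
import re

IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")


def expand_object_macros(line, macros):
    """Replace each identifier that names a known macro with its body.

    Toy model only: single pass, object-like macros only, no rescanning.
    """
    return IDENTIFIER.sub(lambda m: macros.get(m.group(0), m.group(0)), line)


line = "typedef __INT_LEAST8_TYPE__ int_least8_t;"

# Without the built-in macro, the identifier is left as it is.
print(expand_object_macros(line, {}))
# typedef __INT_LEAST8_TYPE__ int_least8_t;

# With the definition from the GCC dump above, it is replaced.
print(expand_object_macros(line, {"__INT_LEAST8_TYPE__": "signed char"}))
# typedef signed char int_least8_t;
```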

Generally it's much easier to have pcpp output inclusion of <stdint.h> etc instead of having it mux in the contents of <stdint.h>.

mara004 commented 10 months ago

Ah, so if we passed all the gcc/clang default defines to pcpp, it would include e.g. the uint8_t definition in the output?

> Generally it's much easier to have pcpp output inclusion of <stdint.h> etc instead of having it mux in the contents of <stdint.h>.

ctypesgen needs a fully parsed output that aligns with "real" C pre-processors. AFAIK, the partial form just isn't eligible for our use case.

ned14 commented 10 months ago

> Ah, so if we passed all the gcc/clang default defines to pcpp, it would include e.g. the uint8_t definition in the output?

Correct.

BUT gcc/clang's default defines vary according to configuration, architecture, platform and version. I don't know how you'd avoid invoking gcc or clang to go get its default defines, and if you have to bother doing that, you might as well have it do the preprocessing too.
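
If invoking the compiler once is acceptable, the dump can be captured and inspected programmatically. A rough sketch, assuming `gcc` is on the PATH; `parse_defines` and `gcc_default_defines` are hypothetical helper names:

```python
import subprocess


def parse_defines(dump):
    """Map each macro name (including its parameter list, if any) to its body.

    Assumes the `-dM` layout shown above: one `#define` per line, with no
    spaces inside a parameter list.
    """
    macros = {}
    for line in dump.splitlines():
        if not line.startswith("#define "):
            continue
        name, _, body = line[len("#define "):].partition(" ")
        macros[name] = body.strip()
    return macros


def gcc_default_defines(compiler="gcc"):
    """Ask the compiler for its predefined macros (like `gcc -dM -E - < /dev/null`)."""
    result = subprocess.run(
        [compiler, "-dM", "-E", "-"],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_defines(result.stdout)
```

The result is only valid for the compiler, target and version it was captured on, which is the limitation described above.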

mara004 commented 8 months ago

I made some progress, but still didn't manage to get ctypesgen to play with pcpp. Here's what I did:

gcc -dM -E - < /dev/null > ../default_defs.h
ctypesgen -i fpdf*.h -l pdfium -L . -o ../bindings.py --preproc-savepath ../preproc_out.h --cpp "pcpp -I . -I /usr/lib/gcc/x86_64-redhat-linux/12/include -I /usr/local/include -I /usr/include --line-directive '#' ../default_defs.h"

It fails with

ERROR: /tmp/tmpl9bgbl9b.h:0: Scanning error. Illegal character '#'; # include_next <stdint.h>

because pcpp retained the # include_next. ~~Any retained include is illegal input for ctypesgen. It would be better for our purposes if an include that failed to expand would just be excluded from the output.~~

Note that pcpp did not return an error code or log the failure in this case.
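
Since the exit status gave no hint here, one way to notice the problem early is to scan the preprocessed text for surviving include directives before handing it to the parser. A small sketch; `leftover_includes` is a hypothetical helper and the pattern is an assumption about what such lines look like:

```python
import re

# Matches `#include`, `# include_next` and similar at the start of a line.
LEFTOVER_INCLUDE = re.compile(
    r"^[ \t]*#[ \t]*include(?:_next)?\b.*$", re.MULTILINE
)


def leftover_includes(preprocessed):
    """Return the include directives that survived preprocessing, stripped."""
    return [m.group(0).strip() for m in LEFTOVER_INCLUDE.finditer(preprocessed)]
```

Line markers such as `# 23 "file.h"` are not matched, so this can be used together with `--line-directive '#'`.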

mara004 commented 8 months ago

Ah, wait, actually the above already partially works. 🎉 ctypesgen does produce a list of members, they just aren't included, so we need to add --all-headers. Probably that is because pcpp did not resolve the file paths, so ctypesgen's matching rules failed.

Updated command:

ctypesgen --all-headers -l pdfium -L . -i fpdf*.h -o ../bindings.py --preproc-savepath ../preproc_out.h --cpp "pcpp -I . -I /usr/lib/gcc/x86_64-redhat-linux/12/include -I /usr/local/include -I /usr/include --line-directive '#' --passthru-defines ../default_defs.h"

ned14 commented 8 months ago

Cool

mara004 commented 8 months ago

However, the resulting bindings are still incomplete. When plugging into pypdfium2, the test suite shows the following error:

src/pypdfium2/internal/consts.py:66: in <module>
    pdfium_c.FPDF_COLORSPACE_UNKNOWN:    "?",
E   AttributeError: module 'pypdfium2.raw' has no attribute 'FPDF_COLORSPACE_UNKNOWN'

It seems like some macro constant defines are missing.
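
To find out which constants are affected, the header text can be compared against the generated module. A rough sketch; `missing_constants` is a hypothetical helper, and the regex is a simplification that only picks up object-like `#define`s with a value:

```python
import re

# `#define NAME value`; skips function-like macros and empty defines
# such as include guards.
OBJECT_DEFINE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+([A-Za-z_]\w*)[ \t]+\S", re.MULTILINE
)


def missing_constants(header_text, bindings):
    """List object-like macros with a value that `bindings` lacks as attributes."""
    names = OBJECT_DEFINE.findall(header_text)
    return [name for name in names if not hasattr(bindings, name)]
```

Here `bindings` would be the imported output module, and `header_text` the contents of one of the input headers.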