pypdfium2-team / ctypesgen

Wrapper generator for Python ctypes
BSD 2-Clause "Simplified" License

Fork overview, and thoughts to improve basis for chance of upstreaming #1

Open mara004 opened 7 months ago

mara004 commented 7 months ago

Below is an overview of this fork. Note that this writeup is a non-exhaustive work in progress.

This information may be valuable for working towards a basis that could eventually be merged back upstream. That seems fairly hypothetical in the near term, though, given time constraints and mismatched design intents (e.g. regarding backwards compatibility).

However, this fork of ctypesgen may be a good starting point for any active future development, with a significantly overhauled code base that should be nicer to work with.

Selection of improvements from this fork

small, self-contained fixes have usually been submitted upstream and may have been merged

[^no_symbol_guards]: Note: this is meant for use with inherently ABI-correct packaging only.

Points to consider

Other notes

Done tasks

mara004 commented 7 months ago

Dumping my draft of a lean replacement for the string class below, since it is hard to see in the PR diff. This may be a slightly updated version.

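```python
import ctypes


class _wraps_c_char_p:
    def __init__(self, raw, value):
        self.raw = raw      # the String (c_char_p) instance as returned by ctypes
        self.value = value  # the bytes read from that instance

    # provided for clarity, not actually necessary due to __getattr__ wrapper below
    def decode(self, encoding="utf-8", errors="strict"):
        return self.value.decode(encoding, errors=errors)

    def __str__(self):
        return self.decode()

    # only reached when normal lookup fails: forward to the underlying bytes object
    def __getattr__(self, attr):
        return getattr(self.value, attr)


class String(ctypes.c_char_p):
    # invoked by ctypes on the function result when String is set as restype
    @classmethod
    def _check_retval_(cls, result):
        value = result.value
        # a NULL pointer is passed through as None
        return value if value is None else _wraps_c_char_p(result, value)

    # invoked by ctypes on each argument when String is listed in argtypes
    @classmethod
    def from_param(cls, obj):
        if isinstance(obj, str):
            obj = obj.encode("utf-8")
        return super().from_param(obj)
```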
mara004 commented 7 months ago

Another improvement for autostrings that comes to mind would be making the encoding configurable.

For example, pdfium mostly uses UTF-16LE, so autostrings with a configurable encoding might actually be convenient for pypdfium2. Formally, though, a default encoding remains a problem: it would still be a concern with any API that uses other encodings, such as UTF-8 or ASCII.
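
One way the configurable-encoding idea could look is sketched below, assuming a hypothetical factory `make_string_type` that binds a `c_char_p` subclass to one encoding. Neither the factory nor the class names exist in ctypesgen. For brevity the sketch decodes return values eagerly instead of using the wrapper class from the draft. Note also that `c_char_p` treats data as NUL-terminated bytes, so this shape only suits encodings without embedded zero bytes (UTF-8, ASCII, Latin-1); UTF-16LE would need different pointer handling and is not covered here.

```python
import ctypes


def make_string_type(encoding="utf-8", errors="strict"):
    """Hypothetical factory: return a c_char_p subclass bound to one encoding."""

    class EncodedString(ctypes.c_char_p):
        @classmethod
        def _check_retval_(cls, result):
            # Simplification: decode eagerly; a NULL pointer stays None.
            value = result.value
            return value if value is None else value.decode(encoding, errors)

        @classmethod
        def from_param(cls, obj):
            # Encode str arguments with the configured encoding.
            if isinstance(obj, str):
                obj = obj.encode(encoding, errors)
            return super().from_param(obj)

    return EncodedString


Utf8String = make_string_type("utf-8")
Latin1String = make_string_type("latin-1")

# Demo via CPython's own C API (an assumption made only for this example):
# PyBytes_FromString() copies the NUL-terminated bytes it is handed.
bytes_from_string = ctypes.pythonapi.PyBytes_FromString
bytes_from_string.restype = ctypes.py_object

bytes_from_string.argtypes = [Utf8String]
as_utf8 = bytes_from_string("\u00e9")     # b"\xc3\xa9"

bytes_from_string.argtypes = [Latin1String]
as_latin1 = bytes_from_string("\u00e9")   # b"\xe9"
```

With a factory like this, the encoding becomes a per-type choice rather than a global default, which is one possible answer to the concern about APIs that mix encodings.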