capi-workgroup / api-evolution

Issues related to evolution of the C API

Soft-deprecate `PyUnicode_AsUTF8` #39

Open encukou opened 11 months ago

encukou commented 11 months ago

(I'll include the problem here, as the "problems" repo seems "done" now that PEP-733 is up).

Traditional C APIs take zero-terminated strings, which means that Python strings with embedded NUL bytes appear truncated. There are many ways to get such a `char*`: converted directly using `PyUnicode_AsUTF8`, encoded and accessed via `PyBytes_AsString`, or accessed with something like `PyUnicode_AsUTF8AndSize` while ignoring the size.
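To make the truncation concrete, here is a minimal sketch in plain C that does not use `Python.h`. It assumes a buffer laid out the way `PyUnicode_AsUTF8AndSize` hands one back: `size` bytes of data followed by a terminating NUL. The helper names `visible_length` and `has_embedded_nul` are hypothetical and only for illustration; they are not part of the C API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: what a zero-terminated consumer sees,
   i.e. the number of bytes before the first NUL. */
static size_t
visible_length(const char *buf)
{
    assert(buf != NULL);
    return strlen(buf);
}

/* Hypothetical helper: returns 1 if the `size`-byte buffer (which must be
   followed by a terminating NUL) contains a NUL before its end, else 0. */
static int
has_embedded_nul(const char *buf, size_t size)
{
    assert(buf != NULL);
    /* strlen() stops at the first NUL, so it differs from `size`
       exactly when a NUL is embedded in the data. */
    return strlen(buf) != size;
}
```

For the 7-byte data `"abc\0def"`, a consumer that ignores the size sees only the first 3 bytes, while comparing `strlen()` against the known size detects the embedded NUL. That comparison is the kind of check a caller has to do by hand when the conversion function does not raise an error itself.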

Many APIs that convert to `char*` raise an error on embedded NUL bytes. On that:


We could:

Notes on some of the issues @vstinner collected in https://github.com/python/cpython/issues/111656#issuecomment-1791730300:

encukou commented 11 months ago

Other approaches that were considered: