Open markshannon opened 1 year ago
We also need a few functions for querying and extracting the value of a Python int.
We want to query its sign:
int PyInt_IsNegative();
int PyInt_IsPositive();
int PyInt_IsZero();
int PyInt_Sign();
We want to import and export the digits of an integer, and to know how many digits there are.
GNU's MP library has mpz_import and mpz_export, which have quite a complex API, but might be a good model to use.
In addition we should provide a constant describing the "native" number of bits per digit, so that C extensions can extract the data efficiently.
mpz_import and mpz_export take seven parameters each, and four of those are small numbers describing the layout. Having many int parameters is hard to read and error-prone. We should combine the layout parameters into a single struct (of 32 bits or less).
E.g.
typedef struct _PyIntExportLayout {
    uint8_t bits_per_digit;
    int8_t word_endian;
    int8_t array_endian;
    uint8_t digit_size;
} PyIntExportLayout;
PyLongObject *PyInt_Import(PyIntExportLayout layout, size_t count, const void *data);
int PyInt_Import(PyLongObject *op, PyIntExportLayout layout, size_t count, void *data);
size_t PyInt_DigitCount(PyLongObject *op, uint8_t bits_per_digit);
const PyIntExportLayout PY_INT_NATIVE_LAYOUT; /* Use this when possible, for speed */
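To make the layout idea concrete, here is a minimal sketch in plain C (no Python.h). It assumes the struct fields mean roughly what GMP's order, size, endian and nails parameters mean; the thread does not define them. The helper sketch_import_u64 is hypothetical, only handles values that fit in 64 bits, and only accepts 4-byte native-endian digits stored least significant first. It is an illustration, not a proposed implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout descriptor from the proposal above. Field meanings are assumed. */
typedef struct _PyIntExportLayout {
    uint8_t bits_per_digit;  /* value bits used in each digit */
    int8_t  word_endian;     /* byte order in a digit; 0 = native (assumed) */
    int8_t  array_endian;    /* -1 = least significant digit first (assumed) */
    uint8_t digit_size;      /* size of each digit in bytes */
} PyIntExportLayout;

/* The proposal asks for the whole descriptor to fit in 32 bits. */
static_assert(sizeof(PyIntExportLayout) <= 4, "layout must fit in 32 bits");

/* Hypothetical helper: combine `count` digits into a uint64_t.
 * Returns 0 on success, -1 if the layout is unsupported by this sketch
 * or the value does not fit in 64 bits. */
static int
sketch_import_u64(PyIntExportLayout layout, size_t count,
                  const uint32_t *digits, uint64_t *result)
{
    if (layout.digit_size != sizeof(uint32_t) || layout.word_endian != 0 ||
        layout.array_endian != -1 ||
        layout.bits_per_digit == 0 || layout.bits_per_digit > 32) {
        return -1;
    }
    uint64_t mask = (UINT64_C(1) << layout.bits_per_digit) - 1;
    uint64_t value = 0;
    /* Start at the most significant digit so each step is shift-and-or. */
    for (size_t i = count; i-- > 0;) {
        if ((value >> (64 - layout.bits_per_digit)) != 0) {
            return -1;  /* the next shift would drop high bits */
        }
        value = (value << layout.bits_per_digit) | (digits[i] & mask);
    }
    *result = value;
    return 0;
}
```

With a layout of {30, 0, -1, 4} (30 bits per digit is what most 64-bit CPython builds use internally), the digits {1, 2} combine to (2 << 30) | 1. Passing the layout by value keeps the call site to four arguments instead of the long run of integers that mpz_import needs.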
Hi. I'm the primary maintainer of gmpy2. I'd like to provide some comments with my experiences using the C-API.
I use PyLong_AsLongAndOverflow when I want a long value or immediately proceed with the full conversion of PyLong to mpz as quickly as possible. Avoiding the exception is a significant performance improvement. PyLong_AsUnsignedLongAndOverflow is used occasionally when GMP expects an unsigned long.
PyLong_As[Unsigned]LongLongAndOverflow were used with MPIR to get 64-bit values on Windows. (MPIR extended GMP to support 64-bit native integer sizes.) gmpy2 doesn't currently use them but it would be nice if they could be kept.
I like your PyIntExportLayout idea for specifying the layout. I have a question about the usage of PyInt_Import: which side owns the conversion?
Is PyInt_Import intended to access external data (i.e. the mpz data) and create a PyLong? Does PyIntExportLayout then specify the format of the mpz data?
Would there be a corresponding PyInt_Export that exports the value of a PyLong into an external buffer with the format of the external buffer controlled by PyIntExportLayout? If so, who owns (CPython versus gmpy2) the memory allocated to the external buffer? (Note: GMP, MPFR, and MPC can use a different memory manager than CPython.)
This is reversed from the current conversion direction. For mpz to PyLong, gmpy2 asks CPython to create a new PyLong with sufficient space to store the output of mpz_export. And for PyLong to mpz, gmpy2 creates a new mpz with sufficient space to store the output of mpz_import.
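The "allocate sufficient space, then fill it" pattern described above can be sketched in plain C, with a uint64_t standing in for the big integer. Both helpers are hypothetical: sketch_digit_count plays the role of the proposed PyInt_DigitCount, and sketch_export_u64 writes into a buffer that the caller owns. That is one possible answer to the ownership question, not something the thread settles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogue of the proposed PyInt_DigitCount for a uint64_t:
 * how many digits of `bits_per_digit` bits (1..32) are needed for `value`. */
static size_t
sketch_digit_count(uint64_t value, uint8_t bits_per_digit)
{
    assert(bits_per_digit >= 1 && bits_per_digit <= 32);
    size_t count = 0;
    while (value != 0) {
        value >>= bits_per_digit;
        count++;
    }
    return count;
}

/* Write the digits of `value`, least significant first, into a buffer the
 * caller allocated with room for `capacity` digits. Returns the number of
 * digits written, or (size_t)-1 if the buffer is too small. */
static size_t
sketch_export_u64(uint64_t value, uint8_t bits_per_digit,
                  uint32_t *digits, size_t capacity)
{
    size_t needed = sketch_digit_count(value, bits_per_digit);
    if (needed > capacity) {
        return (size_t)-1;
    }
    uint64_t mask = (UINT64_C(1) << bits_per_digit) - 1;
    for (size_t i = 0; i < needed; i++) {
        digits[i] = (uint32_t)(value & mask);
        value >>= bits_per_digit;
    }
    return needed;
}
```

The caller first asks for the digit count, allocates with its own memory manager, and then exports, so neither side has to free memory that the other side allocated.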
I'll add another comment to the thread about the compact format.
Thanks for all the effort in improving CPython.
casevh
32 bit PyInt_AsInt32 PyInt_FromInt32
I created https://github.com/python/cpython/pull/120390 for that.
I had plans to add PyLong_Import() and PyLong_Export() with GMP/libtommath-inspired signatures. This is a very general interface, which allows supporting many different representations.
This is a very general interface, which allows supporting many different representations.
This is a relatively complex task, which is better suited to dedicated libraries. I would be rather surprised if any arbitrary precision math library lacked mpz_import/export-like functions.
If the CPython side provides a "view" of an integer as an array of digits, any math library could do the rest of the work.
Then please use different names than PyLong_Import()/PyLong_Export().
We need a more consistent API for converting from Python integers to C integers and back again. We should support both 32 bit and word size C integers. 32 bit, because we often want to store 32 bit values to save space on 64 bit machines, or for portability. We also want to support word size integers for performance and ease of coding.
I added APIs for that with https://github.com/python/cpython/commit/4c6dca82925bd4be376a3e4a53c8104ad0b0cb5f:
We want to query its sign: int PyInt_Sign();
PyLong_GetSign() was added to Python 3.14: https://docs.python.org/dev/c-api/long.html#c.PyLong_GetSign
int PyInt_IsNegative(); int PyInt_IsPositive(); int PyInt_IsZero();
There is an open discussion for these functions: https://github.com/capi-workgroup/decisions/issues/29
The C-API has built up over 30 years, in a haphazard way. So, it is no surprise that it is a bit of a mess. What makes it worse is that it is based around the C long type, which varies in size between architectures and operating systems in odd ways. C longs are 32 bit on (almost?) all 32 bit machines and 64 bit on most 64 bit machines, except Windows, where C longs are 32 bits on 64 bit machines. In other words, it is not a useful fixed size, like int32_t, nor does it match the machine word size, like intptr_t.
We need a more consistent API for converting from Python integers to C integers and back again. We should support both 32 bit and word size C integers. 32 bit, because we often want to store 32 bit values to save space on 64 bit machines, or for portability. We also want to support word size integers for performance and ease of coding.
This means we want 4 functions (2 sizes, 2 directions) to convert between C and Python integers.
Currently we have:
PyLong_FromSsize_t
The C API has a function to convert Python ints to intptr_t, but it is missing efficient overflow handling. It also has a function with efficient overflow handling, PyLong_AsLongAndOverflow, but that returns a long.
Here's what we want:
PyInt_AsInt32
PyInt_FromInt32
PyInt_AsSsize_t
PyInt_FromSsize_t
I'm using the PyInt prefix, now that Python 2 is history. It makes it clearer which functions are the new API.
Note that I'm not handling unsigned values. I think the extra bit of precision is not worth the complexity of a larger API. And if we decide that they are, we can always add them later.
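As a rough illustration of "efficient overflow handling" for a fixed 32 bit target, the sketch below narrows an int64_t to int32_t and reports overflow through an out-parameter instead of raising an exception. The function name is hypothetical and it takes a plain C integer rather than a Python int; the flag convention (+1, -1, 0, with -1 returned on overflow) is borrowed from PyLong_AsLongAndOverflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* int32_t is exactly 32 bits on every platform that provides it,
 * unlike long, which is 32 bits on 64-bit Windows. */
static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t is a fixed-size type");

/* Hypothetical stand-in for the proposed PyInt_AsInt32, operating on an
 * int64_t. On overflow it sets *overflow to +1 or -1 and returns -1;
 * otherwise it sets *overflow to 0 and returns the narrowed value. */
static int32_t
sketch_as_int32_and_overflow(int64_t value, int *overflow)
{
    if (value > INT32_MAX) {
        *overflow = 1;
        return -1;
    }
    if (value < INT32_MIN) {
        *overflow = -1;
        return -1;
    }
    *overflow = 0;
    return (int32_t)value;
}
```

Because -1 is also a valid result, callers need to check the overflow flag rather than the return value, which is the same rule that applies to PyLong_AsLongAndOverflow.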