Closed serhiy-storchaka closed 1 month ago
My vote goes to signed, so the sign bit can serve its usual purpose as an error indicator.
We can reserve (uint64_t)-1
for error indication. Actually, I already reserved it to avoid integer overflow in _PyLong_Frexp()
.
Implementation for signed count: https://github.com/python/cpython/pull/123724
It is slightly more complex than the unsigned variant, because most of existing code already use uint64_t. If we even made _PyLong_Rshift()
public, we will need to replace assert(shiftby >= 0)
with a runtime check.
We can also use INT64_MAX instead of UINT64_MAX as the limit with the uint64_t
type. Then the code can use uint64_t
or int64_t
without integer overflow.
Using signed int64_t
would be consistent with signed Py_ssize_t
used basically everywhere in the Python C API.
I don't think that INT64_MAX
bits is too small: such int
number should be almost impossible to create or use in practice on nowadays computers.
Merged the signed version.
Previous attempt to add a C API for
int.bit_length()
failed because in most cases the user can usePyLong_AsNativeBytes()
(see https://github.com/capi-workgroup/decisions/issues/28). I still think that there are use cases for exact bits counting, so we may add such API in future. There are also other private C API functions that work with bits counts and may be exposed to public in future.The main problem to me with this API was that it returns the bits count as
size_t
orPy_ssize_t
. On 32-bit platform it is not difficult to create an integer or even several integers whose bits count does not fit insize_t
orPy_ssize_t
(This is only 0.5 or 0.25 GB), so this is a practical issue. I recently solved this by making these functions always using 64-bit integers, even on 32-bit platforms. And we can solve it on 64-bit platforms by forbidding creation of Python integers whose bits count does not fit in 64 bits (see https://discuss.python.org/t/imposing-a-hard-limit-on-python-integers/62302/10). There is no much practical difference between $2^{63}$, $2^{64}$ and $2^{66}$ bits -- all limits are insanely large and not achievable on practice. This makes_PyLong_NumBits()
and_PyLong_Frexp()
always successful.There is still one implementation question: should we use signed or unsigned 64-bit integer type for bits count? It was a very practical question for 32-bit counts, but for 64-bit counts both $2^{63}$ and $2^{64}$ are very large, so no need to squeeze a bit. On other hand, the bits count is always non-negative, there is no use case for negative values.
This will affect private C API functions
_PyLong_NumBits()
,_PyLong_Frexp()
,_PyLong_Rshift()
and_PyLong_Lshift()
.Implementation for unsigned count: https://github.com/python/cpython/pull/123498.