capi-workgroup / api-evolution

Issues related to evolution of the C API
14 stars 1 forks source link

Conversion to unsigned integers #49

Open serhiy-storchaka opened 3 months ago

serhiy-storchaka commented 3 months ago

There are three main groups of conversion to C integer types:

  1. Conversion to a signed integer type with overflow checking
  2. Conversion to an unsigned integer type with overflow checking
  3. Conversion to an unsigned integer type without overflow checking

In conversion to a signed integer type, an OverflowError is raised if the value outside of the range of the signed integer type. It is OK, as the limits are usually an implementation detail.

In conversion to an unsigned integer type with overflow checking, an OverflowError is usually raised if the value outside of the range of the signed integer type. But in some cases a ValueError is raised for negative value, and OverflowError is raised for large value exceeding the implementation defined limit. I propose to always raise ValueError when negative value is converted to an unsigned integer type. This condition is usually not an implementation detail: a size cannot be negative, independently of a platform. I opened an issue about this in 2017, but @rhettinger was against this, so this froze. But a new PyLong_AsNativeBytes functions raises ValueError for negative values, as well as many private converters used in Argument Clinic, so I decided to revive this issue. Today arguments in favor of it are stronger.

The last group is the most dangerous. Silent ignoring an overflow can leads to bugs. Unfortunately this is the default behavior of PyArg_Parse format units for unsigned integers, and some of PyMemberDef types. There are use cases for this:

To mitigate possible harm of such API, I propose to limit it too. For example, PyLong_AsUnsignedMask() should raise OverflowError if the value is larger than ULONG_MAX or less than some negative value (LONG_MIN or even smaller, like INT_MIN, or -256, or -1).

Here are my two ideas.

serhiy-storchaka commented 3 months ago

See also https://discuss.python.org/t/conversion-between-python-and-c-integers/44117/2, I may repeat myself here.

encukou commented 3 months ago

Which functions are you proposing to change?

I think it's too late to change the types of exceptions raised by existing functions. It's not worth breaking existing code.

But let's make new API is better. And perhaps also add documentation warnings to API that would work differently if added now.

serhiy-storchaka commented 3 months ago
  1. Raise ValueError for negative values: PyLong_AsUnsignedLong(), PyLong_AsUnsignedLongLong(), PyLong_AsSize_t() and the "b" format unit in PyArg_Parse(). See https://github.com/python/cpython/pull/121114
  2. Raise a warning for truncated values: PyLong_AsUnsignedLongMask(), PyLong_AsUnsignedLongLongMask(), and format units "B", "H", "I", "k", "K" in PyArg_Parse().