rhettinger opened this issue 2 years ago
A related issue is that `len` calls are frustratingly inefficient for pure-Python `__len__` functions. A call like `len(o)`:

- calls `o.__class__.__len__(o)`, which returns an arbitrary object,
- calls `_PyNumber_Index` on the result, which returns a Python `int` subclass,
- calls `PyNumber_AsSsize_t` on the result, which calls `_PyNumber_Index` again and returns a C `Py_ssize_t`,
- and converts the result back to a Python `int` using `PyLong_FromSsize_t`.

In the vast majority of cases (`__len__` returns an exact `int`), we still end up unboxing and reboxing the value just to cross API boundaries, which is the source of the `range` problem here.
A `len` call where `__len__` returns an exact `int` should, ideally, perform the super-cheap negative value check and nothing more. It would be great if whatever solution we come up with here is able to improve this situation as well.
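As a concrete illustration of that negative value check (the class name `Weird` is made up for this sketch), `len` already validates that `__len__` returned a nonnegative value before handing it back:

```python
class Weird:
    # Made-up example: a __len__ that violates the nonnegative contract.
    def __len__(self):
        return -1

try:
    len(Weird())
except ValueError as exc:
    # CPython rejects the negative result before returning it.
    print(type(exc).__name__)
```

That cheap validation is essentially all that the exact-`int` fast path would need to keep.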
Good point Brandt. It would be great if we could address both of these issues at the same time.
IIRC, Guido said that `len()` and `isinstance()` are the two most commonly called functions in Python.

As a starting point, I like the interface of `nb_index`. It is expected to return an exact `int`.
Quarter-baked idea: add `sq_py_length` and `mp_py_length` slots. They are expected to return nonnegative `int` objects.

- By default, the new slots call `sq_length`/`mp_length` and convert the results to `int`.
- For classes that define `__len__` in Python, `sq_length` and `mp_length` call the new methods, which wrap `__len__`.
- `len` (and `__len__` slot wrappers, and the `GET_LEN` opcode, and anything else that wants an `int`) will always use the "new" `int`-returning slots.
- `range` can define both `sq_length` (for speed) and `sq_py_length` (for `len`/`__len__`).

As an alternative, we could also just have `len` start calling `__len__` directly. This is easier to explain, but likely comes with non-negligible overhead when called on types defined in C (probably the common case).
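To make the division of labour concrete, here is a rough pure-Python model of the two-slot idea (nothing here is real CPython API; the class and method names merely mirror the proposed slots): the ssize-limited slot keeps serving C callers, while the `int`-returning slot serves `len`:

```python
import sys

class TwoSlotModel:
    # Hypothetical model of a type implementing both slots.
    def sq_py_length(self):
        # Proposed slot: returns a nonnegative Python int of any size.
        return 2**100

    def sq_length(self):
        # Existing-style slot: restricted to Py_ssize_t, like sq_length today.
        n = self.sq_py_length()
        if n > sys.maxsize:
            raise OverflowError("length too large for Py_ssize_t")
        return n

def model_len(obj):
    # len() would prefer the int-returning slot when a type provides it.
    py_length = getattr(obj, "sq_py_length", None)
    if py_length is not None:
        return py_length()
    return obj.sq_length()

print(model_len(TwoSlotModel()) == 2**100)   # True
```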
I was originally thinking of leaving `PyObject_Size()` and the existing slots as they are now. Instead, add `PyObject *PyObjectLen(PyObject *obj)` to be called by the builtin `len()` function. It would first call `PyObject_Size()`, but then have a fallback with a special case for range objects, and would invoke `__len__` when it is defined outside of the slot. The latter would work for any pure Python function. Types implemented in C could add `METH_COEXIST` if they wanted a method that could be called directly and return a Python object.
Adding new slots would work as well and that might be cleaner. Historically, we've had a strong resistance to adding new slots, but the shared mentality and values are changing.
Hi, thanks for working on this! Has a decision been made on this proposal? Are there plans to include the fix in future releases?
Note that this wasn't a compatibility problem when `range()` was converted to a virtual sequence in Py3, since the Py2 `range()` builtin would just die outright for sequences that long (lists larger than maxsize necessarily won't fit in RAM), and `iterrange()` didn't support `__len__` at all.

The reported restriction is real, but the workaround at affected call sites is also straightforward: call `r.__len__()` when `len(r)` fails on the conversion to `ssize_t`.
This means the `random` module limitations could potentially be fixed without any length protocol or `len()` builtin enhancements, by adding a fallback that calls `r.__len__()` directly if `len(r)` fails with `OverflowError`.
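That fallback can be sketched as follows (`robust_len` and `HugeSeq` are illustrative names, not anything from the stdlib). Note that it helps sequences whose `__len__` is written in Python; for `range` itself, `__len__` is backed by the same ssize-limited slot, which is part of why the proposal also includes a range-specific path:

```python
def robust_len(seq):
    # Call-site workaround: fall back to calling __len__ directly when
    # the Py_ssize_t conversion inside len() overflows.
    try:
        return len(seq)
    except OverflowError:
        return seq.__len__()

class HugeSeq:
    # Illustrative pure-Python sequence with a length beyond Py_ssize_t.
    def __len__(self):
        return 2**100

print(robust_len("abc"))       # 3
print(robust_len(HugeSeq()))   # 2**100 instead of OverflowError
```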
@rhettinger's proposal would essentially be standardising that workaround as part of `len()`, providing a convenient C API for it, and also an optimised path specifically for `range()` objects (all of which seem like reasonable suggestions to me).

It's also worth noting that using this fallback-on-overflow approach initially wouldn't preclude enhancing the slot-level protocol later. Both `len()` and the suggested C API would still work; they would just become more efficient internally (with benchmark performance deciding whether the extra slot protocol complexity was worth it).
Problem

Pure Python `__len__` methods can return values larger than `sys.maxsize`; however, the builtin `len()` function unnecessarily fails for such objects.

The builtin `range()` type added support for ranges larger than `sys.maxsize`. Large indices work; negative indices work; forward iteration works; reverse iteration works; and access to the attributes works. However, `len()` unnecessarily fails.

The `random.sample()` and `random.choice()` functions both depend on the builtin `len()` function, so they unnecessarily fail when used with large range objects or with large user defined sequence objects. Users have reported this issue on multiple occasions. We closed those issues because there was no practical way to fix them short of repairing the builtin `len()` function.
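Both failures are easy to reproduce (`BigSeq` is an illustrative name standing in for any user-defined sequence):

```python
import sys

class BigSeq:
    # Illustrative pure-Python sequence reporting a huge length.
    def __len__(self):
        return sys.maxsize + 1

print(BigSeq().__len__())              # an ordinary Python int, no problem
try:
    len(BigSeq())                      # the builtin unnecessarily fails
except OverflowError:
    print("len() raised OverflowError")

r = range(2 * sys.maxsize)             # large ranges mostly work:
print(r[sys.maxsize] == sys.maxsize)   # indexing past maxsize is fine
try:
    len(r)                             # ...but len() fails here too
except OverflowError:
    print("len() raised OverflowError")
```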
Proposal

Make the builtin `len()` function smarter. Let it continue to first try the C `PyObject_Size()` function, which is restricted to `Py_ssize_t`. Then add two new fallbacks: one for range objects, and the other for calling the `__len__` method through the C API, allowing arbitrary objects to be returned.

Rough sketch:
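The original sketch was lost in extraction; a hedged pure-Python model of the proposed behaviour might look like this (`smart_len` is an illustrative name; the real change would live in C inside the builtin):

```python
def smart_len(obj):
    # Model of the proposed builtin: fast path first, then two fallbacks.
    try:
        return len(obj)                # stands in for the PyObject_Size() path
    except OverflowError:
        pass
    if isinstance(obj, range):         # fallback 1: compute a range's length exactly
        start, stop, step = obj.start, obj.stop, obj.step
        if step > 0:
            return max(0, (stop - start + step - 1) // step)
        return max(0, (start - stop - step - 1) // -step)
    return type(obj).__len__(obj)      # fallback 2: __len__ through the type

print(smart_len(range(10**30)))   # 10**30 rather than OverflowError
```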
Bug or Feature Request

Traditionally, extending support for sizes beyond `Py_ssize_t` has been considered a new feature, `range()` and `itertools.count()` for example.

In this case though, arguably it is a bug because the `range()` support was only 90% complete, leaving off the ability to call `len()`. It could also be considered a bug because users could always write a `__len__` method returning values larger than `Py_ssize_t` and could access that value with `obj.__len__()`, but the `len()` function inexplicably failed due to an unnecessary and implementation dependent range restriction.

One other thought: `maxsize` varies across builds, so it is easily possible to get code tested and working on one Python and have it fail on another. All 32-bit builds are affected, as are all Windows builds.

It would be easy for us to remove the artificial limitation for range objects and for objects that define `__len__` directly rather than through `sq_length` or `mp_length`. That includes all pure Python classes and any C classes that want to support large lengths.