Closed Mariatta closed 3 months ago
the error message seems to suggest that you need to pass in an iterable object
This is in fact true:
def f():
    yield 10
    yield 20
    yield 30

>>> 20 in f()
True
To be most helpful, the error message could mirror the actual logic. Perhaps something like this:
TypeError: Argument of type b does not support the "in" operator. It must define either __contains__ or __iter__.
...or __getitem__.
🙃
Right. __getitem__ is also used:
class A:
    def __getitem__(self, i):
        if i >= 10:
            raise IndexError
        return i ** 2

>>> 64 in A()
True
>>> 55 in A()
False
Interestingly, it is not detected by the Container ABC.
>>> from collections.abc import Container
>>> isinstance(A(), Container)
False
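To illustrate why the ABC check and the in operator disagree, here is a small sketch. The three class names are hypothetical and exist only for this example. It relies on the documented behavior that collections.abc.Container recognizes a class only by its __contains__ method, and collections.abc.Iterable only by __iter__.

```python
from collections.abc import Container, Iterable


class OnlyContains:
    # Supports `in` through __contains__.
    def __contains__(self, item):
        return False


class OnlyIter:
    # Supports `in` through iteration.
    def __iter__(self):
        return iter(())


class OnlyGetitem:
    # Supports `in` through the legacy sequence protocol.
    def __getitem__(self, index):
        raise IndexError


# All three objects accept the `in` operator without raising.
supports_in = [1 in OnlyContains(), 1 in OnlyIter(), 1 in OnlyGetitem()]

# Only the class that defines __contains__ is seen as a Container.
is_container = [
    isinstance(obj, Container)
    for obj in (OnlyContains(), OnlyIter(), OnlyGetitem())
]

# The __getitem__-only class is not seen as an Iterable either.
getitem_is_iterable = isinstance(OnlyGetitem(), Iterable)

print(supports_in)
print(is_container)
print(getitem_is_iterable)
```

So an isinstance check against Container is stricter than what the in operator actually accepts.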
Note, the error message isn't produced directly by PySequence_Contains() in Objects/abstract.c. Instead, this is a downstream message returned by the step that attempts to iterate.
Changing the message would entail catching the exception, verifying the reason for it (we can get TypeError for other reasons and don't want to eat those exceptions), and then raising a new exception with the new message. Here's the current code:
/* Return -1 if error; 1 if ob in seq; 0 if ob not in seq.
 * Use sq_contains if possible, else defer to _PySequence_IterSearch().
 */
int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    PySequenceMethods *sqm = Py_TYPE(seq)->tp_as_sequence;
    if (sqm != NULL && sqm->sq_contains != NULL) {
        int res = (*sqm->sq_contains)(seq, ob);
        assert(_Py_CheckSlotResult(seq, "__contains__", res >= 0));
        return res;
    }
    Py_ssize_t result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
In other words, this is more involved than a simple string edit.
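As a rough Python-level model of the steps described above (try __contains__, fall back to iteration, and replace the message only when the object truly lacks iteration support), a sketch could look like the following. The function name contains and the wording of the new message are assumptions made for illustration; the real logic lives in C and may differ in detail.

```python
def contains(container, item):
    """Rough model of `item in container`; not the actual C implementation."""
    cls = type(container)

    # Step 1: prefer __contains__, looked up on the type like a slot.
    contains_method = getattr(cls, "__contains__", None)
    if contains_method is not None:
        return bool(contains_method(container, item))

    # Step 2: otherwise fall back to iteration. iter() handles both
    # __iter__ and the legacy __getitem__ sequence protocol.
    try:
        iterator = iter(container)
    except TypeError:
        # Only swap the message when the type has no iteration support at
        # all. A TypeError raised inside a user-defined __iter__ is
        # re-raised untouched so it is not swallowed.
        if hasattr(cls, "__iter__") or hasattr(cls, "__getitem__"):
            raise
        # Illustrative wording only (an assumption).
        raise TypeError(
            f"argument of type '{cls.__name__}' is not a container or iterable"
        ) from None

    # Step 3: linear search, comparing by identity first and then equality.
    for element in iterator:
        if element is item or element == item:
            return True
    return False


class Squares:
    # Hypothetical class that supports `in` through __getitem__ only.
    def __getitem__(self, i):
        if i >= 10:
            raise IndexError
        return i ** 2


print(contains(Squares(), 64))
print(contains(Squares(), 55))
try:
    contains(5, 1)
except TypeError as exc:
    error_text = str(exc)
    print(error_text)
```

The hasattr check is what stands in for "verifying the reason" here: it separates an object with no iteration support from one whose __iter__ raised TypeError for its own reasons.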
I found the patch in GH-119888 lying around in one of my worktrees, and it seemed simple enough to me to go ahead and post it.
Done in 3.14 (GH-119888); probably not one to backport (but I'm happy to be overruled on that :))
Feature or enhancement
When using in to test containment, for example if "a" in b:, if b does not support the in operator, then it would raise an error message: TypeError: Argument of type b is not iterable.

To the reader/debugger of this code, the error message seems to suggest that you need to pass in an iterable object (an object that implements __iter__), but in reality, you can also pass in a container object (an object that implements __contains__).

It would be great if the error message can be improved and be more accurate and helpful.
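As a minimal sketch of the distinction, the hypothetical EvenNumbers class below implements only __contains__. It works with the in operator but cannot be looped over, so "iterable" is not the actual requirement for a containment test.

```python
class EvenNumbers:
    # A container with no __iter__: membership is computed, not stored.
    def __contains__(self, n):
        return isinstance(n, int) and n % 2 == 0


evens = EvenNumbers()

four_is_even = 4 in evens
seven_is_even = 7 in evens
print(four_is_even, seven_is_even)

# Looping over the same object fails, because it is not iterable.
try:
    for n in evens:
        pass
except TypeError as exc:
    loop_error = str(exc)
    print(loop_error)
```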
Pitch
The in keyword can be used in different ways:

for loop
You can use a for loop when you are given an iterable or sequence, such as a list, dict, or string, or any object that implements __iter__.
testing for containment
To make a class into a container, you just need to implement the __contains__ method.

If a bad refactoring removed or renamed __contains__ on this container class, and we then ran the same existing code "a" in b, it would raise an error saying that b is not iterable.

If the object was not an iterable to begin with, this error message is confusing to the person debugging it, and they would not realize that the cause is the missing __contains__ method.

I think it would be great if the error message when testing containment (if a in b) could be different from the error message when doing a for loop (for a in b). Providing a more accurate error message will be helpful to the user.

Example message:
TypeError: Argument of type '%.200s' is not a container
(and that this is only raised when doing if a in b)

I tried to look into the CPython code, and it seems like the error message is coming from this line: https://github.com/python/cpython/blob/8a808952a61b4bd572d95f5efbff1680c59aa507/Objects/abstract.c#L2187, which was introduced in https://github.com/python/cpython/pull/20537
Regarding the term container, it is used in this doc: https://docs.python.org/3/library/collections.abc.html#collections.abc.Container
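The refactoring scenario from the pitch can be reproduced with a short sketch. The class names Basket and RefactoredBasket, the renamed method has, and the helper check are all hypothetical. The exact wording of the TypeError depends on the Python version, so it is printed rather than assumed.

```python
class Basket:
    """A container: supports `in` through __contains__ only."""

    def __init__(self, *items):
        self._items = set(items)

    def __contains__(self, item):
        return item in self._items


class RefactoredBasket:
    """The same class after __contains__ was renamed to has() by mistake."""

    def __init__(self, *items):
        self._items = set(items)

    def has(self, item):
        return item in self._items


def check(basket):
    # Existing calling code that was never updated after the refactoring.
    try:
        return "a" in basket
    except TypeError as exc:
        return f"TypeError: {exc}"


before = check(Basket("a", "b"))
after = check(RefactoredBasket("a", "b"))
print(before)
print(after)
```

Neither class was ever iterable, which is why a message that only mentions iteration points the person debugging away from the missing __contains__.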
Previous discussion
I don't know if you'd count a Twitter thread as previous discussion, but here are some links: Start of thread
Supporting message that the error message can be improved: here and here
Comment about Python's terminology of container and iterable
Comment about Python oddity