Closed: rhettinger closed this 1 month ago
Agree, this is a bug. But fixing it can break user code. For example:
from itertools import tee

# parsenum() and parseword() are the user's own helpers; `it` is the token stream.
while True:
    it, peek = tee(it)
    if next(peek).isdigit():
        it, num = parsenum(it)
        continue
    it, peek = tee(it)
    if next(peek).isalpha():
        it, word = parseword(it)
        continue
    it, peek = tee(it)
    if next(peek) in '+-*/':
        op = next(it)
        continue
Without the optimization, you will get a growing chain of tee objects, with linearly growing computation time and stack consumption in next(). This may be the original purpose of this optimization, not only performance or memory use.
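As a rough analogy (plain generator wrappers standing in for uncopied tee layers; this is not itertools code), each extra layer adds a frame that every next() call has to pass through:

def passthrough(it):
    # Stand-in for one wrapper layer around the iterator.
    for x in it:
        yield x

it = iter(range(3))
for _ in range(100):
    it = passthrough(it)

print(next(it))  # this single next() now threads through 100 generator frames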
Nested tee() calls self-flatten just like nested partial() calls. So no, there would not be a "growing chain of the tee objects".
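For what "self-flatten" means in the partial() case, a quick illustration (standard functools behaviour):

from functools import partial

# Nested partials collapse into a single layer at construction time.
g = partial(partial(pow, 2), 3)
print(g.func)   # <built-in function pow> -- not another partial
print(g.args)   # (2, 3)
print(g())      # 8, i.e. pow(2, 3)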
Without the optimization, you will get a growing chain of tee objects, with linearly growing computation time and stack consumption in next(). This may be the original purpose of this optimization, not only performance or memory use.
A teeobject is copyable; the problem is that the tuple construction loop in tee_impl only starts from 1. The 0-th element is the copyable iterator acquired by GetIter, but a teeobject is its own iterator AND copyable, which is what causes the bug here.
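A rough Python-level sketch of the constructor logic described here (the real code is C in Modules/itertoolsmodule.c; itertools._tee is the undocumented tee-object type and is used purely for illustration):

from copy import copy
from itertools import _tee   # undocumented: the C-level tee-object type

def tee_sketch(iterable, n=2):
    it = iter(iterable)           # GetIter; a tee object is its own iterator
    if not hasattr(it, '__copy__'):
        it = _tee(it)             # roughly what tee_fromiterable() does
    result = [it]                 # 0-th slot: the copyable iterator itself, reused
    for _ in range(1, n):         # the construction loop that "only starts from 1"
        result.append(copy(it))   # remaining slots are __copy__() copies
    return tuple(result)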
@rhettinger It seems that the re-use of outer_tee isn't valid, since the docs say:
Once a tee() has been created, the original iterable should not be used anywhere else; otherwise, the iterable could get advanced without the tee objects being informed.
This behavior was introduced in ad983e79d6f215235d205245c2599620e33cf719, and it was explicitly documented from the beginning, so it was considered a feature or a fair price.
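To make the quoted caveat concrete, a small illustrative example (ordinary tee over a string iterator, independent of the bug itself):

from itertools import tee

it = iter('abcdef')
a, b = tee(it)
next(it)          # advancing the original iterable directly...
print(next(a))    # 'b' -- 'a' was consumed without the tee objects being informed
print(next(b))    # 'b'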
Nested tee() calls self-flatten just like nested partial() calls. So no, there would not be a "growing chain of the tee objects".
It depends on your expectations. Do you expect tee(x for x in it) and tee(it) to produce the same result? There are two shortcuts in the tee constructor -- omitting the tee_fromiterable() call if the iterator is copyable, and omitting the copy for the first item in the result tuple. The first one allows avoiding a growing chain of tee objects, but it goes against the published specification and general expectations. None of this matters if you do not use the original iterator.
BTW, your use of "outer" and "inner" for iterators is confusing. I would call them in the opposite way -- the nested one is inner.
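A sketch of the difference in question; the identity checks reflect the reuse shortcut as it behaves before the fix (on a fixed build the last line prints False):

from itertools import tee

outer, _ = tee('abcdef')

c1, d1 = tee(x for x in outer)   # generator expression: not copyable, gets wrapped
print(c1 is outer)               # False

c2, d2 = tee(outer)              # tee object: the copyable shortcut kicks in
print(c2 is outer)               # True before the fix, False after it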
Automatic backports failed, so someone should take care of them manually.
GH-125081 is a backport of this pull request to the 3.13 branch.
GH-125153 is a backport of this pull request to the 3.12 branch.
There was something wrong with the backports -- @bedevere-app usually reports on the original PR page, not on the issue page.
Indeed, the title of a backport PR should look like "[3.13] gh-{issue-number}: {text} (GH-{pr-number})". These PRs have the issue number in the wrong place and do not include the original PR number.
Bug description:
To save a memory allocation, the code path for a tee-in-a-tee incorrectly reuses the outer tee object as the first tee object in the result tuple. This is incorrect. All tee objects in the result tuple should have the same behavior. They are supposed to be "n independent iterators". However, the first one is not independent and it behaves differently from the others. This is an unfortunate side-effect of an early, incorrect optimization. I've now seen this affect real code. It is surprising, unhelpful, undocumented, and hard to debug.
Demonstration:
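A minimal illustrative sketch of the failure mode on a build without the fix (not the original demonstration code):

from itertools import tee

a, b = tee(iter('abcdef'))
c, d = tee(a)        # pre-fix: c is literally the outer tee object a

next(a)              # using the outer tee advances c as well
print(next(c))       # 'b' -- c was silently advanced along with a
print(next(d))       # 'a' -- d is a true independent copy
print(c is a)        # True before the fix; the first "new" tee is not independent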
There is a test for the optimization -- it wasn't an accident. However, the optimization itself is a bug: it goes against the published specification in the docs and against general expectations.
Linked PRs