Simple coroutine with for loop works:
>>> def pack_a():
...     while True:
...         L = []
...         for i in range(2):
...             L.append((yield))
...         print(L)
>>> pa = pack_a()
>>> next(pa)
>>> pa.send(1)
>>> pa.send(2)
[1, 2]
>>>
If using list comprehension (generator expression), it fails:
>>> def pack_b():
...     while True:
...         L = [(yield) for i in range(2)]
...         print(L)
>>> pb = pack_b()
<endless loop here>
I understand what's going on here: the comprehension is compiled into a nested function, and there's no way either to stop the execution inside that nested function (since it hasn't started yet!) or to send() a value to its yield expression. Still, I think this behavior is a bug and needs to be fixed.
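The OP's explanation can be illustrated with a hypothetical hand-desugaring (the name _listcomp is illustrative, not what CPython actually generates; only one pass of the while loop is shown, to avoid the endless loop):

```python
import types

def pack_b_one_pass():
    def _listcomp(_iter):          # the implicit nested function
        result = []
        for i in _iter:
            result.append((yield))  # yield belongs to _listcomp, not pack_b_one_pass
        return result
    L = _listcomp(iter(range(2)))  # calling it returns a generator, not a list
    return L

obj = pack_b_one_pass()
assert type(obj).__name__ == 'generator'  # L was never a list
```

Because the yield makes the implicit function a generator function, calling pack_b() just loops forever binding L to fresh generator objects, which is the endless loop seen above.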
Hmm, what an interesting and unexpected side-effect of the efforts to hide the loop induction variable.
While the behavior is confusing, I don't think yield inside comprehensions should be disallowed. Rather, the fact that comprehensions have their own scope should be stated more clearly.
I think I can probably fix it, but it's debatable whether it should be done, since it'd make list comps more of "quasi" functions.
This discussion should probably be moved to python-dev. With tools like Twisted's inlineCallbacks or the Monocle package, there is a growing need to be able to use yield in complex expressions. Yet, that goes against the trend toward making list comps more like genexps and less like sugar for a simple for-loop accumulator.
Guido, do you have any thoughts on the subject? Mark it a documentation issue or try out Benjamin's fix?
I think it is definitely wrong the way it works in 3.x. (Especially since it works as expected in 2.x.)
I agree with Inyeol's preference of fixes: (1) make it work properly for listcomps as well as genexps, (2) if that's not possible, forbid yield in a genexp or listcomp.
Note that even though yield in a genexp could be considered as having a well-defined meaning, that meaning is not useful and I would consider it as merely a coincidence of the specification, not an intentional effect. So I would be fine changing its meaning. (My assumption is that since it is not useful there is -- almost -- no code depending on that meaning.)
PS. Wasn't there a similar issue with something inside a genexp that raises StopIteration? Did we ever solve that?
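For context, my assumption is that the PS refers to the problem later resolved by PEP 479: since Python 3.7, a StopIteration escaping a generator body is replaced with a RuntimeError instead of silently ending iteration.

```python
def leaky():
    yield 1
    raise StopIteration  # before PEP 479 this silently ended iteration

g = leaky()
assert next(g) == 1
try:
    next(g)
except RuntimeError:   # PEP 479 (default since 3.7): StopIteration -> RuntimeError
    caught = True
else:
    caught = False
assert caught
```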
Isn't this the same issue as bpo-3267?
Yes it is, but I was never asked about it back then.
bpo-3267 did not expose endless loop possibility and was closed as won't fix. Rather than reopen that and close this and move nosy list back, I added to nosy list here.
FWIW, the "endless loop possibility" is not the issue here; it is simply an artifact of the specific generator function the OP uses.
I would like to add that since the introduction of asyncio module that heavily uses "yield from" syntax, binding of yield inside comprehensions/generator expressions could lead to unexpected results/confusing behavior. See for example this question on SO: http://stackoverflow.com/questions/29334054/why-am-i-getting-different-results-when-using-a-list-comprehension-with-coroutin
Is the fact that 'await' produces a syntax error in this context the same bug or a new one?
> Is the fact that 'await' produces a syntax error in this context the same bug or a new one?
What kind of SyntaxError? await outside an async function is prohibited, bare await is also prohibited.
>>> async def foo():
... bar = [await x for x in range(10)]
File "<input>", line 2
SyntaxError: 'await' expressions in comprehensions are not supported
Python 3.5 does not support this; you should use Python 3.6 (plus await x will fail when you run the coroutine, since you cannot await on an int).
OK, cool. So, long term, there will be a way to do this (suspend within a generator expression). Thanks for the pointer.
(As far as awaiting on int, yes, I know how await works, I was focusing on the syntax.)
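For reference, a minimal sketch of the 3.6+ behaviour being discussed (square and main are illustrative names; asyncio.run() additionally requires 3.7+):

```python
import asyncio

async def square(x):
    return x * x

async def main():
    # PEP 530 (Python 3.6+): await is allowed inside a comprehension,
    # but only within an async function.
    return [await square(x) for x in range(4)]

squares = asyncio.run(main())
assert squares == [0, 1, 4, 9]
```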
Just to add my comment to this 7-year-old, never-resolved issue: in PyPy 3.5, which behaves like Python 3.x in this respect, I made the following constructions give a warning.
def wrong_listcomp():
    return [(yield 42) for i in j]

def wrong_gencomp():
    return ((yield 42) for i in j)

def wrong_dictcomp():
    return {(yield 42): 2 for i in j}

def wrong_setcomp():
    return {(yield 42) for i in j}
SyntaxWarning: 'yield' inside a list or generator comprehension behaves unexpectedly (http://bugs.python.org/issue10544)
The motivation is that none of the constructions above gives the "expected" result. In more detail:

wrong_listcomp() doesn't even return a list at all. It's possible to have a clue about why this occurs, but I would say that it is just plain wrong given the "return [...]" part of the syntax. The same is true for wrong_dictcomp() and wrong_setcomp().

wrong_gencomp() returns a generator as expected. However, it is a generator that yields two elements for each i in j: first 42, and then whatever was send() into the generator. I would say that this contradicts the general idea that this syntax should give a generator that yields one item for each i in j. In fact, when users write such code they might expect the "yield" to apply at the function level instead of the genexpr level, but none of the functions above end up being generators themselves.
For completeness, I think there is no problem with "await" instead of "yield" in Python 3.6.
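Under the pre-3.8 semantics, the "two elements for each i" behaviour of wrong_gencomp can be reproduced with an explicit hand-desugared generator (the genexp spelling itself is a SyntaxError on current Python, so the expansion is written out by hand):

```python
# Hand-desugared equivalent of ((yield 42) for i in j) under old 3.x rules:
def gencomp_desugared(j):
    for i in j:
        yield (yield 42)  # two yields per loop iteration

g = gencomp_desugared([1, 2])
# For each i: first 42 is yielded, then whatever was send() into the generator.
out = [next(g), g.send('a'), next(g), g.send('b')]
assert out == [42, 'a', 42, 'b']
```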
How about fixing CPython to raise SyntaxWarning or even SyntaxError?
> How about fixing CPython to raise SyntaxWarning or even SyntaxError?
I think it is better to just fix the issue, i.e. make comprehensions equivalent to for-loops even if they contain yield. (In particular this will make [(yield i) for i in range(5)] a SyntaxError outside a function.) The example of await shows that it is possible without leaking the loop variable into the enclosing scope.
Let's see if the discussion goes anywhere or if this issue remains in limbo for the next 7 years. In the meantime, if I may humbly make a suggestion: whether the final decision is to give SyntaxError or change the semantics, one or a few intermediate versions with a SyntaxWarning might be a good idea.
"Just fix the issue" is easier said than done for the same reason that comprehensions were implemented the way they are now: lambda expressions still have to work.
That is, we need to maintain the invariant that:
[x for x in iterable]
{x for x in iterable}
{k: v for k, v in iterable}
(x for x in iterable)
give the same results (respectively) as:
[(lambda: x)() for x in iterable]
{(lambda: x)() for x in iterable}
{(lambda: k)(): (lambda: v)() for k, v in iterable}
((lambda: x)() for x in iterable)
Once you work through the implications of "We need the loop variable to be visible to lexically nested scopes, but invisible in the containing scope", you're going to end up with something that looks enough like a nested function that the easiest option to implement and explain is to have it *be* a nested function.
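The invariant can be checked directly on current CPython (the values are illustrative):

```python
# Each comprehension must give the same result as its lambda-wrapped twin,
# which requires the loop variable to be visible to nested scopes.
iterable = range(3)
assert [x for x in iterable] == [(lambda: x)() for x in iterable] == [0, 1, 2]
assert {x for x in iterable} == {(lambda: x)() for x in iterable} == {0, 1, 2}

pairs = [(1, 'a'), (2, 'b')]
assert {k: v for k, v in pairs} == {(lambda: k)(): (lambda: v)() for k, v in pairs}
assert list(x for x in iterable) == list((lambda: x)() for x in iterable)
```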
I'd be fine with a resolution that forbade yield expressions directly inside implicit scopes, though.
Also see https://bugs.python.org/issue1660500 for the original Python 3.0 change to hide the iteration variable.
While the test suite already covers some interesting scoping edge cases as a result of that initial patch, I think one case we're currently missing is the nested comprehension case:
>>> [[x for x in range(1, i)] for i in range(2, 5)]
[[1], [1, 2], [1, 2, 3]]
I see nothing special in [[x for x in range(1, i)] for i in range(2, 5)]. This should be equivalent to:
__result = []; __i = None
try:
    for __i in range(2, 5):
        __result2 = []; __x = None
        try:
            for __x in range(1, __i):
                __result2.append(__x)
            __result.append(__result2)
        finally:
            del __result2, __x
finally:
    del __i
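The expansion above (minus the try/finally bookkeeping, and with the __-prefixed temporaries replaced by plain names, which therefore leak here) can be checked against the real comprehension:

```python
result = []
for i in range(2, 5):
    inner = []
    for x in range(1, i):
        inner.append(x)
    result.append(inner)

assert result == [[1], [1, 2], [1, 2, 3]]
assert result == [[x for x in range(1, i)] for i in range(2, 5)]
```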
Another slight variant to that test case to make sure the inner comprehension actually generates a closure reference in the current implementation:
>>> [[x+y for x in range(2) for y in range(1, i)] for i in range(2, 5)]
[[1, 2], [1, 2, 2, 3], [1, 2, 3, 2, 3, 4]]
These are the kinds of challenges that drove us towards the current implementation. While the status quo definitely has its downsides, those downsides at least all flow directly from "There's an implicit nested function there that you're not expecting".
Yury took all this into account when designing the interaction between await and comprehensions (which is why that's in a better state), but for yield we inherited the existing behaviour of any other nested function.
This is straightforward.
__result = []; __i = None
try:
    for __i in range(2, 5):
        __result2 = []; __x = __y = None
        try:
            for __x in range(2):
                for __y in range(1, __i):
                    __result2.append(__x + __y)
            __result.append(__result2)
        finally:
            del __result2, __x, __y
finally:
    del __i
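This expansion, too, can be verified against the real nested comprehension (again with the bookkeeping stripped and plain, leaking loop names):

```python
result = []
for i in range(2, 5):
    inner = []
    for x in range(2):
        for y in range(1, i):
            inner.append(x + y)
    result.append(inner)

assert result == [[1, 2], [1, 2, 2, 3], [1, 2, 3, 2, 3, 4]]
assert result == [[x + y for x in range(2) for y in range(1, i)] for i in range(2, 5)]
```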
Sure, you can technically do that, but you'll have to rewrite most of the symtable pass in the compiler, since you won't be able to re-use the current recursive descent algorithms (which are designed to handle function scopes). That ran afoul of "If the implementation is hard to explain, it's a bad idea."
Where that gets particularly nasty is when you nest comprehensions with the *same* iteration variable name:
>>> [[i for i in range(2)] for i in range(2)]
[[0, 1], [0, 1]]
Now your name generator needs to be clever enough to account for how many levels deep it is. You also run into the lexical scoping problem anyway once a lambda expression shows up:
>>> [[(lambda i=i: i)() for i in range(2)] for i in range(2)]
[[0, 1], [0, 1]]
>>> [[(lambda: i)() for i in range(2)] for i in range(2)]
[[0, 1], [0, 1]]
As noted earlier, this is not impossible to resolve, but it is far from the "simple" option that folks seem to believe it is (and which I thought it would be before I actually tried to implement it while respecting Python's lexical scoping rules).
Nick,
> Yury took all this into account when designing the interaction between await and comprehensions (which is why that's in a better state), but for yield we inherited the existing behaviour of any other nested function.
Actually the fact that await in comprehensions already works right is one of my main arguments that we can also fix yield. So I am totally with Serhiy here.
Consider this as a sort of optimization: function inlining. In the general case we can't inline a function because it can be rebound at any time, but in this particular case we call a just-created function that has no other references and lives only on the stack, and we have all the information needed for generating inlined code. This would also remove the overhead of creating and calling a one-time function.
The naming problem is the hardest one. We should assign unique names to internal variables, like '.0.x'; we already do this for the single parameter of the one-time function. Or we could just use the stack (this requires more work but has additional benefits).
If you want to think about comprehensions as calls to inner functions, you can. And you can equally think about them as loops that don't leak their inner variables. The two views are indistinguishable until you use "yield", which turns the list-producing function into a generator. If "yield" instead turned the explicit enclosing function into a generator, rather than the implicit inner one, a comprehension could still be represented via an implicit inner function. Currently there is no way to express this in Python syntax, but before "yield" was introduced there was no way to express its semantics either.
I concur with Ivan. It would be nice to disallow "yield" in comprehensions and generator expressions, but the fact that "await" in comprehensions already works right, and that this behavior is intentional and useful, makes me think that "yield" should work in comprehensions too.
Remind me what happens when you use await in a generator expression that survives the async function's scope?

async def f(xs):
    return (await x for x in xs)
> Remind me what happens when you use await in a generator expression that survives the async function's scope?
Awaiting on f([1, 2]) will result in an async generator (even though yield never appears here). Yury explained why this happens in https://bugs.python.org/issue32113
(Of course the argument should not be [1, 2] but a list of awaitables; otherwise there will be an error later.)
> Yury explained why this happens in https://bugs.python.org/issue32113
It happens because '(x for x in xs)' creates a synchronous generator. So when there's an 'await' in it, it creates an asynchronous generator.
It's also important to note, that in 3.7, this is legal:
def foo():
    return (await x for x in xs)
Note that 'foo' is a regular function, not 'async def'.
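A runnable sketch of what Yury describes, kept inside an async function so it works on any 3.7+ interpreter (one and main are illustrative names):

```python
import asyncio
import inspect

async def one(v):
    return v

async def main(xs):
    agen = (await x for x in xs)       # a genexp containing await ...
    assert inspect.isasyncgen(agen)    # ... is an asynchronous generator
    return [v async for v in agen]     # drive it with async for

values = asyncio.run(main([one(1), one(2)]))
assert values == [1, 2]
```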
... but [await x for x in xs] is still valid _only_ inside async def.
> ... but [await x for x in xs] is still valid _only_ inside async def.
Yes, because it's computed right where it is defined.
a = [x for x in xs] # `a` is a list object
a = (x for x in xs) # `a` is a generator
Do you understand the difference?
Well, after all I am thinking maybe it indeed makes sense to ban yield inside both sync and async comprehensions and generator expressions, since we already have a smörgåsbord of unintuitive behaviors. This, _together_ with extensive clarification in the docs and error messages, could fix the problem.
I honestly think we went too far here. Asynchronous generators are still provisional. Though the PEP is silent about this, the acceptance notice is clear: https://mail.python.org/pipermail/python-dev/2016-September/146267.html. I propose to rip them out.
> Well, after all I am thinking maybe it indeed makes sense to ban yield inside both sync and async comprehensions and generator expressions.
I agree, as I can't imagine a real use case for

a = ((yield a) for a in xs)

which is really equivalent to

def gen(xs):
    for a in xs:
        yield (yield a)

(The iterable is renamed to xs here, since "as" is a keyword.)
> Do you understand the difference?
Yury, sorry, but what is your problem? Have I said anything to suggest that this is bad? Your tone is clearly insulting, and this is not the first time. Maybe it makes sense to show some respect?
> Do you understand the difference?
> Yury, sorry, but what is your problem? Have I said anything to suggest that this is bad? Your tone is clearly insulting, and this is not the first time. Maybe it makes sense to show some respect?
Sorry, I didn't mean to insult anybody. I asked an honest question with an intent to clarify if there's some misunderstanding of the topic that I'm partially responsible for in CPython.
Yury, OK, sorry, then this is a misunderstanding on my side.
> Yury, OK, sorry, then this is a misunderstanding on my side.
NP. Again, sorry if I sounded that way to you.
Guido, I am not sure about the complete removal; this is probably too radical. What I think we are missing is more detailed docs that would be clear about the corner cases. (I already mentioned this in https://bugs.python.org/issue32113)
I think we all need to calm down a bit. How about not posting about this topic for 24 hours.
> How about not posting about this topic for 24 hours.
OK, makes sense :-)
Given the direction of the python-dev thread, should we split this question into two issues?
Issue 1: a yield expression inside a comprehension changes the type of the expression result (returning a generator-iterator instead of the expected container type)
Issue 2: a yield expression inside a generator expression interacts weirdly with the genexp's implicit yield expression
I ask, as it seems to me that issue 1 can be addressed by wrapping the affected cases in an implicit 'yield from' expression, which will both fix the return type of the expression and turn the outer function into a generator (if it isn't one already). (I'm going to put together a proof-of-concept for that idea this weekend)
By contrast, the interaction between generator expressions and explicit yield expressions seems intrinsically confusing, so I'm not sure we can do any better than declaring it a syntax error to try to combine them.
I realised that even without modifying the compiler first, I could illustrate the proposed yield from based resolution for the comprehension case by way of explicit yield from clauses:
def get_gen_result(gen, inputs):
    try:
        yield_value = next(gen)
        for send_value in inputs:
            print(f"Received: {yield_value}; Sending: {send_value}")
            yield_value = gen.send(send_value)
    except StopIteration as exc:
        return exc.value
    raise RuntimeError("Failed to exhaust generator")

def example():
    comp1 = yield from [str((yield x)) for x in ('1st', '2nd')]
    comp2 = yield from [int((yield x)) for x in ('3rd', '4th')]
    return comp1, comp2
>>> result = get_gen_result(example(), range(4))
Received: 1st; Sending: 0
Received: 2nd; Sending: 1
Received: 3rd; Sending: 2
Received: 4th; Sending: 3
>>> result
(['0', '1'], [2, 3])
So if we decided to make yield-in-a-comprehension imply the use of yield from, we'd only need:
No to both. See python-dev.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['interpreter-core', 'type-bug', '3.7']
title = 'yield expression inside generator expression does nothing'
updated_at =
user = 'https://bugs.python.org/InyeolLee'
```
bugs.python.org fields:
```python
activity =
actor = 'serhiy.storchaka'
assignee = 'none'
closed = True
closed_date =
closer = 'serhiy.storchaka'
components = ['Interpreter Core']
creation =
creator = 'Inyeol.Lee'
dependencies = []
files = ['47296']
hgrepos = []
issue_num = 10544
keywords = ['patch']
message_count = 66.0
messages = ['122475', '122512', '122645', '122658', '122705', '122759', '122760', '122762', '122765', '123641', '123649', '241410', '286273', '286274', '286280', '286281', '286282', '286283', '286407', '286408', '286409', '306784', '306802', '306804', '306805', '306806', '306807', '306808', '306810', '306812', '306829', '306831', '306832', '306833', '306834', '306835', '306836', '306837', '306838', '306840', '306841', '306842', '306843', '306844', '306845', '306846', '306847', '306939', '306940', '306963', '306964', '306983', '306995', '307013', '307033', '307035', '307039', '307051', '307067', '307091', '307095', '307359', '307361', '307426', '307449', '311593']
nosy_count = 15.0
nosy_names = ['gvanrossum', 'georg.brandl', 'rhettinger', 'ncoghlan', 'belopolsky', 'benjamin.peterson', 'erickt', 'glyph', 'Inyeol.Lee', 'serhiy.storchaka', 'yselivanov', 'esc24', 'danielsh', 'emptysquare', 'Jim Fasarakis-Hilliard']
pr_nums = ['4564', '4579', '4676']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue10544'
versions = ['Python 3.7']
```