njsmith opened this issue 7 years ago
Some discussion on python-ideas (scroll down to point 9): https://mail.python.org/pipermail/python-ideas/2017-August/046736.html
Notes:
- PEP 568 provides one possible path for making cancel scopes in (async) generators "just work"; discussing this is a todo for Python 3.8.
- There's some discussion relevant to this issue starting at https://github.com/python-trio/trio/issues/265#issuecomment-370509223
One idea for nurseries-in-generators is to provide an ergonomic way to get a context-managed generator. I wrote an example implementation up at https://gist.github.com/oremanj/772241665edea3ceb3183a4924aecc08 (it was originally intended to be a quick-and-dirty demo, but rather grew a life of its own...) I think something like this would likely be useful in a trio utilities library. It doesn't require any special support from trio or async_generator and is rather large/magical so I'm not sure it deserves to make it into trio itself, unless trio wants to use it.
There are still notable limitations; for example, in the azip() example, if one of the inner slow_counters reaches its deadline and is cancelled, the whole azip() will be cancelled, because we had to enter the slow_counter contexts to get async iterables that could be passed to the azip() context. But if the limitations are acceptable, cancel scopes and nurseries that are attached to a generator in this fashion should be guaranteed to nest correctly / not mess up the cancel stack.
Thinking about this a bit more...
PEP 568, if implemented, would make it possible to assign a reasonable semantics to cancel scopes with `yield` inside them. Basically, the scope would apply to the code that's statically inside the scope, but not to anything outside the generator; if the scope is cancelled or its timeout expires while the generator is suspended, then that just means that when the generator is resumed, it'll be in a cancelled state and the next checkpoint will raise `Cancelled`. (Or if it exits the cancel scope without executing a checkpoint, that's fine too.)
PEP 568 would not make it possible to assign a reasonable semantics to a nursery with a `yield` inside it – that's intrinsically impossible if we want to keep the guarantee that exceptions in background tasks are propagated promptly, because if a background task crashes while the generator-holding-the-nursery is suspended, what can you do? There's no live stack frame to propagate it into. If we were willing to relax the guarantee to just "exceptions are propagated eventually", and PEP 533 were implemented, then we could allow `yield` inside nurseries, but neither of those seems likely.
So, maybe we want to forbid `yield`ing out of nurseries. How can we do that? One idea would be to have PEP 568, use it for cancel scopes, and then for nurseries add an additional feature: the ability to somehow "pin" the Context that's on top of the stack (the one that writes go to), so that if a `yield` would take us out of that context, then it fails (raises an exception instead of yielding). Then regular cancel scopes would stash their state in the context, and nurseries would additionally pin the context. And inside an `@asynccontextmanager` function you could still `yield` out of a nursery, because `@asynccontextmanager` functions – unlike other async generators – share their context with their caller, so `yield` there doesn't touch the context stack.
One more thing to consider: what if we decide to disallow `yield` inside all cancel scopes, not just nurseries? Even the correct semantics for a `yield` inside a cancel scope are a little confusing; if the timeout expires while the generator is suspended, then nothing happens, until later it's resumed and then boom. Users could conceivably be expecting a more complex semantics, where the timeout clock stops ticking during the `yield`. Or even if they aren't, it's pretty weird for the semantics of the generator to depend on how long it was suspended for – that violates encapsulation, and probably isn't very useful.
If that's what we want to do, then we don't need to tie it to PEP 568-style nested contexts. The reason why tying them together is attractive is that PEP 568-style nested contexts need a special case to handle `@contextmanager` and `@asynccontextmanager`, and the no-yield-in-nursery rules need the exact same special case, so maybe we can combine the special cases. But we could also imagine adding the no-yield-in-nursery feature more directly, and decoupling the whole thing from PEP 568.
I guess what we'd want is: a way to set a flag on a caller's frame that makes it an error to attempt to `yield` out of that frame (but `await` is fine). And by "caller" I really mean "walk up the frame stack until we find the first frame that is not an `@(async)contextmanager`, or quit early if we encounter a frame that's not an (async) generator". On CPython this would actually be... pretty easy to implement, and efficient. (Except that we might need some tweaks to the bytecode eval loop to distinguish between the different kinds of `yield`/`await`.) The PyPy folks might hate it; we should check with them. Generators are a bit special there AFAIK (you have to materialize their frames for suspension to work), so maybe it's not so bad?
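As a very rough pure-Python illustration of that frame walk (everything here is hypothetical: the registry of context-manager code objects and the function name are made up, and a real version would live inside the interpreter, where it could actually make the `yield` fail):

```python
import inspect
import sys

# Hypothetical registry: @(async)contextmanager would record the code
# objects of the generator functions it wraps. Pure illustration.
CONTEXTMANAGER_CODE_OBJECTS = set()

GENERATOR_FLAGS = inspect.CO_GENERATOR | inspect.CO_ASYNC_GENERATOR

def find_frame_to_pin():
    """Walk up from the caller: skip @(async)contextmanager generator
    frames, return the first "ordinary" generator frame, or None if we
    hit a frame that isn't an (async) generator at all."""
    frame = sys._getframe(1)  # start at the caller's frame
    while frame is not None:
        if not (frame.f_code.co_flags & GENERATOR_FLAGS):
            return None  # not a generator: quit early
        if frame.f_code not in CONTEXTMANAGER_CODE_OBJECTS:
            return frame  # first non-contextmanager generator frame
        frame = frame.f_back
    return None
```

A nursery's `__enter__` could use something like this to find the frame whose `yield`s should start failing; actually intercepting those `yield`s is the part that needs interpreter support.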
Our options are:

1. Allow `yield` in both cancel scopes and nurseries.
2. Allow `yield` in cancel scopes, but not nurseries.
3. Disallow `yield` in both cancel scopes and nurseries.
PEP 533 is still how Python ought to work, but may not be the right hill to die on...
> walk up the frame stack until we find the first frame that is not an `@(async)contextmanager`, or quit early if we encounter a frame that's not an (async) generator

Actually, that's not quite right...
```python
class CM1:
    def __enter__(self):
        self._cm2 = CM2()
        self._cm2.__enter__()

class CM2:
    def __enter__(self):
        disallow_yield()

def gen():
    with CM1():
        yield  # should be an error
```
So we really would want something like the PEP 568 semantics... like a stack of bools (or rather ints, to allow inc/dec, so `disallow_yield`s can be nested), where the top of the stack tracks the allowed-to-yield state, entering a generator pushes to the stack, leaving pops, the push/pop is disabled on context manager generators, and the stack is automatically saved/restored when we suspend/resume coroutines, just like the exception stack. If we decide we want PEP 568 anyway, this could even be a `ContextVar`. Though the "pinning" idea has the advantage that `copy_context` wouldn't copy the disallow-yield state.
Eh, we have other stacks in the interpreter state, like the exception stack; it wouldn't be too weird to add another. And that way of formulating it wouldn't be a problem for PyPy.
Example of a case where having the disallow-yield state be stored in the `Context` would be bad (especially if it's preserved across `copy_context`): the kluges pytest-trio uses to share a `Context` between fixtures and the test. See: https://github.com/python-trio/pytest-trio/issues/59
Frightening thought: we did a tremendous amount of work in https://github.com/python-trio/pytest-trio/issues/55 to allow pytest-trio fixtures to `yield` inside nursery blocks. That work in particular is OK, because whatever hack we use to make `@contextmanager` work can also be applied to `@pytest.fixture`. (And it'll be a hack, but probably OK; the `@pytest.fixture` code does have some disturbingly detailed knowledge of why exactly `yield` is not OK and works around it, so I guess if other libraries started using disallow-yield for other reasons that could be a problem, but probably it's fine.) That's not the frightening thought.
The frightening thought is: well, why did we do all that work? It was because we decided it's not reasonable to expect people to know whether a random async context manager like `async with open_websocket` actually has a nursery inside it – that should be abstracted away, an internal implementation detail. But... it can't be, because if you can't use nurseries inside random generators, then you can't use `async with open_websocket` inside random generators, and that means "does this context manager have a nursery inside it?" is actually a publicly visible feature of your API, and adding a nursery to an existing context manager may be a backwards-compatibility-breaking change :-( :-( :-(
A small problem with PEP 568: the way `@contextmanager`/`@asynccontextmanager` trigger the special case in the generators they call is by setting an attribute on the generator.
But in the time since I wrote that, I've learned more about all the ways this is an unreliable thing to try to do, in a world where people love their decorators. It's fine of course if `@contextmanager` directly sees the generator, but it means we'd get weird subtle bugs if someone does:
```python
def noop_wrap(fn):
    def wrapped(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapped

@contextmanager
@noop_wrap
def agen_fn():
    ...
```
because `@contextmanager` ends up writing to an attribute on the `wrapped` object, not the `agen_fn` object, so it has no effect.
Instead, the special case should be triggered by an optional argument passed to `send`/`throw`/`close`.
Instead of a stack of bools or ints, maybe it should be a stack of str objects describing why `yield` is forbidden, so we can give a good error message and track down bugs. Then instead of having "toggle the top value in the stack", we'd have "push/pop the stack", and entering a generator would (by default) push a sentinel value that indicates yielding is allowed.
A few more notes based on further thought:
- This comment is pretty confused – we'd be setting the attribute on the generator object, not on the function, and wrapping generator objects is very rare; even if someone did wrap a generator object, putting the argument on `send` wouldn't necessarily help. Yury likes the attribute-on-generator-object approach better.
- We don't really need a stack at all, just a thread-local flag that `generator.send` and friends set on entry and restore on exit.
From what I've read in this comment, it seems like the idea of:
is out of the picture. I was curious about that so I've been experimenting a bit and I came up with this write-up explaining why it might conceptually make sense.
I've tried to find and read most of the related posts in issue #264 and #638 but I might have missed the explanation I was looking for. Anyway, I'm curious to hear what you guys think about it and I hope I'm not missing something crucial.
There's an initial proposal for making it possible to disallow `yield` here: https://discuss.python.org/t/preventing-yield-inside-certain-context-managers/1091
@vxgmichel The big issue with that approach for me is that I don't understand it :-). Basically, what you say in the "Limitations" section at the end. In code like this:
```python
async with aitercontext(agen()) as safe_agen:
    async for item in safe_agen:
        ...  # block 1
    ...  # block 2
```
I find it really hard to reason about what happens when one of the two `...` code blocks raises an exception. It's pretty surprising that an exception in one of these two places could be caught by the code in `agen`! It's also surprising that the code in "block 2" could be cancelled by code in `agen`.
Maybe it could technically work? But I have no idea how to explain it to beginners.
@njsmith
> The big issue with that approach for me is that I don't understand it :-)
I haven't thought about it in months, so it was quite a struggle to get back into it too :sweat_smile:
> I find it really hard to reason about what happens when one of the two ... code blocks raises an exception. It's pretty surprising that an exception in one of these two places could be caught by the code in agen!
I think block 1 is fine; I would just reason about it the same way we reason about async context managers. This way, it doesn't seem absurd (to me at least) that `agen` might have to deal with an exception raised by its consumer. It acts like the consumer notifying the producing generator that it should stop, clean up and return. Much like an async context manager, it is not supposed to yield new values after an exception has been raised (or an `aclose` has been issued).
> It's also surprising that the code in "block 2" could be cancelled by code in agen.
Block 2 and block 0 (in between the `async with` and the `async for`) are the biggest issue here; they really shouldn't exist at all. The problem is that I couldn't find any trick or syntax to ban those sections without something like PEP 533.
> There's an initial proposal for making it possible to disallow yield here
Interesting! That makes me realize that the generators I described in the write-up should be treated as a specific type of async generator. In the proposal there is the following example involving an async context manager:
```python
@asynccontextmanager
async def open_websocket(...):
    async with open_nursery():
        yield Websocket(...)  # This yield is OK
```
We could also imagine allowing `yield` statements for "producers" (I couldn't find a better term):
```python
@asyncproducer
async def read_websocket(...):
    async with open_nursery():
        async for message in Websocket(...):
            yield message  # This yield is OK
```
Assuming PEP 533 and `__aiterclose__` were implemented properly, the producer could be used as follows:
```python
async for message in read_websocket(...):
    print(message)
    break  # or raise
# The producer has been properly shut down by now
```
I ran into part of this issue in a simple sync case, and wondered about handling it in a linter by requiring:

```python
try:
    yield
finally:
    ...
```
I thought I'd link here the latest idea from @njsmith to fix this issue in CPython: Preventing `yield` inside certain context managers
@vxgmichel There's a closely related discussion in #638, starting from this comment: https://github.com/python-trio/trio/issues/638#issuecomment-431649415
There we're talking about running generators as a separate task that pushes values into a channel, and the main task iterates over this channel instead of iterating over the agen directly.
I guess if you really think about it carefully, this is... probably equivalent to your suggestion, or mostly-equivalent, in terms of the final behavior? If the background task crashes, this cancels the looping code, or vice-versa, etc. But I find the generator-as-separate-task version a lot easier to think about, because it reuses normal trio concepts like tasks and channels.
@apnewberry I don't think that issue is the same as this one, though I could be missing something. The issue here is all about `yield` inside two specific `with` blocks, and one of the nice things about `with` blocks is that you can't forget the `finally` – if you use the `with` block at all, then the `finally` comes along automatically :-)
@njsmith
> There's a closely related discussion in #638, starting from this comment: #638 (comment)
Yes, I actually commented on the corresponding gist (that wasn't a very good idea since there's no notification system for gist comments).
> I guess if you really think about it carefully, this is... probably equivalent to your suggestion, or mostly-equivalent, in terms of the final behavior?
I would argue that both approaches are useful but not equivalent. The channel-based approach is conceptually similar to go-style generators with the producer and consumer tasks running concurrently, occasionally handshaking over a channel. On the other hand, the native generator approach is running a single task, executing either producer code or consumer code.
In the first case, an unused item might get lost in the channel when the consumer decides it has had enough. In the second case, an item is only produced when the consumer asks for it, so no item is lost. This might or might not be important depending on the use case.
That's true; even with an unbuffered channel, the producer does start producing the next object as soon as the previous one is consumed.
I guess there's no technical reason you couldn't have a channel with a buffer size of "-1", where `send` doesn't just wait for the sent item to be received, but waits for `receive` to be called on the item after that.
@njsmith That would still produce a possibly-lost first item. IMHO for lock-step non-parallelism you don't need a queue, you need a simple procedure call to the producer.
@smurfix Yeah, you also need some way to wait for that first `receive` call, at a minimum. I guess one approach would be: add a `wait_send_might_not_block`, and then something like @oremanj's `@producer` decorator could call it before each time it steps the generator.
Of course, I was kind of hoping that we could get rid of `Stream.wait_send_all_might_not_block` – #371 – and adding `Channel.wait_send_might_not_block` would be going the other way. But sometimes things don't turn out the way I hope.
This would also have some weird edge cases.
If the consumer cancels their `receive`, the producer would keep going on calculating the value to send. This makes sense but is different from a classic async generator. (It also means that cancelling a `receive` is recoverable, while cancelling `__anext__` destroys the generator object.)
Weirder: if the producer sets a timeout, it could leak out and cause the harness's call to `wait_send_might_not_block` to exit early, and start the next iteration early.
All in all it might turn out that it's just not worth going to any heroic lengths to try to make the background-generator style match the inline-generator style. They're not quite the same, but in practice the simple version of background generators might work fine.
> It also means that cancelling a receive is recoverable, while cancelling `__anext__` destroys the generator object.
This is happening to me, and is surprising and unwanted. I'm implementing the guts of a channel in an async generator. The generator's control flow is its state, as opposed to using a state machine. If someone calling `channel.receive()` times out before the generator yields a value, I expect the generator to be suspended until the next call to `__aiter__()` rather than for the entire generator to be destroyed.
The only way around this I can see so far is to wrap every `await` inside the generator in a try/except, yield some "I was cancelled" value (I'm using the `trio.Cancelled` exception object) instead, adjust control flow to "retry" the `await` inside the generator, and have a wrapper reraise the exception from outside. But this still doesn't work when the generator is awaiting a result from some other async function, because the "restart" would be in the wrong place.
Or I can implement a state machine, but that would seem to defeat the point of async.
Any suggestions please? This isn't exactly a cancel scope inside a generator so is perhaps off-topic, but your statement is the closest existing discussion on my issue that I can find. I can create a separate issue if you prefer?
> I'm implementing the guts of a channel in an async generator. The generator's control flow is its state, as opposed to using a state machine. If someone calling `channel.receive()` times out before the generator yields a value, I expect the generator to be suspended until the next call to `__aiter__()` rather than for the entire generator to be destroyed.
I'm assuming by `__aiter__` here you mean `__anext__`. Your desire is very reasonable, but unfortunately not compatible with Python generator semantics. A generator can only be suspended at a `yield`. By definition, when you cancel it, it's running but waiting on some async operation, not at a `yield`. Thus, the only options for responding to a cancellation are to either soldier on until the next `yield` or unwind entirely by throwing an exception. The latter case is the default behavior. The former case can be obtained by putting a shielded cancel scope around each call to `__anext__`, but then you lose all the benefits of cancellation.
By far the easiest approach is to push the async generator logic into a background task which communicates the values to be yielded over a memory channel. That way, you can cancel waiting for the value without cancelling obtaining the value. There's an adapter I wrote in #638 that lets you do this still with async generator syntax: https://github.com/python-trio/trio/issues/638#issuecomment-431954073
The remainder of this comment is for the case where you really don't want to do that, for whatever reason. Be warned, it gets gnarly. :-)
If you want the semantics where cancelling the generator will cause it to resume whatever async operation it was working on when it was cancelled, you have to write it yourself, in a way that contains a `yield` statement. As you've discovered, your options are not very abstraction-friendly, because Python native async generators don't support `yield from`.
But... there is an evil brittle hack that you can use to suspend an async generator from within an async function that it calls:
```python
import types, inspect

async def wrapper():
    holder = [None]
    while True:
        holder.append((yield holder.pop()))

co = wrapper.__code__
# on 3.8+:
wrapper.__code__ = co.replace(co_flags=co.co_flags | inspect.CO_COROUTINE)
# equivalent on pre-3.8, doesn't work on 3.8+:
#wrapper.__code__ = types.CodeType(co.co_argcount, co.co_kwonlyargcount, co.co_nlocals, co.co_stacksize, co.co_flags | inspect.CO_COROUTINE, co.co_code, co.co_consts, co.co_names, co.co_varnames, co.co_filename, co.co_name, co.co_firstlineno, co.co_lnotab, co.co_freevars, co.co_cellvars)

wrapper = wrapper()
wrapper.send(None)

@types.coroutine
def yield_(value):
    yield wrapper.send(value)
```
Then `await yield_(foo)` works like `yield foo`, but you can call it from an async function that the async generator calls, not just from the async generator directly. See https://github.com/python-trio/async_generator/pull/16 for more on this monstrosity. It works on current versions of both CPython and PyPy, but it might crash on PyPy if you use debugging tools in certain ways, and it's very tied to implementation details that might change in the future.
If you don't like evil brittle hacks, you can use the `@async_generator` backport package for your async generator instead of making it a native one. Its `await yield_()` also supports this sort of thing (but only for its own async generators, not for the native ones).
Whichever approach you take, you can now write an abstraction like you've described:
```python
class MyResilientAsyncGenerator:
    def __init__(self, *args, **kw):
        self._aiter = self._logic(*args, **kw)

    def __aiter__(self):
        return self

    async def __anext__(self):
        with trio.CancelScope(shield=True) as self._scope:
            value = await self._aiter.__anext__()
        if isinstance(value, trio.Cancelled):
            raise value
        else:
            return value

    async def aclose(self):
        await self._aiter.aclose()

    async def _restart_point(self, afn, *args, **kw):
        while True:
            self._scope.shield, prev_shield = False, self._scope.shield
            try:
                return await afn(*args, **kw)
            except trio.Cancelled as exc:
                await yield_(exc)
            finally:
                self._scope.shield = prev_shield

    async def _logic(self):
        # Your previous asyncgen goes here.
        # Use 'await self._restart_point(fn, *args)' instead of 'await fn(*args)' for the
        # cancellable-retry-on-resume operations. All other operations are uncancellable.
        pass
```
This is untested. If you wind up using it, I'd be curious for feedback!
So here's a surprising thing that can happen in trio right now: if you have a generator (or async generator, doesn't matter) that yields inside a cancel scope, then the cancel scope remains in effect until you resume the generator and exit it.
For context managers, this is actually what you want, and we make use of this in lots of places, e.g.:
So here we have: first `fail_at` runs for a bit, and it starts a cancel scope, and then it yields. And then the body of the `with` block runs with the cancel scope in effect, even though `fail_at` is not on the stack. And then `fail_at` is resumed, and it exits the cancel scope. A little confusing when you think about the details, but basically this is safe and useful and ultimately does what you'd expect.
However, in general this can cause strange behavior, and also crashes:
Notice that even after we leave the loop and abandon the generator, the cancel scope is still in effect... and then the program crashes because trio detects that the cancel scope stack has become inconsistent (it tries to pop a scope that isn't at the top of the stack).
This all also applies to using nurseries inside async generators, because nurseries have an implicit cancel scope.
There are also some more generic issues with cleaning up (async) generators that call trio APIs after yielding, but that's another issue (specifically, issue #265). The cancel scope leakage causes problems even while the generator is alive, and even if we had PEP 533 then this would still be an issue.
Really the ideal solution here would be to error out if someone tries to `yield` through a cancel scope, IFF they're not implementing a context manager. I don't think there's any way to do this though without PEP 521, and even then we'd have to figure out somehow whether this particular generator is implementing a context manager. (I guess some annoying thread-local thing could work – e.g. make `with trio.this_is_a_context_manager: ...` which sets some internal flag that disables the PEP 521-based checks.) I don't currently have any other ideas how to improve this, except to put some prominent warning in the documentation.