njsmith opened this issue 7 years ago (status: open)
OK, I think I see your argument now, and given the leading `await` you should be able to make it work even within the constraints of the LL(1) parsing restriction (it isn't substantially different from `def ...(..., ..., ...)` in that regard, especially if you restrict the intervening subexpression to being a name lookup). That said, while it doesn't read as well, you could likely more easily experiment with an `await from foo(1, 2, 3)` formulation that switched on the `from` keyword, rather than on the `await` subexpression being a call construct.
One of the features I most like about Python is its flexibility. I would not want Python to be modified in a way that forces me to put an `await` on every instantiation of a coroutine. Specifically, I want this to work:

```python
extracted = foo()
...
await extracted
```
@dabeaz: @njsmith isn't proposing breaking that, he's just proposing to have it mean something different from `await foo()`, just as `extracted = 1, 2, 3; foo(extracted)` already means something slightly different from `foo(1, 2, 3)`.
Extending the analogy to other forms:

- `foo(1, 2, 3)` can be expanded as `extracted = 1, 2, 3; foo(*extracted)`
- `extracted = 1, 2, 3; foo(extracted)` is actually equivalent to `foo((1, 2, 3))`

Comparable equivalents for @njsmith's proposal might then look like:

- `await foo(1, 2, 3)` could be expanded as `extracted = partial(foo, 1, 2, 3); await from extracted`
- `extracted = foo(1, 2, 3); await extracted` would become equivalent to `await (foo(1, 2, 3))`
And putting it that way, I do think the right place to start is to figure out what the "arg expansion" equivalent for @njsmith's proposal would actually look like, and I think `await from expr` with a suitably updated `functools.partial` implementation is a promising candidate (due to the `yield from` precedent).
I see that my plan of leaving those notes at the bottom of this thread so I can find them again easily when I revisit this next year is not happening :-)
@dabeaz: even in my ideal world, the only thing that code would have to change is that you'd write `extracted = foo.__acall__()` (or however we ended up spelling it) to explicitly signal that you intentionally left off the `await` and really do want the coroutine object rather than the result of calling `foo`. But the proposal described above doesn't even go that far: all it proposes is to let people writing async functions opt in to requiring the explicit `.__acall__()` if they want, so it wouldn't affect you at all unless you decided to add this to curio as a feature. (And I guess if it's implemented and is wildly successful then we might get consensus to do a long deprecation period and then switch the default to requiring the explicit `__acall__`, and even then libraries could still opt out. But I don't think you need to worry about that right now!)
Really, the proposal is just to provide a simple and reliable way to let async functions know whether or not they got called with `await`, and then do whatever they want with this information. It's actually more flexible than what we have now, and tbh seems like the sort of thing you might be able to find some terrifying use for...
@njsmith It's probably the opposite of comforting, but my writing PEP 432 was originally motivated by getting `__main__` to play nice with the pure Python import system in 3.3, and we've only just begun to move forward with the implementation as a private startup refactoring for 3.7 :)
I was thinking some more about a possible stopgap measure that we might be able to sneak into 3.7 to make something like #176 fast. Basically the idea would be to get just enough help from the interpreter to make checking for unawaited coroutines fast enough that we can afford to do it at every context switch.
(There was some previous discussion of these ideas in #176 starting here: https://github.com/python-trio/trio/pull/176#issuecomment-304490423)
In more detail, the requirements would be:
- At a "barrier" (= yield point or catch-all exception handler), reliably and quickly detect whether there are any unawaited coroutines created since the last barrier
- If so, get a list of them so we can give a nice error message
- Also, we want to be able to disable this checking on a per-task basis, because some tasks might be "hosting" asyncio code (#171). So we need to be able to "consume" the list of unawaited coroutines -- if one task creates some and we think that's OK and leave them be, then we don't want to re-detect those coroutines the next time we check.
There are two fairly natural ways to do this that come to mind. They both involve the same basic strategy: adding two pointers to each coroutine object, which we can use to create a doubly-linked list holding the coroutines of interest (with the list head stored in the threadstate). The nice thing about an intrusive doubly-linked list like this is that it's a collection where adding and removing are both O(1) and very cheap (just updating a few pointers).
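To make the O(1) claim concrete, here's a minimal pure-Python sketch of the intrusive list (in the real thing the two link fields would be raw pointers on the C-level coroutine object, and the sentinel would live in the threadstate; `Node`, `IntrusiveList`, and `drain` are illustrative names, not proposed API):

```python
class Node:
    """Stand-in for a coroutine object carrying two intrusive link fields."""
    __slots__ = ("prev", "next", "name")

    def __init__(self, name):
        self.name = name
        self.prev = self.next = None

class IntrusiveList:
    """Doubly-linked list with a circular sentinel head."""

    def __init__(self):
        self.head = Node("<head>")
        self.head.prev = self.head.next = self.head

    def insert(self, node):
        # O(1): four pointer updates, no allocation.
        last = self.head.prev
        last.next = node
        node.prev = last
        node.next = self.head
        self.head.prev = node

    def remove(self, node):
        # O(1): the node unlinks itself without searching.
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = None

    def drain(self):
        # "Get and clear": collect everything, then reset to empty.
        items = []
        n = self.head.next
        while n is not self.head:
            items.append(n)
            n = n.next
        self.head.prev = self.head.next = self.head
        for it in items:
            it.prev = it.next = None
        return items
```

Because each node carries its own links, a coroutine can remove itself on first iteration or on deallocation without knowing anything except its own pointers.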
Option 1: Keep a thread-local list of live, unawaited coroutines. So coroutines always insert themselves into the list when created, and then remove themselves again when they're either iterated or deallocated.

Then we need:

- A way to get and clear the list of coroutines: `sys.get_and_clear_unawaited_coroutines()`
- A way to register a hook that's called whenever an unawaited coroutine is garbage collected: `sys.set_unawaited_coroutine_gc_hook`, `sys.get_unawaited_coroutine_gc_hook`
```python
def unawaited_coroutine_gc_hook(coro):
    _gced_unawaited_coros.add(coro)

def barrier():
    live_unawaited_coros = sys.get_and_clear_unawaited_coros()
    if _gced_unawaited_coros or live_unawaited_coros:
        # do expensive stuff here, checking if this task is hosting asyncio, etc.
        ...
```
(We need to have both hooks to handle the two cases of (a) a coroutine is created and then immediately garbage collected in between barriers, (b) a coroutine is created and then still live when we hit the barrier.)
For speed, we might want to also have a `sys.has_unawaited_coros()` that returns a bool, so that we don't need to bother calling `get_and_clear_unawaited_coros` every time and pay the price of allocating an empty list object just to throw it away. `%timeit bool([])` says 127 ns and `%timeit get_coroutine_wrapper()` (used here as a proxy for a trivial function that just returns a value from the threadstate) says 37 ns, so it's faster, but they're both fast enough that I don't know whether it matters without some kind of testing.
A minor awkward question here is whether the regular unawaited coro warning would be issued at the same time our hook is called, or whether we'd have some way to suppress it, or...
Oh wait, there's also a more worrisome problem: if we see a coroutine is unawaited and live, and decide that it's OK... what happens if it later gets garbage collected and is still unawaited? We'll see it again, but now in some random other context. I guess what we could do is that when we detect live unawaited coroutines and we're in asyncio-friendly mode, we put them into a `WeakSet` (or even a `WeakDict` to track which task created them), and then our gc hook checks to see if they're in the `WeakSet`. Except... I think `__del__` might run after weakrefs are cleared? If so then this whole approach is probably sunk. Or maybe we could somehow mark the coroutines that we've seen before? You can't attach arbitrary attributes to coroutines. I suppose we could do some nasty hack involving the coroutine's locals... those should still be accessible from `__del__`.
Option 2: In the design above, the need to handle GCed unawaited coroutines is the cause of lots of different kinds of awkwardness. What we could do instead is make our tracking list hold strong references, so that coroutines insert themselves when created, and then stay there either until they're iterated or else we remove them manually.
Obviously this can't be enabled by default, because it would mean that instead of warning about unawaited coroutines, the interpreter would just leak them. So the API would be something like:
`sys.enable_unawaited_coroutine_tracking()`, `sys.disable_unawaited_coroutine_tracking()`, `sys.get_and_clear_unawaited_coroutines()`

And then everything else is pretty straightforward and obvious. (Though we still have the question about whether we'd want a `sys.has_unawaited_coroutines()` for speed.)
I asked @arigo about how these options look from the point of view of PyPy. He said that if `sys.get_and_clear_unawaited_coros()` was written carefully, then it would probably be possible for the JIT to inline and optimize out the unnecessary empty list, so that's a useful data point. (Though I guess it might still be better to save it the trouble of needing to, esp. since you're not always in JIT mode.)
He also had a strong preference for the second API on the general grounds that doing anything from `__del__` methods is just asking for trouble and creates all kinds of problems for alternative interpreter implementations. Obviously they support it, but it's not cheap. In this particular case it's not a huge practical difference currently, b/c coroutines already have a `__del__` method (the one that prints the `coroutine '...' was never awaited` message, or potentially throws `GeneratorExit` -- though PyPy is clever enough to avoid generating a `__del__` method just for the latter if there are no yield points with exception handlers). But he felt that option 2 was just simpler from an abstract "let's try to make it easier rather than harder to write a python implementation" perspective. (And who knows, maybe there's some chance we could someday get rid of `coroutine '...' was never awaited`.)
Nick points out that this kind of list structure might also be useful for more general introspection, similar to `threading.enumerate()`: https://github.com/python-trio/trio/pull/176#issuecomment-304492602
Curio, possibly. Not sure, would have to check with Dave.
I think pytest-asyncio and similar libraries might find this useful to easily and reliably catch unawaited coroutines and attribute them to the right test. [Edit: asked pytest-asyncio here: https://github.com/pytest-dev/pytest-asyncio/issues/67 ]
Forgetting to put await on coroutines is not a problem that I'm concerned about. It is unlikely that I would use this in Curio.
As far as the `__del__` comments go, coroutine objects are hashable, so `WeakSet` seems like it would be a better fit for this purpose than doing custom state manipulation in `__del__`.
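For what it's worth, coroutine objects are indeed both hashable and weak-referenceable today, so `WeakSet` membership tracking works as described -- a minimal demonstration (closing the coroutine before dropping it just avoids the "never awaited" warning):

```python
import gc
import weakref

async def f():
    pass

seen = weakref.WeakSet()
coro = f()
seen.add(coro)        # coroutines are hashable and support weak references
assert coro in seen

coro.close()          # avoid the "never awaited" RuntimeWarning on GC
del coro
gc.collect()
assert len(seen) == 0  # the WeakSet entry vanished with the coroutine
```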
I'll also note that everything except the "already GC'ed before the next check for unawaited coroutines" case can already be handled by calling `sys.set_coroutine_wrapper` with something like (not tested, but should be in the right general direction):
```python
import functools
import threading
import types
import weakref

_coro_tracking = threading.local()
_coro_tracking.created = weakref.WeakSet()
_coro_tracking.started = weakref.WeakSet()

def track_coroutines(coro_def):
    @functools.wraps(coro_def)
    def make_tracked_coro(*args, **kwds):
        inner_coro = coro_def(*args, **kwds)
        _coro_tracking.created.add(inner_coro)
        @types.coroutine
        def tracked_coro():
            # First resumption: graduate from "created" to "started",
            # then delegate to the real coroutine.
            _coro_tracking.started.add(inner_coro)
            _coro_tracking.created.remove(inner_coro)
            return (yield from inner_coro)
        return tracked_coro()
    return make_tracked_coro
```
That way, the interaction with the GC would just be normal `WeakSet` interaction, while the interaction with coroutine definitions would use the existing wrapper machinery (which `trio` is presumably hooking into already).

If you instead want something more like your second case, then switch the `created` set to being a regular set, such that created coroutines can't be GC'ed before they're seen by the event loop. Whether or not to configure that by default would be up to the author of the wrapper function.
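The created/started bookkeeping can also be exercised today by applying the wrapper as an ordinary decorator (worth noting: `sys.set_coroutine_wrapper` was deprecated in 3.8 and has since been removed, so the decorator route is the only one on modern Pythons). A self-contained sketch, with the wrapper repeated so it runs standalone and the resulting coroutine driven by hand:

```python
import functools
import types
import weakref

created = weakref.WeakSet()   # instantiated but never iterated
started = weakref.WeakSet()   # iterated at least once

def track_coroutines(coro_def):
    @functools.wraps(coro_def)
    def make_tracked_coro(*args, **kwds):
        inner_coro = coro_def(*args, **kwds)
        created.add(inner_coro)
        @types.coroutine
        def tracked_coro():
            # First resumption: graduate from "created" to "started".
            started.add(inner_coro)
            created.remove(inner_coro)
            return (yield from inner_coro)
        return tracked_coro()
    return make_tracked_coro

@track_coroutines
async def f():
    return "done"

coro = f()
assert len(created) == 1      # made, but unawaited so far

# Drive it by hand (it finishes without suspending):
try:
    coro.send(None)
    result = None
except StopIteration as exc:
    result = exc.value

assert result == "done"
assert len(created) == 0      # it was iterated, so no complaint due
```

A barrier check then reduces to inspecting `created`: anything still in it at a context switch is a coroutine that was made but never awaited.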
@ncoghlan Yeah, but my concern is that we won't be able to make the speed workable like that. Since this adds per-call and per-context-switch overhead, it's probably worthwhile trying to get it as low as possible. But one reason for working through the thoughts here before taking something to bpo is to figure out exactly what semantics we'd want, so we can prototype it with `set_coroutine_wrapper` and see what happens :-).
`yield from` should get your regular context switch overhead back to near-zero, so you're mainly looking at a bit of extra function call overhead per coroutine created. You also have the option of putting the enablement of the wrapper behind an `if __debug__:` guard.

Anyway, probably the most useful feedback is that your suggested API option 2 is likely the most viable approach, where the default behaviour is to use a `WeakSet` (so only non-GC'ed references are checked, which is sufficient for the `result = unawaited_coro()` case), and you have an opt-in toggle to switch to using a regular set instead (allowing you to also check for the bare `unawaited_coro()` case).
Meanwhile, back in the subthread about possibly making `await ...(...)` a single piece of syntax:

> OK, I think I see your argument now, and given the leading `await` you should be able to make it work even within the constraints of the LL(1) parsing restriction

I was nervous about this b/c I don't know a lot about parsing, but actually it looks trivial: the grammar rule that currently handles `await` expressions is `atom_expr: [AWAIT] atom trailer*`, where `AWAIT` is the `await` token and `trailer*` is the function call parentheses (among other things). So in fact `await ...(...)` already is a single piece of syntax as far as the parser is concerned; it's only split into `await ...` and `...(...)` during the concrete syntax → AST transformation.
Another reason it would be useful to allow `foo` to tell whether it was called synchronously (`foo(...)`) or asynchronously (`await foo(...)`): it would make it possible to transition a function from synchronous to asynchronous or vice-versa, with an intermediate period where both work and one emits a `DeprecationWarning`. Right now this is mostly impossible outside of some special cases. (I think you can do it iff you have a function that has no return value and is going async→sync.)
(I ran into this with trio's `bind` method, see #241, but I expect it will be much much more common in downstream libraries.)
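To make the special case concrete: a no-return-value function going async→sync can return an awaitable shim, so both the new plain call and the legacy `await foo()` keep working during the deprecation period. A rough sketch (the names `_AwaitableCompat` and `foo` are illustrative, not from trio):

```python
import warnings

class _AwaitableCompat:
    """Returned by a no-return-value function that used to be async.

    New code calls foo() plainly and ignores this object; old code that
    still writes ``await foo()`` keeps working, but gets a warning.
    """

    def __await__(self):
        warnings.warn(
            "awaiting foo() is deprecated; call it synchronously",
            DeprecationWarning,
            stacklevel=2,
        )
        return
        yield  # unreachable; makes __await__ a generator as required

def foo():
    # ... do the actual work synchronously here ...
    return _AwaitableCompat()
```

This is exactly why the trick only covers functions with no return value: the sync caller discards the shim, and awaiting it produces `None`, so neither calling convention can observe a meaningful result.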
Note to self: think through a version of the `__awaitcall__` idea where there's something like a context-local `set_coroutine_wrapper`, but it's only invoked by `__call__`, not `__awaitcall__`, so `await asyncfn(...)` stays just as fast, but the coroutine runner can make `async def` functions called normally return `Future`s on asyncio and `Deferred`s on twisted -- thus bringing their async/await usability in line with that of C#/JavaScript -- and trio can make `async def` functions called normally raise an error.

Some things to watch out for: wrapper functions, `functools.partial`; preserving `await fut` + `await sync_fn_returning_future()`; preserving `asyncio.run(asyncfn())`; fast access to `__awaitcall__` from C and Python (the latter requiring the preservation of funcall optimizations) -- maybe `__awaitcall__` has an unusual type slot like `PyObject* (*awaitcall)(void)`?
I think that in more than 90% of code, we just write `await` before launching async functions. I think it would be better to just omit `await` and wait for the result of async functions automatically. If a task needs to be spawned, a nursery can be used. I think this could be implemented somehow: since a call to an async function currently returns just a coroutine, maybe coroutines could be made self-executing, launching right after they are created.
I think it is important to note that a major reason why most `async` code is just a regular `await coro()` may be that the ecosystem for `async` is seriously underdeveloped at this point. In other words, the bare `await coro()` may not stay as dominant as it is now.
For functions, methods and generators, it is accepted practice to treat them as first-class objects -- passing them around, wrapping them, storing them. People generally learn pretty quickly that `func`, `obj.meth` and `gen()` are "things" that fit well into `partial`, `sorted`, `enumerate`, producing yet again "things". This is helped by the standard library shipping with all of these wrappers and helpers that make it natural to experience functions, methods and generators as "things" on par with other "things".
Right now, everything `async` is very clearly a different breed of "thing". The most obvious case is that `enumerate`, `map`, `filter` do not work on async iterables, which makes common sync patterns very painful to do async. There is the subtle case that coroutines do not work as `map`/`reduce`/... operations or `property` setters, and they don't have literals (`lambda`) -- which means there is simply not much else to do with coroutines than `await` them.
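As a concrete example of the gap: the async counterpart of something as basic as `enumerate` has to be hand-rolled, since there is no stdlib `aenumerate` (the name is made up here):

```python
import asyncio

async def aenumerate(aiterable, start=0):
    # Hand-rolled async counterpart of the built-in enumerate().
    i = start
    async for item in aiterable:
        yield i, item
        i += 1

async def letters():
    for ch in "abc":
        yield ch

async def demo():
    return [pair async for pair in aenumerate(letters())]

pairs = asyncio.run(demo())  # [(0, 'a'), (1, 'b'), (2, 'c')]
```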
The only case where `async` and sync are on equal footing is `()` -- which `trio` also relies on for every case where `partial` is recommended. Further special-casing `await ...()` versus `await ...` and `...()` will likely make it even more difficult to treat coroutines like things equal to other constructs.
[This issue was originally an umbrella list of all the potential Python core changes that we might want to advocate for, but a big discussion about `await` sprouted here, so I've moved the umbrella issue to: #103]

Original comment:

Can we do anything about how easy it is to forget an await? In retrospect async functions shouldn't implement `__call__` IMO... but probably too late to fix that. Still, it kinda sucks that one of the first things in our tutorial is a giant warning about this wart, so it's worth at least checking if anyone has any ideas...