Open ncoghlan opened 7 years ago
As covered in bpo-31183, correctly disassembling functions, generator-iterators, coroutines, and async generator-iterators currently requires checking for 4 different attributes:
This could be simplified if the latter three all also exposed their code attributes as __code__
.
Some possibly helpful background (adapted from a discussion in PR 3077):
It looks as though gi_code
was added to generators in bpo-1473257. At this time, function bytecode was still stored in f.func_code
, so gi_code
was a clear analogy.
My best guess is that gi_code
was not chosen to deliberately create a namespace difference between functions and generators, but just because it wouldn't have made sense to add a func_code
attribute to generators. In other words, if functions had had code_object
instead of func_code
at the time, bpo-1473257 might well have proposed code_object
for generators. This doesn't, of course, mean that there might not be some post hoc reason to distinguish this part of the namespace now.
Then func_code
was changed in 3.0 to __code__
as part of the f.func_X
to f.__X__
renaming. The 3.0 whatsnew explained that the purpose of this change was to "free up [the f.func_X] names in the function attribute namespace for user-defined attributes". But this wasn't done for the analogous code object in generators. On a quick look, I didn't find any discussion of this at the time, but that doesn't mean there wasn't any.
A related issue (since this issue is contemplating restructuring these objects anyway):
Other than ??_code
, none of the f.func_X
attributes which were eliminated in 3.0 have direct equivalents in generator-iterators, coroutines, or async generator-iterators.
However, all three of those have [gi|cr|ag]_running
and [gi|cr|ag]_frame
attributes. Generator-iterators also have gi_yieldfrom
, and coroutines and async generator-iterators have [cr|ag]_await
.
On a clean slate (with no attention paid to issues around breaking changes or how these attributes are already used in existing code), is there an argument for dundering some or all of these along with __code__
? Or is special casing a better pattern when dealing with these other attributes?
I think there's a strong case for a generic __frame__ attribute, since there are plenty of useful things you can do given "object with a linked frame" as a protocol (like extract the current locals namespace). We may even want to include traceback objects in that particular attribute access consolidation (since they have a tb_frame attribute).
Looking at the way inspect.getgeneratorstate() and inspect.getcoroutinestate() are implemented, I also think there may be a case for consolidating [gi|cr|ag]_running
into a single __running__
attribute, such that for objects that provide a __frame__
attribute:
__running__
doesn't exist -> not an asynchronous operation__running__ = True
-> running asynchronous operation__running__ = False
-> paused or halted asynchronous operationThen, rather than adding an inspect API specifically for async generators, we could add a general "inspect.getasyncstate()" one that covered all the object types, with the possible states:
ASYNC_RUNNING: __running__ == True, assert __frame__ is not None
ASYNC_CREATED: __running__ == False, __frame__.f_lasti == -1
ASYNC_CLOSED: __running__ == False, __frame__ is None
ASYNC_SUSPENDED: __running__ == False, neither closed nor created
Properly interpreting gi_yieldfrom
and [cr|ag]_await
is a bit trickier, since they require additional knowledge of how the control flow for those objects actually work. However, if we run with the "async operation introspection protocol" idea, then a suitably generic name would be __async_call__
, and we could make it a doubly-linked list for the coroutine and async generator case by setting an __async_return__
attribute on the target (to say where we expect control to return to when we're done).
Actually, __async_return__ would be applicable for the synchronous generator case as well, since it would link to the generator containing the yield from
expression.
Slight amendment: __delegated_to__
and __returns_to__
might be better names for the doubly-linked list of the async call chain.
"__async_call" and "__async_return" both have the problem that they look like imperative commands (since "call" and "return" are typically used as verbs), and collection referring to "yield from" and "await" as "asynchronous delegation" would help avoid potential confusion with regular synchronous function calls.
I've also retitled the issue to cover the broader scope that also addresses the needs of the inspect module, not just the dis module.
I like where this is heading! Aside from the cleaner patterns for handling these objects, I think it'll make it a little easier for people who are just starting to use asynchronous objects in Python (e.g. me) to grasp what's similar about them.
+1 that __async_call__
could be confusing, for the general reason you mention, but in particular because it would look exactly analogous to function.__call__()
(what the French call a "faux ami").
A not-thought-out alternative: what about noun-ing the verbs into __async_calls__
and __async_returns__
(or maybe __async_returns_to__
)?
BTW, I was thinking of taking a quick run at bpo-31197 while this simmers. I don't /think/ that risks being eventually mooted by these changes; if anything, it might be easier to adapt dis
to this proposal if that refactoring has already been done. LMK if you think that's the wrong order, though.
I agree https://bugs.python.org/issue31197 is orthogonal - refactoring the current logic is useful regardless of what we do at the object attribute layer.
Regarding __delegated_to/returns_to__, the reason I like *not* having "async" in the attribute names is that generators are actually used as a synchronous construct (via synchronous for loops and next() calls), even though they technically define an asynchronous operation from the interpreter's point of view.
The reason I like omitting "call" (or "calls") from the name is that even though "yield from" and "await" are both sometimes described as providing syntax for asynchronous calls, that terminology is really only defensible for "await" - PEP-380 describes the "yield from" operation as delegating to a subgenerator, and I think that's a better way of framing the general concept.
I started a local PR at https://github.com/ncoghlan/cpython/pull/1/files to explore what this might look like in practice.
I think that what I've done so far shows that generic __frame and __running attributes would be sufficient to extend the inspect module's state introspection support to also cover async generators, while also allowing for the soft (i.e. documentation only) deprecation of the generator and coroutine specific variants of those APIs.
bpo-31183 already showed the potential value of a __code__ attribute, since it would allow all of the asynchronous operations to be handled by the same code path that already handles functions.
I'm less sure about __delegated_to/returns_to__, since we don't have *any* code in the standard library that reads gi_yieldfrom, and the only code that reads cr_await is a Python 3.5 compatibility hack in asyncio.
I started a local PR at https://github.com/ncoghlan/cpython/pull/1/files to explore what this might look like in practice.
Looks good to me.
I'm less sure about __delegated_to/returns_to__, since we don't have *any* code in the standard library that reads gi_yieldfrom, and the only code that reads cr_await is a Python 3.5 compatibility hack in asyncio.
I'm -1 on this too. gi_yieldfrom and cr_await are very special and rarely used things. Refactoring or unifying them isn't trivial and ultimately not as useful.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['type-feature', '3.7']
title = 'Define a general "asynchronous operation introspection" protocol'
updated_at =
user = 'https://github.com/ncoghlan'
```
bugs.python.org fields:
```python
activity =
actor = 'yselivanov'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation =
creator = 'ncoghlan'
dependencies = []
files = []
hgrepos = []
issue_num = 31230
keywords = []
message_count = 10.0
messages = ['300468', '300471', '300472', '300475', '300476', '300477', '300517', '300564', '300580', '300587']
nosy_count = 3.0
nosy_names = ['ncoghlan', 'yselivanov', 'syncosmic']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue31230'
versions = ['Python 3.7']
```