Closed oremanj closed 6 years ago
Merging #14 into master will not change coverage. The diff coverage is
100%
.
@@ Coverage Diff @@
## master #14 +/- ##
=====================================
Coverage 100% 100%
=====================================
Files 7 7
Lines 776 817 +41
Branches 59 70 +11
=====================================
+ Hits 776 817 +41
Impacted Files | Coverage Δ | |
---|---|---|
async_generator/_tests/test_async_generator.py | 100% <100%> (ø) |
:arrow_up: |
async_generator/_impl.py | 100% <100%> (ø) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 17521f8...6aa8b84. Read the comment docs.
gentle ping
Sorry for the delay!
This is some impressive hacking. But I have to admit, I'm not as impressed with the whole idea as I was when I first wrote it... it's a lot of complexity for a cute gimmick that only works on CPython :-). So while the code looks fine on a quick once-over, I'm not sure if we actually want the functionality or not. (Sorry if the comments misled you here, I had sort of forgotten about those...) I'd be interested in any thoughts on this -- for example, do you have a use case? I assume you had some reason for digging into this :-)
I didn't have a specific reason at first - I came across the commented-out code and thought, "this sounds fun to work out!", so I did.
Thinking about it some more... I think the only real utility is if we could use this to make @async_generator automatically produce a native async generator on 3.6+, which would have both interoperability and performance improvements. Interoperability is of limited use given that native async generators can't use "yield from", and we may or may not care about the performance - I know your feelings about microbenchmarks. :-) Since some numbers are better than none, though, I extended the microbenchmark from PEP 525 to compare the analogous @async_generator and two ways I thought of for making a looks-like-an-@async_generator-but-is-actually-native:
N = 10 ** 7
from async_generator import async_generator, yield_
# iterate over an async iterator and return the sum of the values provided
async def test(ait):
sum = 0
async for i in ait:
sum += i
return sum
# run a secretly-sync coroutine
def run(coro):
try:
coro.send(None)
except StopIteration as ex:
return ex.value
else:
raise RuntimeError("wasn't sync")
# native async generator
async def agen():
for i in range(N):
yield i
# current @async_generator without ctypes hackery
@async_generator
async def agen2():
for i in range(N):
await yield_(i)
# hypothetical way of making a native asyncgen out of an async function that
# yields values by await yield_native(val)
import types, ctypes, functools
dll = ctypes.CDLL(None)
dll._PyAsyncGenValueWrapperNew.restype = ctypes.py_object
dll._PyAsyncGenValueWrapperNew.argtypes = (ctypes.py_object,)
@types.coroutine
def yield_native(val):
yield dll._PyAsyncGenValueWrapperNew(val)
def native_async_generator(afn):
@functools.wraps(afn)
async def wrapper(*a, **kw):
if False:
yield None # force this to be an async generator
await afn(*a, **kw)
return wrapper
@native_async_generator
async def agen3():
for i in range(N):
await yield_native(i)
# another hypothetical way, more Windows-friendly
async def wrapper_maker():
value = None
while True:
value = yield value
# get an async generator object out
wrapper_maker = wrapper_maker()
# transmute it to a regular generator object
type_p = ctypes.c_ulong.from_address(id(wrapper_maker) + ctypes.sizeof(ctypes.c_ulong))
assert type_p.value == id(types.AsyncGeneratorType)
type_p.value = id(types.GeneratorType)
# now wrapper_maker.send(x) returns an AsyncGenWrappedValue of x
wrapper_maker.send(None)
@types.coroutine
def yield_native_2(val):
yield wrapper_maker.send(val)
@native_async_generator
async def agen4():
for i in range(N):
await yield_native_2(i)
# async iterator
class AIter:
def __init__(self):
self.i = 0
def __aiter__(self):
return self
async def __anext__(self):
i = self.i
if i >= N:
raise StopAsyncIteration
self.i += 1
return i
results:
In [2]: %time run(test(agen())) # native asyncgen
CPU times: user 1.5 s, sys: 0 ns, total: 1.5 s
Wall time: 1.5 s
Out[2]: 49999995000000
In [3]: %time run(test(agen2())) # @async_generator
CPU times: user 39.5 s, sys: 0 ns, total: 39.5 s
Wall time: 39.5 s
Out[3]: 49999995000000
In [4]: %time run(test(agen3())) # @native_async_generator + yield_native
CPU times: user 8.63 s, sys: 3.33 ms, total: 8.63 s
Wall time: 8.63 s
Out[4]: 49999995000000
In [5]: %time run(test(AIter())) # async iterator
CPU times: user 4.27 s, sys: 0 ns, total: 4.27 s
Wall time: 4.27 s
Out[5]: 49999995000000
In [6]: %time run(test(agen4())) # @native_async_generator + yield_native_2
CPU times: user 4.83 s, sys: 0 ns, total: 4.83 s
Wall time: 4.83 s
Out[6]: 49999995000000
Would you be receptive to a patch that makes @async_generator work like agen4 in this example on CPython 3.6+? I know the "transmute an async generator to a regular generator" operation is sketchy, but I looked through genobject.c and I'm pretty well convinced it's safe -- an async generator object is a generator object with a few extra fields tacked on at the end, only one of which is an object pointer (ag_finalizer) and that's NULL until the first call to asend()/athrow()/etc (which we never invoke on our wrapper_maker).
Wow, that's super clever. I like that it's much simpler than the current version of this hack, and that it can be motivated as an invisible optimization. (Maybe we wouldn't even want to document that yield_from_
works in native async generators.)
I suspect you'll run into a bit of trouble getting that to pass the current test suite, because we have some tests to check that we don't suffer from bpo-32526, and native async generators do suffer from that bug. Someone should probably fix that at some point... But I guess merely working as well as native async generators would be OK, and we could mark that test as xfail
for now (on configurations where the hack is used).
I think this will also cause behavioral differences with respect to the async generator GC hooks. Currently @async_generator
functions never interact with those hooks (unlike native async generators), and I think with your hack it will start using the GC hooks. This is arguably the right thing. There's some discussion in #13. It's definitely complicated enough that we'd want to add tests for it though! And once PyPy gets native async generator support then we'll probably need to also implement GC hook support inside AsyncGenerator
, since it won't be using this hack.
I've started working on implementing this for async_generator, and the biggest wrinkle I've run into is that native async generators don't support returning values other than None. Combined with the conversion of explicitly raised StopAsyncIteration exceptions into RuntimeErrors, this means there's (AFAICT) no way to make an @async_generator, implemented as above as a native async generator awaiting an async function that yields magic wrapper objects, return something. This breaks a few of the tests, mostly of yield_from_
. I wonder, though... do we care? Is anyone actually using the fact that async generators can return values? In the long run, does the async_generator project want to support things that Py3.6 doesn't support natively, or does it just want to provide a way for async-generator-using code to be Py3.5 compatible?
@oremanj Oh, I see, that's unfortunate.
In the long run, does the async_generator project want to support things that Py3.6 doesn't support natively, or does it just want to provide a way for async-generator-using code to be Py3.5 compatible?
Historically, it's been both a handy backport library, and also a laboratory for future technology... in particular, 3.6's native async generators were actually inspired by this library rather than vice versa (which is why the implementation is sufficiently similar to allow for these wacky hacks!), and the yield from
implementation is partly intended as a demonstration of how CPython could implement it in a future release.
I hate saying this, but: we need to keep the old implementation around anyway for 3.5 and for PyPy. Maybe we should expose both and let users pick? @async_generator(uses_return=True)
or something?
Merging #14 into master will not change coverage. The diff coverage is
100%
.
@@ Coverage Diff @@
## master #14 +/- ##
======================================
Coverage 100% 100%
======================================
Files 7 7
Lines 776 1064 +288
Branches 59 98 +39
======================================
+ Hits 776 1064 +288
Impacted Files | Coverage Δ | |
---|---|---|
async_generator/_tests/test_async_generator.py | 100% <100%> (ø) |
:arrow_up: |
async_generator/_tests/test_util.py | 100% <100%> (ø) |
:arrow_up: |
async_generator/_impl.py | 100% <100%> (ø) |
:arrow_up: |
async_generator/_tests/conftest.py | 100% <100%> (ø) |
:arrow_up: |
async_generator/__init__.py | 100% <100%> (ø) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 17521f8...faad1da. Read the comment docs.
Here's a substantial rewrite of the patch that I think accomplishes all of our goals:
I reported the pypy segfault: https://bitbucket.org/pypy/pypy/issues/2786/pypy3-interpreter-segfault-somewhere
This is kind of an unholy amalgamation of three separate changes at this point, so I'm going to close it and make new PRs for the separate changes.
sys.set_asyncgen_hooks()
); on 3.6+, our hooks are the same as the system ones, and regardless of version they can be accessed viaasync_generator.set_asyncgen_hooks()
and so on@async_generator
to produce a native async generator. Automatically detect whether this is possible (we need to be on CPython 3.6+ and the async function we were given needs to only ever return None). Since there are a few minor semantic differences, native mode currently must be requested using@async_generator_native
, and plain@async_generator
warns on code whose behavior would differ with _native. In a future release we can make _native the default, and require people who really need ag_running to work correctly (for example) to use@async_generator_legacy
.