universe-proton / universe-topology

A universal computer knowledge topology for all the programmers worldwide.
Apache License 2.0

The evolution of coroutine in Python #12

Open justdoit0823 opened 7 years ago


Coroutine

On Wikipedia, a coroutine is defined as follows.

Coroutines are computer-program components that generalize subroutines for non-preemptive multitasking, by allowing multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists and pipes.

In early Python, coroutines were not natively supported; there were only generators, which support yielding and resuming. Normal generators were used simply to return values one by one, rather than to yield to other generators as cooperative tasks. Two main methods, send and throw, resume a generator's execution with a specified value or exception. Later, some frameworks, such as Tornado, used generators to implement framework-level coroutines like tornado.gen.coroutine. However, early generators had a limitation: the yield keyword treats its expression as a plain value and cannot delegate control flow to it, even when the expression is itself a generator. Thus tornado.gen.coroutine had to do extra work and iterate the yielded generator's execution at the application level.
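A minimal sketch of that application-level work (the helper name run is illustrative, not Tornado's actual API, and modern Python is used for readability): when the driven generator yields another generator, the runner itself must iterate it and send its final value back.

```python
import types

def run(gen):
    """Drive a generator to completion, recursing into any generator it
    yields -- a rough sketch of what pre-`yield from` frameworks such as
    Tornado had to do at the application level."""
    result = None
    while True:
        try:
            yielded = gen.send(result)
        except StopIteration as exc:
            return exc.value
        if isinstance(yielded, types.GeneratorType):
            # The framework, not the interpreter, iterates the sub-generator.
            result = run(yielded)
        else:
            # In a real framework this would be a Future handed to the loop.
            result = yielded

def inner():
    yield 'a step'
    return 'inner result'

def outer():
    res = yield inner()     # plain yield: the runner must drive `inner`
    return res

print(run(outer()))         # prints: inner result
```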

Generator

A generator is a special function that can contain any number of yield points, at which execution can be suspended and later resumed. With this feature, we can implement coroutines and schedule cooperative tasks.
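As a sketch of such cooperative scheduling with plain generators (the names scheduler and worker are illustrative, not from any framework):

```python
from collections import deque

def worker(name, steps):
    # Each `yield` is a cooperative suspension point.
    for i in range(steps):
        yield f'{name}:{i}'

def scheduler(tasks):
    """Round-robin over generator tasks until all are exhausted."""
    queue = deque(tasks)
    order = []
    while queue:
        task = queue.popleft()
        try:
            order.append(next(task))   # run the task up to its next yield
            queue.append(task)         # re-queue it for another turn
        except StopIteration:
            pass                       # task finished, drop it
    return order

print(scheduler([worker('a', 2), worker('b', 2)]))
# prints: ['a:0', 'b:0', 'a:1', 'b:1']
```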

How generator works

A generator object has two main attributes: a frame object and a code object. Every generator has its own execution frame with its stack pointer variables and code. When yielding, the interpreter suspends the frame's execution and returns the yielded value, but keeps the frame object alive. When the generator is resumed with the send function, the interpreter pushes the argument of send onto the frame's stack and continues executing the code from the saved stack pointer.
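These attributes are visible on any generator object as gi_frame and gi_code. A quick look (exact f_lasti offsets vary across CPython versions, so only the relative ordering is shown):

```python
def gen():
    yield 1
    yield 2

g = gen()
print(g.gi_code.co_name)    # prints: gen  -- the generator's code object
g.send(None)
i1 = g.gi_frame.f_lasti     # bytecode offset where the frame is suspended
g.send(None)
i2 = g.gi_frame.f_lasti     # same frame object, suspended further along
print(i2 > i1)              # prints: True
```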

Generators in different versions

The early version generator

In [147]: def foo30():
     ...:     res = yield 123
     ...:     print(res)
     ...:

In [148]: f1 = foo30()

In [149]: f1.send(None)
Out[149]: 123

In [150]: f1.send('resume')
resume
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-150-3f7ca4dd39b2> in <module>()
----> 1 f1.send('resume')

StopIteration:

In [151]: def foo30_subgen():
     ...:     def sub_gen():
     ...:         yield 1
     ...:     res = yield sub_gen()
     ...:     print(res)
     ...:

In [152]: f2 = foo30_subgen()

In [153]: f2.send(None)
Out[153]: <generator object foo30_subgen.<locals>.sub_gen at 0x10dd35bf8>

In [154]: f2.send('resume gen')
resume gen
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-154-bfa6d0b86d34> in <module>()
----> 1 f2.send('resume gen')

StopIteration:

The yield from version generator

In [162]: def foo33_subgen():
     ...:     def sub_gen():
     ...:         res = yield 'sub_gen'
     ...:         print(res, 'in sub_gen')
     ...:         return 'sub_return'
     ...:     res = yield from sub_gen()
     ...:     print(res, 'in main gen')
     ...:
     ...:

In [163]: f3 = foo33_subgen()

In [164]: f3.send(None)
Out[164]: 'sub_gen'

In [165]: f3.send('sub value')
sub value in sub_gen
sub_return in main gen
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-165-14258efb8e4b> in <module>()
----> 1 f3.send('sub value')

StopIteration:

In Python 3.3, the yield from syntax was introduced, which eliminates the above limitation and supports chained generator execution.

When yield from is used, it treats the supplied expression as a subiterator. All values produced by that subiterator are passed directly to the caller of the current generator’s methods. Any values passed in with send() and any exceptions passed in with throw() are passed to the underlying iterator if it has the appropriate methods. If this is not the case, then send() will raise AttributeError or TypeError, while throw() will just raise the passed in exception immediately.

When the underlying iterator is complete, the value attribute of the raised StopIteration instance becomes the value of the yield expression. It can be either set explicitly when raising StopIteration, or automatically when the sub-iterator is a generator (by returning a value from the sub-generator).

Changed in version 3.3: Added yield from to delegate control flow to a subiterator.
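Driving a sub-generator by hand shows where the yield from value comes from: the return value travels inside the StopIteration exception. A small sketch:

```python
def sub():
    yield 'step'
    return 'result'             # becomes StopIteration('result')

# Manual iteration: catch StopIteration and read .value ourselves.
s = sub()
next(s)
try:
    next(s)
except StopIteration as exc:
    print(exc.value)            # prints: result

def outer():
    value = yield from sub()    # yield from unpacks .value for us
    yield value

g = outer()
print(next(g))                  # prints: step
print(next(g))                  # prints: result
```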

This is a syntax enhancement in the Python interpreter, backed by new opcodes: YIELD_FROM (added in 3.3) and GET_YIELD_FROM_ITER (added later, in 3.5). The following is a bytecode comparison.

Early version generator bytecode

8 LOAD_FAST                0 (sub_gen)
10 CALL_FUNCTION            0
12 YIELD_VALUE
14 STORE_FAST               1 (res)

And yield from version generator bytecode

 8 LOAD_FAST                0 (sub_gen)
 10 CALL_FUNCTION            0
 12 GET_YIELD_FROM_ITER
 14 LOAD_CONST               0 (None)
 16 YIELD_FROM
 18 STORE_FAST               1 (res)

The yield syntax does not interpret the expression's value; it returns that same value to the generator's caller. In contrast, yield from delegates control flow to the expression when it is a subiterator, such as another generator. This is the main difference. With the new opcode YIELD_FROM, things become much simpler.
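The delegation that YIELD_FROM performs can be approximated in pure Python. This is a simplified, send-only sketch in the spirit of PEP 380's expansion; throw() and close() handling are omitted, and the helper name delegate is made up for illustration:

```python
def delegate(sub):
    """Roughly what `res = yield from sub` does, ignoring throw()/close()."""
    value = None
    while True:
        try:
            yielded = sub.send(value)      # step the subiterator
        except StopIteration as exc:
            return exc.value               # its return value ends delegation
        value = yield yielded              # pass through; collect sent value

def sub_gen():
    got = yield 'a'
    return got.upper()

def main():
    res = yield from delegate(sub_gen())
    yield res

g = main()
print(g.send(None))     # prints: a
print(g.send('hi'))     # prints: HI
```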

Coroutine in Python

asyncio

In Python 3.4, asyncio was introduced to provide a native asynchronous I/O API, along with a new decorator, asyncio.coroutine, which marks a generator function written with yield from as a coroutine.

def test_asyncio_coroutine(seconds):
    import asyncio

    @asyncio.coroutine
    def foo_coroutine(n):
        yield from asyncio.sleep(n)
        print('sleep', n, 'seconds')

    loop = asyncio.get_event_loop()
    print('start at', loop.time())
    loop.run_until_complete(foo_coroutine(seconds))
    print('finish at', loop.time())

test_asyncio_coroutine(3)

For more details, see the asyncio documentation.

Native coroutine

In Python 3.5, coroutines became natively supported with the async and await syntax. Now we can write a coroutine without asyncio.coroutine and simplify the definition as follows.

def test_asyncio_coroutine(seconds):
    import asyncio

    async def foo_coroutine(n):
        await asyncio.sleep(n)
        print('sleep', n, 'seconds')

    loop = asyncio.get_event_loop()
    print('start at', loop.time())
    loop.run_until_complete(foo_coroutine(seconds))
    print('finish at', loop.time())

test_asyncio_coroutine(3)

Compared to yield from, there is only one new opcode, GET_AWAITABLE, which works in a similar way to GET_YIELD_FROM_ITER.

A simple coroutine's bytecode

In [272]: async def foo():
     ...:     await bar()
     ...:

In [273]: dis.dis(foo)
  2           0 LOAD_GLOBAL              0 (bar)
              2 CALL_FUNCTION            0
              4 GET_AWAITABLE
              6 LOAD_CONST               0 (None)
              8 YIELD_FROM
             10 POP_TOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

The magic here is the YIELD_FROM opcode, which performs a single step of the subiterator's execution and returns the subiterator's yielded value. This one opcode is executed repeatedly for a single yield from expression: the instruction pointer stays on it, once per resumption, until the subiterator is exhausted. It is a trick implemented at the bytecode level when the yield from syntax came out.
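This can be observed from the outside: while a generator is suspended inside a yield from, its frame's f_lasti stays at the same delegation point across resumptions. This is a CPython implementation detail, and the exact opcode at that offset differs across versions, but the offset itself should not move while delegating:

```python
def sub():
    yield 1
    yield 2

def outer():
    yield from sub()

g = outer()
g.send(None)
first = g.gi_frame.f_lasti      # suspended at the delegation point
g.send(None)
second = g.gi_frame.f_lasti     # still the same offset while delegating
print(first == second)          # prints: True
```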

Native support here means there is a new coroutine object type in the interpreter, so coroutine programs can be written in a standard syntax without any extra application-level work to drive the coroutine's execution.
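The new object type is easy to inspect, and a coroutine object is driven with the same send protocol as a generator under the hood, which is exactly what an event loop does:

```python
import inspect

async def coro():
    return 42

c = coro()
print(type(c).__name__)         # prints: coroutine
print(inspect.iscoroutine(c))   # prints: True

# An event loop drives it via the generator protocol;
# the return value arrives as StopIteration.value.
try:
    c.send(None)
except StopIteration as exc:
    print(exc.value)            # prints: 42
```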

Conclusion

From the early generator to the final native coroutine, the fundamentals change little. With a frame object, execution is divided into separate sequential parts, each behaving like a subroutine or function that shares the whole stack space. The stack pointer variables and the opcode flow make this work as an abstract coroutine.
