greenback: reenter an asyncio or Trio event loop from synchronous code

.. image:: https://img.shields.io/pypi/v/greenback.svg
   :target: https://pypi.org/project/greenback
   :alt: Latest PyPI version

.. image:: https://img.shields.io/badge/docs-read%20now-blue.svg
   :target: https://greenback.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation status

.. image:: https://github.com/oremanj/greenback/actions/workflows/ci.yml/badge.svg
   :target: https://github.com/oremanj/greenback/actions/workflows/ci.yml
   :alt: Automated test status

.. image:: https://codecov.io/gh/oremanj/greenback/branch/master/graph/badge.svg
   :target: https://codecov.io/gh/oremanj/greenback
   :alt: Test coverage

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
   :target: https://github.com/ambv/black
   :alt: Code style: black

.. image:: http://www.mypy-lang.org/static/mypy_badge.svg
   :target: http://www.mypy-lang.org/
   :alt: Checked with mypy

Python 3.5 introduced async/await syntax for defining functions that can run concurrently in a cooperative multitasking framework such as asyncio or `Trio <https://trio.readthedocs.io/>`__. Such frameworks have a number of advantages over previous approaches to concurrency: they scale better than threads and are `clearer about control flow <https://glyph.twistedmatrix.com/2014/02/unyielding.html>`__ than the implicit cooperative multitasking provided by gevent. They're also being actively developed to explore some `new ideas about concurrent programming <https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/>`__.

Porting an existing codebase to async/await syntax can be challenging, though, since it's somewhat "viral": only an async function can call another async function. That means you don't just have to modify the functions that actually perform I/O; you also need to (trivially) modify every function that directly or indirectly calls a function that performs I/O. While the results are generally an improvement ("explicit is better than implicit"), getting there in one big step is not always feasible, especially if some of these layers are in libraries that you don't control.

greenback is a small library that attempts to bridge this gap. It allows you to call back into async code from a syntactically synchronous function, as long as the synchronous function was originally called from an async task (running in an asyncio or Trio event loop) that set up a greenback "portal" as explained below. This is potentially useful in a number of different situations:

greenback requires Python 3.8 or later and an implementation that supports the greenlet library. Either CPython or PyPy should work. There are no known OS dependencies.

Quickstart

Example

Suppose you start with this async-unaware program:

.. code-block:: python

    import subprocess

    def main():
        print_fact(10)

    def print_fact(n, mult=1):
        """Print the value of *n* factorial times *mult*."""
        if n <= 1:
            print_value(mult)
        else:
            print_fact(n - 1, mult * n)

    def print_value(n):
        """Print the value *n* in an unreasonably convoluted way."""
        assert isinstance(n, int)
        subprocess.run(f"echo {n}", shell=True)

    if __name__ == "__main__":
        main()

Using greenback, you can change it to run in a Trio event loop by changing only the top and bottom layers (``main()`` and ``print_value()``), with no change to ``print_fact()``.

.. code-block:: python

    import trio
    import greenback

    async def main():
        await greenback.ensure_portal()
        print_fact(10)

    def print_fact(n, mult=1):
        """Print the value of *n* factorial times *mult*."""
        if n <= 1:
            print_value(mult)
        else:
            print_fact(n - 1, mult * n)

    def print_value(n):
        """Print the value *n* in an unreasonably convoluted way."""
        assert isinstance(n, int)
        greenback.await_(trio.run_process(f"echo {n}", shell=True))

    if __name__ == "__main__":
        trio.run(main)
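The same pattern should carry over to an asyncio event loop. The following is a minimal sketch rather than part of the original example: ``fact()`` and ``slow_identity()`` are hypothetical names chosen for illustration, and ``asyncio.sleep(0)`` stands in for real I/O so that the result can be returned and checked instead of echoed by a subprocess.

.. code-block:: python

    import asyncio
    import greenback

    async def main():
        # Set up the portal once in this task before any greenback.await_() call.
        await greenback.ensure_portal()
        return fact(10)

    def fact(n, mult=1):
        """Return *n* factorial times *mult*; an ordinary synchronous function."""
        if n <= 1:
            return slow_identity(mult)
        return fact(n - 1, mult * n)

    def slow_identity(value):
        """Return *value* after yielding to the event loop once (hypothetical helper)."""
        async def produce():
            await asyncio.sleep(0)  # placeholder for real asynchronous I/O
            return value
        # Synchronous code re-enters the event loop through the portal.
        return greenback.await_(produce())

    if __name__ == "__main__":
        print(asyncio.run(main()))

As in the Trio version, only the outermost and innermost layers know about async code; the recursive function in the middle stays synchronous.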

FAQ

Why is it called "greenback"? It uses the `greenlet <https://greenlet.readthedocs.io/en/latest/>`__ library to get you back to an enclosing async context. Also, maybe it `saves you money <https://www.dictionary.com/browse/greenback>`__ (engineering time) or something.

How does it work? After you run ``await greenback.ensure_portal()`` in a certain task, that task will run inside a greenlet. (This is achieved by interposing a "shim" coroutine in between the event loop and the coroutine for your task; see the source code for details.) Calls to ``greenback.await_()`` are then able to switch from that greenlet back to the parent greenlet, which can easily perform the necessary ``await`` since it has direct access to the async environment. The task greenlet is then resumed with the value or exception produced by the ``await``.
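The switching described above can be illustrated with greenlet alone. This is a simplified sketch and not greenback's actual implementation: ``toy_await()`` and ``run_in_greenlet()`` are hypothetical names, the sketch runs a plain synchronous function in the child greenlet instead of shimming a task's coroutine, and it does not forward exceptions into the child.

.. code-block:: python

    import asyncio
    import greenlet

    def toy_await(awaitable):
        """Hypothetical stand-in for greenback.await_(): hand the awaitable to
        the parent greenlet and return whatever the parent switches back with."""
        return greenlet.getcurrent().parent.switch(awaitable)

    async def run_in_greenlet(sync_fn, *args):
        """Hypothetical shim: run sync_fn in a child greenlet and perform each
        requested await here, where real async context is available."""
        child = greenlet.greenlet(sync_fn)
        outcome = child.switch(*args)  # runs until the first toy_await() or return
        while not child.dead:
            # outcome is an awaitable passed up by toy_await(); await it here,
            # then resume the child with the result.
            outcome = child.switch(await outcome)
        return outcome  # the return value of sync_fn

    def add_slowly(a, b):
        async def add():
            await asyncio.sleep(0)  # placeholder for real asynchronous work
            return a + b
        return toy_await(add())

    if __name__ == "__main__":
        print(asyncio.run(run_in_greenlet(add_slowly, 2, 3)))

The child greenlet never sees an ``await`` keyword; it only sees ``switch()`` return a value, which is what lets the code in between stay syntactically synchronous.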

Should I trust this in production? Maybe; try it and see. The technique is rather low-level, and has some `minor performance implications <https://greenback.readthedocs.io/en/latest/principle.html#performance>`__ (any task in which you call ``await greenback.ensure_portal()`` will run a bit slower), but we're in good company: SQLAlchemy's async ORM support is implemented in much the same way. greenback itself is a fairly small amount of pure-Python code on top of greenlet. (There is one small usage of ctypes to work around a knob that's not exposed by the asyncio acceleration extension module on CPython.) greenlet is a C module full of platform-specific arcana, but it's been around for a very long time and popular production-quality concurrency systems such as gevent rely heavily on it.

What won't work? A few things:

License

greenback is licensed under your choice of the MIT or Apache 2.0 license. See `LICENSE <https://github.com/oremanj/greenback/blob/master/LICENSE>`__ for details.