asyncpg-🚀 -- A fast PostgreSQL Database Client Library for Python/asyncio that returns numpy arrays

.. image:: https://github.com/athenianco/asyncpg-rkt/workflows/Tests/badge.svg
    :target: https://github.com/athenianco/asyncpg-rkt/actions?query=workflow%3ATests+branch%3Amaster
    :alt: GitHub Actions status
.. image:: https://img.shields.io/pypi/v/asyncpg-rkt.svg
    :target: https://pypi.python.org/pypi/asyncpg-rkt

asyncpg-rkt is a fork of asyncpg, a database interface library designed specifically for PostgreSQL and Python/asyncio. asyncpg is an efficient, clean implementation of the PostgreSQL server binary protocol for use with Python's asyncio framework. You can read more about asyncpg in an `introductory blog post <http://magic.io/blog/asyncpg-1m-rows-from-postgres-to-python/>`_.

asyncpg-rkt extends asyncpg as follows:

asyncpg-rkt performs best when a query returns thousands of rows and the field types map to numpy dtypes.

Read the blog post with the introduction.

asyncpg-🚀 requires Python 3.8 or later and is supported for PostgreSQL versions 9.5 to 14. Older PostgreSQL versions or other databases implementing the PostgreSQL protocol may work, but are not being actively tested.

Documentation

The project documentation can be found `here <https://magicstack.github.io/asyncpg/current/>`_.

See below for how to use the fork's special features.

Performance

In our testing asyncpg is, on average, 3x faster than psycopg2 (and its asyncio variant -- aiopg).

.. image:: https://raw.githubusercontent.com/athenianco/asyncpg-rkt/master/performance.png
    :target: https://gistpreview.github.io/?b8eac294ac85da177ff82f784ff2cb60

The above results are a geometric mean of benchmarks obtained with `PostgreSQL client driver benchmarking toolbench <https://github.com/MagicStack/pgbench>`_ in November 2020 (click on the chart to see full details).
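
For readers unfamiliar with the term, a geometric mean of *n* positive values is the *n*-th root of their product. The sketch below is only an illustration of that formula: the ``geometric_mean`` helper and the sample speedups are hypothetical and are not taken from the benchmark results.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    # Averaging the logarithms avoids overflow from a large product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-benchmark speedups, for illustration only.
speedups = [2.0, 8.0]
overall = geometric_mean(speedups)
```

Unlike an arithmetic mean, this summary is not dominated by a single benchmark with an unusually large ratio.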

Writing the results into numpy arrays gives a further improvement of about 20x:

.. image:: https://raw.githubusercontent.com/athenianco/asyncpg-rkt/master/benchmark_20220522_142813.svg

.. image:: https://raw.githubusercontent.com/athenianco/asyncpg-rkt/master/benchmark_20220522_143838.svg

Features

asyncpg implements the PostgreSQL server protocol natively and exposes its features directly, as opposed to hiding them behind a generic facade like DB-API.

This enables asyncpg to have easy-to-use support for:

Installation

asyncpg-🚀 is available on PyPI and requires numpy 1.21+. Use pip to install::

    $ pip install asyncpg-rkt

Basic Usage

.. code-block:: python

    import asyncio

    import asyncpg
    from asyncpg.rkt import set_query_dtype
    import numpy as np


    async def run():
        conn = await asyncpg.connect(user='user', password='password',
                                     database='database', host='127.0.0.1')
        try:
            # Describe the expected result columns as a numpy structured dtype.
            dtype = np.dtype([
                ("a", int),
                ("b", "datetime64[s]"),
            ])
            # A query tagged with a dtype yields a numpy array plus the
            # accompanying nulls information.
            array, nulls = await conn.fetch(
                set_query_dtype('SELECT * FROM mytable WHERE id = $1', dtype),
                10,
            )
        finally:
            # Close the connection even if the query fails.
            await conn.close()


    asyncio.run(run())
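
To get a feel for the kind of array a structured dtype describes, the following sketch builds one by hand with plain numpy. It does not connect to a database or call asyncpg-rkt; the sample rows are made up, and the variable names ``rows``, ``ids``, ``stamps`` and ``elapsed`` are illustrative assumptions.

```python
import numpy as np

# Same field layout as the dtype in the usage example.
dtype = np.dtype([
    ("a", int),
    ("b", "datetime64[s]"),
])

# Hypothetical sample rows standing in for a query result.
rows = np.array(
    [
        (1, np.datetime64("2022-05-22T14:28:13")),
        (2, np.datetime64("2022-05-22T14:38:38")),
    ],
    dtype=dtype,
)

# Indexing by field name returns a typed column view of the same memory.
ids = rows["a"]
stamps = rows["b"]

# Column arithmetic stays inside numpy; the result is a timedelta64 in seconds.
elapsed = stamps[1] - stamps[0]
```

Each field behaves like an ordinary numpy column, so vectorized operations apply without converting rows into Python objects first.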

License

asyncpg-🚀 is developed and distributed under the Apache 2.0 license, just like the original project.