crate / crate-python

Python DB API client library for CrateDB, using HTTP.
https://cratedb.com/docs/python/
Apache License 2.0

[Feature] async support #619

Open robd003 opened 2 months ago

robd003 commented 2 months ago

I'm doing a lot of bulk inserts that are generated from user activity. (Sometimes we have 10 rows, other times we have 100,000 rows, broken up into blocks of 1,000.)

It would be really helpful to have async support so that we don't block while waiting for Crate to ingest 1,000 rows at a time.

amotl commented 1 month ago

Dear Robert,

thank you for writing in. Are you looking for async support for the HTTP driver, or async support for the SQLAlchemy dialect?

In general, you can always use the asyncpg or psycopg3 libraries, as outlined on this documentation page.

If you are looking at async support for SQLAlchemy, on top of the PostgreSQL drivers enumerated above, that patch might bring in what you are looking for.

In this case, please have a look at those examples, which can be used right away when following the corresponding dependency specifications.

With kind regards, Andreas.

robd003 commented 1 month ago

@amotl Just wanted to be able to use the HTTP bulk API with async, don't need async SQLAlchemy at the moment

It seems like the HTTP bulk API processes bulk inserts more efficiently than the Postgres protocol does

mfussenegger commented 1 month ago

The main purpose of this library is to implement the DBAPI. Given that there is no async version of it yet, I'm not sure what adding async capabilities to this library would bring to the table. It's not much effort to use an async HTTP library directly, or one of the async pg driver variants.

E.g. with aiohttp (untested):

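import json

import aiohttp


async def bulk_insert(server_url):
    # server_url is the HTTP endpoint of the CrateDB node to post to.
    # ClientSession has to be instantiated before it can be used with `async with`.
    async with aiohttp.ClientSession() as session:
        # The statement and the [...] rows are placeholders to fill in.
        data = json.dumps({
            "stmt": "insert into ...",
            "bulk_args": [
                [...],
                [...],
            ]
        })
        async with session.post(server_url,
                                data=data,
                                headers={'Content-Type': 'application/json'}) as resp:
            return await resp.json()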
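
For the "blocks of 1,000" case from the original request, the batching around such a request could look roughly like the sketch below (also untested against a real server). It only splits the rows, builds one payload per block with the same "stmt" / "bulk_args" keys as above, and caps how many requests are in flight at once. The helper names, the `send` parameter, and the `max_in_flight` default are assumptions made for illustration, not part of this library; `send` stands for any coroutine that posts a JSON body and returns the decoded response, such as the aiohttp request above.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Iterator, Sequence

Row = Sequence[Any]
# Hypothetical sender: takes a JSON string, returns the decoded response.
Sender = Callable[[str], Awaitable[dict]]


def chunked(rows: Sequence[Row], size: int = 1000) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def bulk_payload(stmt: str, rows: Sequence[Row]) -> str:
    """Serialize one request body with the "stmt" and "bulk_args" keys."""
    return json.dumps({"stmt": stmt, "bulk_args": [list(r) for r in rows]})


async def insert_in_blocks(send: Sender, stmt: str, rows: Sequence[Row],
                           size: int = 1000, max_in_flight: int = 4) -> list:
    """Send every block through `send`, at most `max_in_flight` at a time."""
    limit = asyncio.Semaphore(max_in_flight)

    async def send_block(block: Sequence[Row]) -> dict:
        async with limit:
            return await send(bulk_payload(stmt, block))

    # gather() returns the responses in the same order as the blocks.
    return await asyncio.gather(*(send_block(b) for b in chunked(rows, size)))
```

Passing the sender in keeps the batching independent of the HTTP client, so the same code works with aiohttp, another async HTTP library, or a fake sender in tests. The semaphore is there so that a burst of 100,000 rows does not open a hundred requests at once.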