MagicStack / asyncpg

A fast PostgreSQL Database Client Library for Python/asyncio.
Apache License 2.0

Feature request: Support for asynchronous command processing #1141

FeldrinH commented 6 months ago

libpq has support for something called asynchronous command processing (https://www.postgresql.org/docs/current/libpq-async.html).

The idea is fairly simple: you can queue up multiple queries on one connection without waiting for the previous result, then collect all the results at the end. This avoids waiting for a network round trip between queries, which, at least in theory, should give much the same performance benefit as sending multiple queries in a single network round trip.

It would be useful if asyncpg could support the same concept. The Python API should be fairly straightforward: you start multiple query coroutines and then await them all together at the end. Something like this:

# Create the query coroutines up front; asyncio.gather then runs them together.
q1 = conn.fetch('SELECT $1::int', 1)
q2 = conn.fetch('SELECT $1::int', 2)
q3 = conn.fetch('SELECT $1::int', 3)
# Currently this raises InterfaceError: another operation is in progress.
results = await asyncio.gather(q1, q2, q3)
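
(As an aside, the closest workaround available today, as far as I can tell, is to spread the queries across connections with a pool; the DSN below is a placeholder. This gets concurrency, but not the single-connection round-trip savings this proposal is about:

import asyncio
import asyncpg

async def main():
    # A pool runs each query on its own connection, so the three fetches
    # can overlap; it costs extra server connections and does not remove
    # the per-connection round trips this proposal targets.
    pool = await asyncpg.create_pool('postgresql://user@localhost/db')  # placeholder DSN
    try:
        results = await asyncio.gather(
            pool.fetch('SELECT $1::int', 1),
            pool.fetch('SELECT $1::int', 2),
            pool.fetch('SELECT $1::int', 3),
        )
        print(results)
    finally:
        await pool.close()

asyncio.run(main())

)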

This could have significant performance benefits when you make multiple fast, independent queries, especially over a link with a large round-trip delay: run sequentially, N queries wait out roughly N round trips, while queued they wait out roughly one.
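
A toy model of that arithmetic, with plain asyncio and an assumed 50 ms round trip standing in for a real server (no database involved):

import asyncio

RTT = 0.05  # assumed network round-trip time per query (50 ms)

async def fake_roundtrip(i):
    # Stand-in for sending one query and receiving its result.
    await asyncio.sleep(RTT)
    return i

async def sequential():
    # One full round trip per query: ~3 * RTT in total.
    return [await fake_roundtrip(i) for i in range(3)]

async def queued():
    # All three queries in flight before any result is awaited: ~1 * RTT.
    return await asyncio.gather(*(fake_roundtrip(i) for i in range(3)))

async def main():
    for impl in (sequential, queued):
        start = asyncio.get_running_loop().time()
        await impl()
        print(f'{impl.__name__}: {asyncio.get_running_loop().time() - start:.3f}s')

asyncio.run(main())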

As far as I know, implementing this is possible in theory. In practice, the complexity depends heavily on how asyncpg's internals are structured, which I am not familiar with.

EDIT: This request is motivated by the same goal as https://github.com/MagicStack/asyncpg/issues/839, but my proposal is specifically to use asynchronous command processing without pipelining. If my understanding of the extended query protocol is correct, it should be possible to issue multiple queries without waiting for results in between, while still sending a Sync after each query. Because an error only makes the backend discard messages up to the next Sync, a per-query Sync keeps a failure isolated to its own query and avoids the complex error handling rules of pipeline mode. (If my understanding of the extended query protocol is not correct, this may be impossible. Feel free to correct me.)
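
To make the idea concrete, here is a minimal toy sketch, assuming results arrive in FIFO order so queued futures can be matched to responses; the "server" is just a simulated delay, and none of this reflects asyncpg's actual internals:

import asyncio
from collections import deque

class ToyConnection:
    # Toy model only: each queued query is "sent" immediately, followed by
    # its own Sync; the server answers in order, so a FIFO of futures is
    # enough to match responses to callers.
    def __init__(self):
        self._wire = asyncio.Queue()   # simulated socket to the server
        self._pending = deque()        # futures, oldest outstanding Sync first
        self._server = asyncio.ensure_future(self._server_loop())

    def queue_query(self, sql):
        # Analogue of sending Parse/Bind/Execute/Sync without waiting.
        fut = asyncio.get_running_loop().create_future()
        self._pending.append(fut)
        self._wire.put_nowait(sql)
        return fut

    async def _server_loop(self):
        while True:
            sql = await self._wire.get()
            await asyncio.sleep(0.01)  # simulated query execution
            # Analogue of ReadyForQuery after this query's Sync.
            self._pending.popleft().set_result(f'rows for {sql!r}')

async def main():
    conn = ToyConnection()  # must be created inside a running event loop
    futures = [conn.queue_query(f'SELECT {i}') for i in range(3)]
    print(await asyncio.gather(*futures))

asyncio.run(main())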

oliora commented 4 weeks ago

It looks like async commands or pipelining would help a lot in our case, which is: start a transaction, insert 1-10K rows into around 10 tables, and commit the transaction.
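
For reference, our pattern looks roughly like the sketch below (the table and column names are made up); each statement currently waits out its own round trip inside the transaction, which is what pipelining would overlap:

import asyncpg

async def load_batches(conn: asyncpg.Connection, batches: dict):
    # 'batches' maps a table name to its list of (a, b) row tuples,
    # e.g. around 10 tables with 1-10K rows in total.
    async with conn.transaction():
        for table, rows in batches.items():
            # Today each executemany call waits for its own server reply;
            # pipelined, these statements could all be in flight at once.
            await conn.executemany(
                f'INSERT INTO {table} (a, b) VALUES ($1, $2)', rows
            )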