MagicStack / asyncpg

A fast PostgreSQL Database Client Library for Python/asyncio.
Apache License 2.0

Large Object support #826

Open al-dpopowich opened 3 years ago

al-dpopowich commented 3 years ago

Are there any plans to have direct support for large objects for efficient streaming of data?

My use case: my webapp supports uploads of binary data files. These files are stored as bytea (TOASTed), which is fine: they are not downloaded directly through the app, and even if they were, we're talking tens of MB, so I'm not worried about the memory footprint of an individual record. HOWEVER, one requirement for this app is that all of these files can be downloaded in a single zip file, which can run to hundreds of MB. My plan: kick off a background task that builds the zip file, then stores it in PG as a large object. The question, then, is how to provide an efficient download via my webapp (aiohttp).
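For the write side of that plan, one option (a sketch only, not an asyncpg API) is to call PostgreSQL's server-side `lo_create`/`lo_open`/`lowrite` SQL functions through an ordinary asyncpg connection. The `store_zip` name, the `conn` parameter, and the chunk size are my assumptions; `0x20000` is PostgreSQL's `INV_WRITE` mode, and the calls must run inside a transaction:

```python
INV_WRITE = 0x20000  # PostgreSQL's INV_WRITE mode for lo_open

async def store_zip(conn, zip_path, chunk=1 << 20):
    """Stream a file from disk into a new large object; returns its OID.

    `conn` is assumed to be an asyncpg Connection (or anything with the
    same fetchval/execute/transaction interface). Hypothetical sketch.
    """
    async with conn.transaction():  # lo_ functions require a transaction
        oid = await conn.fetchval("SELECT lo_create(0)")
        fd = await conn.fetchval("SELECT lo_open($1, $2)", oid, INV_WRITE)
        with open(zip_path, "rb") as f:
            while True:
                data = f.read(chunk)
                if not data:
                    break
                await conn.fetchval("SELECT lowrite($1, $2)", fd, data)
        await conn.execute("SELECT lo_close($1)", fd)
        return oid
```

Because the file is read and written one chunk at a time, the zip never has to fit in the web process's memory.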

I could stream it with a loop around, e.g.:

```sql
SELECT lo_get(data_oid, :offset, :chunksize) FROM zipstorage WHERE id = :id
```

where chunksize might be 1MB and offset increases by 1MB with each iteration, stopping the iteration when the returned data is < 1MB.
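That loop can be sketched as an async generator over an asyncpg connection. This is my own illustration, not an asyncpg feature: `stream_zip`, the `zipstorage`/`data_oid` names from the query above, and the 1 MiB default are all assumptions:

```python
CHUNK = 1 << 20  # 1 MiB per round trip; tune to taste

async def stream_zip(conn, row_id, chunk=CHUNK):
    """Yield the large object referenced by zipstorage.data_oid in chunks.

    `conn` is assumed to be an asyncpg Connection. Iteration stops when
    lo_get returns fewer bytes than requested (end of object) or NULL
    (no such row). Hypothetical sketch.
    """
    offset = 0
    while True:
        data = await conn.fetchval(
            "SELECT lo_get(data_oid, $1, $2) FROM zipstorage WHERE id = $3",
            offset, chunk, row_id,
        )
        if data:
            yield data
        if data is None or len(data) < chunk:
            break
        offset += chunk
```

In an aiohttp handler, each yielded chunk could then be fed to `StreamResponse.write()` so the whole zip is never held in memory at once.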

Might there be a more direct, efficient way? E.g., as with psycopg2's lobject?

Other suggestions most welcome.

Thanks!

elprans commented 2 years ago

There is no dedicated API for large objects, but you can use the lo_ functions directly. A dedicated API would be nice, though I've got no bandwidth to implement it myself.
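As a sketch of the suggested approach, the read side can use the server-side `lo_open`/`loread`/`lo_close` functions instead of repeated `lo_get` calls, keeping one large-object descriptor open for the whole download. The function name and `conn` parameter are my assumptions; `0x40000` is PostgreSQL's `INV_READ` mode, and the descriptor is only valid inside the enclosing transaction:

```python
INV_READ = 0x40000  # PostgreSQL's INV_READ mode for lo_open

async def read_large_object(conn, oid, chunk=1 << 20):
    """Yield the contents of large object `oid` chunk by chunk.

    `conn` is assumed to be an asyncpg Connection. loread returns an
    empty bytea at end of object, which ends the loop. Hypothetical sketch.
    """
    async with conn.transaction():  # descriptors live for one transaction
        fd = await conn.fetchval("SELECT lo_open($1, $2)", oid, INV_READ)
        while True:
            data = await conn.fetchval("SELECT loread($1, $2)", fd, chunk)
            if data:
                yield data
            if len(data) < chunk:
                break
        await conn.execute("SELECT lo_close($1)", fd)
```

Compared with the `lo_get` loop, this avoids re-seeking into the object on every round trip, since the server tracks the read position on the open descriptor.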