zodb / relstorage

A backend for ZODB that stores pickles in a relational database.
Other
54 stars 46 forks source link

BLOB streaming at SQL level rather than driver level #451

Closed tlocke closed 3 years ago

tlocke commented 3 years ago

Hi, maintainer of pg8000 here, thanks for using pg8000 :-) I noticed on your features supported by databases table, on the PostgreSQL row and 'Streaming Blobs' column it says, 'With psycopg2 driver'. I wonder if it's an option to do streaming at the SQL level (using the large objection functions) rather than db driver level? That way, blob streaming would work with any DBAPI-2 compliant driver. So something like this:

>>> # Create a BLOB and get its oid
>>> data = b'hello'
>>> cur = conn.cursor()
>>> cur.execute("SELECT lo_from_bytea(0, %s)", [data])
>>> oid = cur.fetchone()[0]
>>>
>>> # Create a table and store the oid of the BLOB
>>> cur.execute("CREATE TEMPORARY TABLE image (raster oid)")
>>> cur.execute("INSERT INTO image (raster) VALUES (%s)", [oid])
>>>
>>> # Retrieve the data using the oid
>>> cur.execute("SELECT lo_get(%s)", [oid])
>>> cur.fetchall()
([b'hello'],)
>>>
>>> # Add some data to the end of the BLOB
>>> more_data = b' all'
>>> offset = len(data)
>>> cur.execute("SELECT lo_put(%s, %s, %s)", [oid, offset, more_data])
>>> cur.execute("SELECT lo_get(%s)", [oid])
>>> cur.fetchall()
([b'hello all'],)
>>>
>>> # Download a part of the data
>>> cur.execute("SELECT lo_get(%s, 6, 3)", [oid])
>>> cur.fetchall()
([b'all'],)

Anyway, just a thought -:)

jamadden commented 3 years ago

Thanks for the feedback, and thanks for the driver!

I wonder if it's an option to do streaming at the SQL level (using the large objection functions) rather than db driver level?

Perhaps a better description of that column would be "Native Streaming Blobs", because, in fact, using the large object functions via SQL is exactly what RelStorage does on pg8000 (or when using psycopg2 in async mode). But this has to divide an upload or download into chunks and perform iterative copying at the Python level (instead of natively in libpq at speed), and IIRC there are differences at the protocol level too. Ultimately it's closer to what MySQL does ("No, emulated via chunking.") than what psycopg2 does (even though the _lobject.py file provides the same interface on pg8000).

tlocke commented 3 years ago

Cool, I'll close the issue, thanks for the quick response 👍