pauldex / sqlalchemy-firebird

A Firebird dialect for SQLAlchemy using the firebird-driver and/or fdb python Firebird driver
MIT License
22 stars 15 forks source link

SqlAlchemy handling of firebird stream blobs (a.k.a. BlobReader objects) #58

Open bryancole opened 12 months ago

bryancole commented 12 months ago

I'm having a problem reading large blobs from a firebird database, using the new firebird-driver.

For large blobs, the firebird driver (both new and old) returns BlobReader objects, rather than the fully materialised python bytes-objects. These BlobReader objects are file-like and can be read to obtain the binary data. The thing that has changed in the new firebird driver is that the Cursor.close() method now closes all BlobReader objects associated with that cursor. Unfortunately, when sqlalchemy executes a statement returning data (i.e. calls fetchXXX() on the cursor), it always closes the cursor after iterating over it, but before accessing any of the data in the returned rows. Hence, later on when the data is passed to the Dialect TypeDecorator to handle type-conversions, the cursor has been closed so all the BlobReader objects are closed, so reading the data fails.

Although I can't see a way to fix this in the Dialect, I'm wondering if you have any ideas, before I look at modifying the sqlalchemy core to add some sort of hook to customise Cursor closing behaviour. The author of the new firebird driver seems adamant that the BlobReaders ought to be closed when the Cursor is closed.

pauldex commented 11 months ago

It sounds like with fdb you can still read the BlobReader objects after the cursor is closed, but with firebird_driver the BlobReader objects are closed along with the cursor and can no longer be read. Is that right?

A possible workaround is to make sure to have read everything you need before to cursor is closed (e.g., with .all() or .first()) before trying to process the result set.

I’ll add reviewing the cursor closing behavior to the list of things to do.

Thanks,

Paul

From: Bryan Cole @.> Sent: Tuesday, October 10, 2023 9:02 AM To: pauldex/sqlalchemy-firebird @.> Cc: Subscribed @.***> Subject: [pauldex/sqlalchemy-firebird] SqlAlchemy handling of firebird stream blobs (a.k.a. BlobReader objects) (Issue #58)

I'm having a problem reading large blobs from a firebird database, using the new firebird-driver.

For large blobs, the firebird driver (both new and old) returns BlobReader objects, rather than the fully materialised python bytes-objects. These BlobReader objects are file-like and can be read to obtain the binary data. The thing that has changed in the new firebird driver is that the Cursor.close() method now closes all BlobReader objects associated with that cursor. Unfortunately, when sqlalchemy executes a statement returning data (i.e. calls fetchXXX() on the cursor), it always closes the cursor after iterating over it, but before accessing any of the data in the returned rows. Hence, later on when the data is passed to the Dialect TypeDecorator to handle type-conversions, the cursor has been closed so all the BlobReader objects are closed, so reading the data fails.

Although I can't see a way to fix this in the Dialect, I'm wondering if you have any ideas, before I look at modifying the sqlalchemy core to add some sort of hook to customise Cursor closing behaviour. The author of the new firebird driver seems adamant that the BlobReaders ought to be closed when the Cursor is closed.

— Reply to this email directly, view it on GitHub https://github.com/pauldex/sqlalchemy-firebird/issues/58 , or unsubscribe https://github.com/notifications/unsubscribe-auth/ABX7Z67JWG5EHUMGAJRIPG3X6VWPTAVCNFSM6AAAAAA52TFJRSVHI2DSMVQWIX3LMV43ASLTON2WKOZRHEZTKNZQGYZTGOI . You are receiving this because you are subscribed to this thread. https://github.com/notifications/beacon/ABX7Z63SE4XT4VPKEODP5Z3X6VWPTA5CNFSM6AAAAAA52TFJRSWGG33NNVSW45C7OR4XAZNFJFZXG5LFVJRW63LNMVXHIX3JMTHHGYEI4M.gif Message ID: @. @.> >

bryancole commented 11 months ago

Thanks for your response. Yes, that's right ( fdb driver doesn't close BlobReaders on Cursor.close(), firebird-driver does).

Sadly, calling .all() or .first() doesn't help. These calls retrieve all data for the query by iterating over the cursor and collecting the data into a list. This happens in the CursorFetchStrategy class (see https://github.com/sqlalchemy/sqlalchemy/blob/a342b3d503f968bbf43f3b2de1f4f623b03a6310/lib/sqlalchemy/engine/cursor.py#L1129C1-L1139C59 ). The cursor is closed immediately after the DB-API2 returns the data as a list (which BlobReader objects in the list, say). Data processing happens later.

This is giving me an idea, that maybe I can subclass the CursorFetchStrategy object to avoid closing the cursor at the end of the fetchXXX() calls. However, I'm not sure how to close the cursor safely after processing is complete. I'll think about this some more.

I have worked around the problem in the short term by setting the stream_blob_threshold config for the driver to be some very large value, such that BlobReaders are never returned. This is OK for my application for the time being.

fdcastel commented 9 months ago

@bryancole did you make any progress about this issue?

I cannot work on this now, but I think is appropriate to cite this comment from SQLAchemy maintainer.

fdcastel commented 9 months ago

Related discussion from firebird-driver maintainer.

bryancole commented 9 months ago

Sorry, I did not. I worked around by forcing the stream-blob threshold to be very high, so all blobs are materialised in my application.

Bryan

On Tue, Dec 19, 2023 at 5:15 PM F.D.Castel @.***> wrote:

@bryancole https://github.com/bryancole did you make any progress about this issue?

I cannot work on this now, but I think is appropriate to cite this comment https://github.com/sqlalchemy/sqlalchemy/discussions/10549#discussioncomment-7404872 from SQLAchemy maintainer.

— Reply to this email directly, view it on GitHub https://github.com/pauldex/sqlalchemy-firebird/issues/58#issuecomment-1863181669, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABYLNUR36QEK5DBCZ4JUQDYKHDTLAVCNFSM6AAAAAA52TFJRSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNRTGE4DCNRWHE . You are receiving this because you were mentioned.Message ID: @.***>

pauldex commented 9 months ago

@bryancole, should this issue remain open?