trinodb / trino-python-client

Python client for Trino
Apache License 2.0
309 stars 151 forks source link

Fix prepared statement handling #242

Closed hashhar closed 1 year ago

hashhar commented 1 year ago

Description

The prepared statement handling code assumed that for each query we'll always receive some non-empty response even after the initial response which is not a valid assumption.

This assumption worked because earlier Trino used to send empty fake results even for queries which don't return results (like PREPARE and DEALLOCATE) but is now invalid with https://github.com/trinodb/trino/commit/bc794cd49a616fa95eecfcc384b761f67176240b.

The other problem with the code was that it leaked HTTP protocol details into dbapi.py and worked around it by keeping a deep copy of the request object from the PREPARE execution and re-using it for the actual query execution.

The new code fixes both issues by processing the prepared statement headers as they are received and storing the resulting set of active prepared statements on the ClientSession object. The ClientSession's set of prepared statements is then rendered into the prepared statement request header in TrinoRequest. Since the ClientSession is created and reused for the entire Connection this also means that we can now actually implement re-use of prepared statements within a single Connection.

Non-technical explanation

Fix errors when using prepared statements with Trino >= 398.

Release notes

( ) This is not user-visible and no release notes are required. ( ) Release notes are required, please propose a release note for me. (x) Release notes are required, with the following suggested text:

* Fix errors when using prepared statements with Trino servers newer than 398.
hashhar commented 1 year ago

FYI @mdesmet - note that this is incomplete code - the commit for now just cleans up the HTTP portion (client.py).

dbapi.py changes I'm still working on - but created this to receive some early feedback. In dbapi.py with the new changes I think I don't need to do anything at all other than just executing the PREPARE, EXECUTE and DEALLOCATE since the headers will all be persisted as long as we use same Cursor object.

In a future PR I plan to add additional parameter if user wants prepared statements to be re-used.

hashhar commented 1 year ago

BTW the existing code was buggy even before the Trino change because it didn't get to inspect the initial response from the coordinator because that response's headers were not being set to TrinoResult because the setting of response headers was handled in fetch which only sees the 2nd response and onward.

hashhar commented 1 year ago

applied comments @mdesmet @hovaesco . Please take a look again.

hashhar commented 1 year ago

@ebyhr @hovaesco Added an explicit test for prepared statement + also verified that we don't leak prepared statements in the ClientSession.

mdesmet commented 1 year ago

Python: analysis.........................................................Failed

  • hook id: flake8
  • exit code: 1

tests/integration/test_dbapi_integration.py:1071:1: E302 expected 2 blank lines, found 1

Python: types............................................................Passed Error: Process completed with exit code 1.

That's why I quite like the pre-commit checks to be present as git hooks. Note that if you haven't explicitly installed the pre-commit git hook (running pre-commit install, it wouldn't run on commit. Maybe we should consider activating this? It would save us some CI minutes and a failing build.

cc: @hovaesco, @hashhar