trinodb / trino-python-client

Python client for Trino
Apache License 2.0
327 stars 163 forks source link

Add lazy evaluation of `server_version_info` #371

Closed mdesmet closed 1 year ago

mdesmet commented 1 year ago

Description

Non-technical explanation

Release notes

( ) This is not user-visible or docs only and no release notes are required. ( ) Release notes are required, please propose a release note for me. ( ) Release notes are required, with the following suggested text:

* Fix some things. ({issue}`issuenumber`)
hashhar commented 1 year ago

This is something that the SQLA engine interface provides so user scripts/programs can make use of it. Core itself doesn't use it other than for populating a server_version_info attribute.

Some dialects also use it internally for version dependent logic, an example is Oracle - https://github.com/zzzeek/sqlalchemy/blob/671647c8574bd6ff4c39fe503a2b1958a802772d/lib/sqlalchemy/dialects/oracle/base.py#L1487

hashhar commented 1 year ago

How much does this really help, it's only called once per connection. So only if someone is using NullPool or creating new engine everytime.

mdesmet commented 1 year ago

Well, it is called once per connection and not used in 99.999% of the cases, so making it lazily evaluated makes definitely a difference as it removes these series of requests from EVERY sqlalchemy usage.

hashhar commented 1 year ago

tests are failing

hashhar commented 1 year ago

Well, it is called once per connection and not used in 99.999% of the cases, so making it lazily evaluated makes definitely a difference as it removes these series of requests from EVERY sqlalchemy usage.

I was thinking in terms of number of requests. This function isn't executed per-query, so it only reduces 1 call for entire application/connection (which is generally much lower than number of queries).

i.e. for someone who uses only a few connections there is no benefit of this lazy evaluation.

nineinchnick commented 1 year ago

When working with a server that's geographically far away, or over a bad network connection, the overhead of the extra query can be significant, especially when executing short/simple queries. IMO, it is very valuable to avoid this unnecessary query.

nineinchnick commented 1 year ago

Note that I tested today that getting the server version info is done only once after creating the engine, even if connections are not reused (using a NullPool and closing the connection after every query).