trinodb / trino-python-client

Python client for Trino
Apache License 2.0
328 stars 163 forks source link

Support decimal, date, time, timestamp with time zone and timestamp python types in prepared statements and result sets #160

Closed mdesmet closed 2 years ago

mdesmet commented 2 years ago

Fixes #150 , #32

mdesmet commented 2 years ago

As mentioned in #32, this change will impact the behavior on existing codebases. Should we provide a feature flag for this improved type detection?

mdesmet commented 2 years ago

Time with time zone (utc offset) doesn't really exist as a Python type. It's possible to have time with a timezone, but however without an actual moment in time, it's impossible to know the UTC offset. So I don't think this is something we can support for now.

mdesmet commented 2 years ago

To convert named timezones i have added pytz as dependency. As we support older python versions, it seems easier to just use pytz, which was already a test dependency. Maybe we should make this an optional dependency as it's only required for when the flag is activated. Please advice.

Starting from python 3.9 we could replace this with the standard zoneinfo (however seems still need an additional library on Windows: https://tzdata.readthedocs.io/en/latest/).

john-bodley commented 2 years ago

Thanks for authoring the PR.

I was running into an issue with Apache Superset related to inconsistencies in how the frontend handled timestamps. We leverage Pandas for reading a SQL query into a dataframe and serializing it to JSON.

For context here's difference between MySQL and Trino,

import pandas as pd 
from sqlalchemy import create_engine

pd.read_sql_query(
    sql="SELECT TIMESTAMP('2022-01-01 00:00:00') AS ts", 
    con=create_engine("mysql://root@localhost:3360"),
).to_json()

pd.read_sql_query(
    sql="SELECT TIMESTAMP '2022-01-01 00:00:00' AS ts", 
    con=create_engine("trino://localhost:8080"),
).to_json()

which outputs {"ts":{"0":1640995200000}} (which infers UTC per the EPOCH definition) and {"ts":{"0":"2022-01-01 00:00:00.000"}} (which is in the local timezone) respectively and thus are treated differently in JavaScript:

new Date(1640995200000)
> Sat Jan 01 2022 13:00:00 GMT+1300 (New Zealand Daylight Time)

new Date("2022-01-01 00:00:00.000")
> Sat Jan 01 2022 00:00:00 GMT+1300 (New Zealand Daylight Time)