trinodb / trino-python-client

Python client for Trino
Apache License 2.0
335 stars 168 forks source link

SQLAlchemy dialect support more SQL datatypes #107

Open dungdm93 opened 3 years ago

dungdm93 commented 3 years ago

This is a meta issue that tracks the support of SQL datatypes after #81 merged.

MichaelTiemannOSC commented 2 years ago

TIMESTAMP(6) definitely needs to be supported for pd.to_sql to work with datetype.datetime types (e.g. dtype='datetime64[ns]'). Test case:

# The variable `engine` is a SQLAlchemy engine connected to a Trino/Iceberg database with a `sandbox` schema available for table creation by the test user.
# The variable `connection ` is the engine's connection interface

import math
from datetime import datetime
drop_table = engine.execute(f"drop table if exists sandbox.test_timezone6")
drop_table.fetchall()
test_df = pd.DataFrame(
    {"A": [4.5], "B'C": [math.nan], None: [math.inf], "D": [-math.inf], "E": [datetime(2022, 1, 1)], ":F": [1.0]}
).convert_dtypes()
assert (test_df.dtypes == ['Float64', 'Int64', 'Float64', 'Float64', 'datetime64[ns]', 'Int64']).all()
xdf.to_sql("test_timezone6",
    con=connection,
    schema="sandbox",
    if_exists="append",
    index=False,
    method=osc.TrinoBatchInsert(batch_size=5, verbose=True),
)

This presently fails because Pandas converts the datatime64 data type to TIMESTAMP, which gets converted by SQLAlchemy/DBZPI to TIMESTAMP(3) instead of TIMESTAMP(6).

erikerlandson commented 2 years ago

@MichaelTiemannOSC does #181 resolve this?

MichaelTiemannOSC commented 2 years ago

There are changes to the correct functions in the code, so will test as soon as I clear out some other nuisances that are impeding my daily tech work. I'm optimistic the answer is yes...