mkleehammer / pyodbc

Python ODBC bridge
https://github.com/mkleehammer/pyodbc/wiki
MIT No Attribution
2.92k stars 563 forks source link

fractional seconds dropped when using fast_executemany with PostgreSQL ODBC #1023

Closed ghost closed 2 years ago

ghost commented 2 years ago

Hello,

I've been investigating the issues with ETL that is transferring data from MS SQL to PostgreSQL using following environment:

Was able to reproduce issue without sourcing data from MS SQL Server, just mocking some datetimes. datetime2(7) from MS SQL should be saved to timestamp(6) in PosgreSQL with 6 digits in milliseconds.

import datetime
import pyodbc

print('pyodbc version: %s' % pyodbc.version)

destination_pg = "DRIVER={PostgreSQL Unicode};SERVER=localhost;PORT=5432;DATABASE=XXX;UID=XXX;PWD=XXX"
pg_conn = pyodbc.connect(destination_pg, autocommit=True)

pg_cur = pg_conn.cursor()
pg_cur.fast_executemany = True

single = (datetime.datetime(2021, 1, 1, 11, 21, 11, 311114),)

many = [(datetime.datetime(2022, 2, 3, 17, 28, 15, 344014),),
        (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952),)
        ]

print(f"Single = {single}")
print(f"Many = {many}")

pg_cur.execute("INSERT INTO public.data_table VALUES (?)", single)
pg_cur.executemany("INSERT INTO public.data_table VALUES (?)", many)

pg_cur.execute("SELECT datetime_value FROM public.data_table")

inserted = pg_cur.fetchmany(10)
print(f"Inserted = {inserted}")
# pg_cur.execute("truncate table public.data_table")

pg_conn.close()

Actual output:

pyodbc version: 4.0.32 Single = (datetime.datetime(2021, 1, 1, 11, 21, 11, 311114),) Many = [(datetime.datetime(2022, 2, 3, 17, 28, 15, 344014),), (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952),)] Inserted = [(datetime.datetime(2021, 1, 1, 11, 21, 11, 311114), ), (datetime.datetime(2022, 2, 3, 17, 28, 15), ), (datetime.datetime(2022, 2, 4, 12, 48, 47), )]

Expected output:

pyodbc version: 4.0.32 Single = (datetime.datetime(2021, 1, 1, 11, 21, 11, 311114),) Many = [(datetime.datetime(2022, 2, 3, 17, 28, 15, 344014),), (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952),)] Inserted = [(datetime.datetime(2021, 1, 1, 11, 21, 11, 311114), ), (datetime.datetime(2022, 2, 3, 17, 28, 15, 344014), ), (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952), )]

Notice that milliseconds are not saved into PostgreSQL, when executemany is used to save data.

This is how it looks in database:

image

gordthompson commented 2 years ago

I am able to reproduce the issue. fast_executemany = True causes PostgreSQL ODBC to drop fractional seconds

from datetime import datetime

import pyodbc

connection_string = (
    "DRIVER=PostgreSQL Unicode;"
    "SERVER=192.168.0.199;"
    "UID=scott;PWD=tiger;"
    "DATABASE=test;"
)
cnxn = pyodbc.connect(connection_string, autocommit=True)
print(cnxn.getinfo(pyodbc.SQL_DRIVER_VER))  # 13.02.0000
crsr = cnxn.cursor()
crsr.execute("TRUNCATE TABLE data_table")

data = [(datetime(2022, 2, 3, 17, 28, 15, 344014),)]

crsr.fast_executemany = True
crsr.executemany("INSERT INTO data_table (dt) VALUES (?)", data)

print(crsr.execute("SELECT * FROM data_table").fetchall())
# [(datetime.datetime(2022, 2, 3, 17, 28, 15), )]

postgresql.LOG

The fractional seconds are preserved if fast_executemany = False.

Fractional seconds are preserved with fast_executemany = True when using ODBC Driver 17 for SQL Server.

mssql.LOG

A possible explanation for the difference is that PostgreSQL ODBC returns this

gh1023          1824-ae4    EXIT  SQLDescribeParam  with return code 0 (SQL_SUCCESS)
        HSTMT               0x000000B57096CC00
        UWORD                        1 
        SWORD *             0x000000B570A431C2 (93)
        SQLULEN *           0x000000B570A431C8 (26)
        SWORD *             0x000000B570A431D0 (-1)
        SWORD *             0x000000B57085F458 (1)

whereas msodbcsql returns this

gh1023          1094-a18    EXIT  SQLDescribeParam  with return code 0 (SQL_SUCCESS)
        HSTMT               0x0000007F5ABA1180
        UWORD                        1 
        SWORD *             0x0000007F580E2CA2 (93)
        SQLULEN *           0x0000007F580E2CA8 (27)
        SWORD *             0x0000007F580E2CB0 (7)
        SWORD *             0x0000007F5770EF38 (1)
ghost commented 2 years ago

Thanks. fast_executemany = False helps.

According to Driver-support-for-fast_executemany this shouldn't impact speed significantly. One might put a note on that page linking this exceptional case.

gordthompson commented 2 years ago

The wiki page has been updated.

v-chojas commented 2 years ago

https://docs.microsoft.com/en-us/sql/odbc/reference/appendixes/decimal-digits "The number of digits to the right of the decimal point in the seconds part of the value (fractional seconds). This number cannot be negative."

Looks like buggy ODBC driver.