mkleehammer / pyodbc

Python ODBC bridge
https://github.com/mkleehammer/pyodbc/wiki
MIT No Attribution
2.89k stars 562 forks source link

Truncation error with insert pandas dataframe #1062

Closed lime-n closed 2 years ago

lime-n commented 2 years ago

Environment

To diagnose, we usually need to know the following, including version numbers. On Windows, be sure to specify 32-bit Python or 64-bit:

Issue

As with similar issues like #785 and #579, I am having the following error:

('String data, right truncation: length 6 buffer 2', 'HY000')

Reproducible code:

from sqlalchemy import create_engine
import psycopg2 
import pandas as pd

sql_dict = {'change': {1: -45.59009, 2: -2.740005}, 'changePercent': {1: -2.016279, 2: -1.862556}, 'dayVolume': {1: 3122928.0, 2: 54639168.0}, 'exchange': {1: 'NMS', 2: 'NMS'}, 'id': {1: 'AMZN', 2: 'AAPL'}, 'lastSize': {1: 0.0, 2: 200.0}, 'marketHours': {1: 1, 2: 1}, 'price': {1: 2215.51, 2: 144.37}, 'priceHint': {1: 4, 2: 4}, 'quoteType': {1: 8, 2: 8}, 'time': {1: 3305426450000, 2: 3305426452000}, 'currency': {1: '', 2: ''}, 'dayHigh': {1: 0.0, 2: 0.0}, 'dayLow': {1: 0.0, 2: 0.0}, 'circulatingSupply': {1: 0.0, 2: 0.0}, 'fromcurrency': {1: '', 2: ''}, 'lastMarket': {1: '', 2: ''}, 'marketcap': {1: 0.0, 2: 0.0}, 'shortName': {1: '', 2: ''}, 'volAllCurrencies': {1: 0.0, 2: 0.0}, 'vol_24hr': {1: 0.0, 2: 0.0}, 'ask': {1: 0.0, 2: 0.0}, 'bid': {1: 0.0, 2: 0.0}}

sql_data = pd.DataFrame(sql_dict)

user = 'xxxr'
passw = 'xxx'
host =  'xxx.database.windows.net'  
port = 1433 
database = 'xxx'

engine = create_engine(
    'mssql+pyodbc://' + 
    user + ':' + 
    passw + '@' + 
    host + ':' + 
    str(port) + '/' + 
    database +f'?driver=ODBC+Driver+18+for+SQL+Server' , 
    echo=False,
    connect_args={"timeout":30},
                       pool_pre_ping=True, fast_executemany = True)

sql_dataframe[1:5].to_sql('databaseName', con=engine,index=False, if_exists='append')

My SQL table is composed of VARCHAR, FLOAT and BIGINT, for object, float64 and int64 respectively. I have found that it works if I have TEXT instead of VARCHAR and read it only as chunks. If I do not read as chunks then I get a memory issue. However, I want to know how to upload it all in one go. The issue seems to be with VARCHAR, I think.

Example of my table:

CREATE TABLE StocksRealTime (change FLOAT,changePercent FLOAT,dayVolume FLOAT,exchange TEXT,id TEXT,lastSize FLOAT,marketHours BIGINT,price FLOAT,priceHint BIGINT,quoteType BIGINT,time BIGINT,currency TEXT,dayHigh FLOAT,dayLow FLOAT,circulatingSupply FLOAT,fromcurrency TEXT,lastMarket TEXT,marketcap FLOAT,shortName TEXT,volAllCurrencies FLOAT,vol_24hr FLOAT,ask FLOAT,bid FLOAT,)
lime-n commented 2 years ago

Changing VARCHAR to NVARCHAR(255) solved it for me