jwcook23 / mssql_dataframe

Update, Upsert, and Merge from Python dataframes to SQL Server and Azure SQL database.
MIT License
11 stars 4 forks source link

Cannot use `datetimeoffset` as part of the index #36

Closed ChesoiCarmen closed 1 year ago

ChesoiCarmen commented 1 year ago

Hello @jwcook23 Not sure if i am doing anything wrong, but it seems that i cannot use a timestamp column in the index. However, i can use a simple date. Here are a couple of examples.

This WORKS

import pandas as pd
from mssql_dataframe import SQLServer

sql = SQLServer()
table_name= '##CarmenTest'

sql.create.table(
    table_name,
    columns = {
        "_nchar": "NCHAR(3)",
        "_datetime": "DATETIME",
        "_numeric": "NUMERIC(5,2)"
    },
    primary_key_column=["_nchar", "_datetime"],
)

df = pd.DataFrame({
    "_nchar": pd.Series(['a', 'b']),
    "_datetime": pd.Series([pd.to_datetime('1900-01-01'), pd.to_datetime('1900-01-01')]),
    "_numeric": pd.Series([1.23, 4.56789], dtype="object")
})
df = df.set_index(["_nchar", "_datetime"])

df = sql.write.insert(table_name, df)

result = sql.read.table(table_name)
print(result)

it prints

                  _numeric
_nchar _datetime          
a      1900-01-01     1.23
b      1900-01-01     4.57

However, this DOESN'T WORK

import pandas as pd
from mssql_dataframe import SQLServer

sql = SQLServer()
table_name= '##CarmenTest'

sql.create.table(
    table_name,
    columns = {
        "_nchar": "NCHAR(3)",
        "_datetimeoffset": "DATETIMEOFFSET",
        "_numeric": "NUMERIC(5,2)"
    },
    primary_key_column=["_nchar", "_datetimeoffset"],
)

df = pd.DataFrame({
    "_nchar": pd.Series(['a', 'b']),
    "_datetimeoffset": pd.Series([pd.Timestamp('1900-01-01 00:00:00.123456789+10:30'), pd.Timestamp('1900-01-01 00:00:00.12-9:15')]),
    "_numeric": pd.Series([1.23, 4.56789], dtype="object")
})
df = df.set_index(["_nchar", "_datetimeoffset"])

df = sql.write.insert(table_name, df)

gives an error message

    raise KeyError(key) from err
KeyError: '_datetimeoffset'

because of this line: https://github.com/jwcook23/mssql_dataframe/blob/main/mssql_dataframe/core/conversion.py#L489 In my case col is in the index so when we try to access dataframe[col] it naturally gives the above error.

Is there a workaround or am i doing something wrong ? Using a datetimeoffset column as part of the index is quite a common usecase.

jwcook23 commented 1 year ago

Hi @ChesoiCarmen. I'll put in a fix for this. I think it should be relatively straightforward. I'll use your example as a test case and make it work. Thanks!

jwcook23 commented 1 year ago

@ChesoiCarmen this should be fixed in the new release 2.0.1

ChesoiCarmen commented 1 year ago

hi Jason I'm sorry for replying so late; I was away. Thanks for this. I will give the new version a go.