mkleehammer / pyodbc

Python ODBC bridge
https://github.com/mkleehammer/pyodbc/wiki
MIT No Attribution
2.94k stars 562 forks source link

Insert fail with unicode string "μg" #947

Closed absci closed 3 years ago

absci commented 3 years ago

Please first make sure you have looked at:

Environment

To diagnose, we usually need to know the following, including version numbers. On Windows, be sure to specify 32-bit Python or 64-bit:

Issue

For example, a table was created with this statement.

CREATE TABLE [product] ([id] int NOT NULL PRIMARY KEY IDENTITY (1, 1), [price] int NULL, [unit] nvarchar(15) NULL, CONSTRAINT [unicode_unit_list] CHECK (([unit] IS NULL OR [unit] IN ('μg/mL', 'ng/mL'))))

Insert with pyODBC

import pyodbc

conn = pyodbc.connect('DRIVER=ODBC Driver 17 for SQL Server...')
with conn.cursor() as cursor:
    sql = "INSERT INTO [product] ([price], [unit]) VALUES (?, ?);"
    cursor.execute(sql, (10, 'μg/mL'))

pyodbc.IntegrityError: ('23000', '[23000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The INSERT statement conflicted with the CHECK constraint "unicode_unit_list". The conflict occurred in database "test_unicode", table "dbo.product", column \'unit\'. (547) (SQLExecDirectW)')

The insert will be executed successfully with UTF-8 collation like Latin1_General_100_CI_AI_KS_SC_UTF8. But it may not work for SQL Server < 2019, since those don't have UTF-8 collation.

v-chojas commented 3 years ago

Could you post an ODBC trace?

gordthompson commented 3 years ago

I am able to reproduce the issue

platform: Linux Python version: 3.8.10 (default, Jun 2 2021, 10:49:15) [GCC 9.4.0] pyodbc version: 4.0.32 ODBC driver: libmsodbcsql-17.8.so.1.1 (17.08.0001)

import platform
import sys

import pyodbc

print(f"platform: {platform.system()}")
sys_version = sys.version.replace("\n", "")
print(f"Python version: {sys_version}")
print(f"pyodbc version: {pyodbc.version}")

table_name = "product"

cnxn = pyodbc.connect(
    "DRIVER=ODBC Driver 17 for SQL Server;"
    "SERVER=192.168.0.199;"
    "DATABASE=test;"
    "UseFMTONLY=yes;"
    "UID=scott;PWD=tiger^5HHH;",
    autocommit=True
)
print(
    f"ODBC driver: {cnxn.getinfo(pyodbc.SQL_DRIVER_NAME)}"
    f" ({cnxn.getinfo(pyodbc.SQL_DRIVER_VER)})"
)

crsr = cnxn.cursor()
crsr.execute(f"DROP TABLE IF EXISTS {table_name}")
crsr.execute(f"""\
CREATE TABLE [{table_name}] (
[id] int NOT NULL PRIMARY KEY IDENTITY (1, 1), 
[price] int NULL, 
[unit] nvarchar(15) NULL, 
CONSTRAINT [unicode_unit_list] CHECK (
    ([unit] IS NULL OR [unit] IN ('μg/mL', 'ng/mL'))
    )
)
""")

crsr.execute(
    f"INSERT INTO [{table_name}] ([price], [unit]) VALUES (?, ?);",
    (10, 'μg/mL')
)

odbctrace.log

gordthompson commented 3 years ago

… and the fix is to declare the CHECK constraint using N'literals'

crsr.execute(f"""\
CREATE TABLE [{table_name}] (
[id] int NOT NULL PRIMARY KEY IDENTITY (1, 1), 
[price] int NULL, 
[unit] nvarchar(15) NULL, 
CONSTRAINT [unicode_unit_list] CHECK (
    ([unit] IS NULL OR [unit] IN (N'μg/mL', N'ng/mL'))
    )
)
""")
v-chojas commented 3 years ago

Yes, without the N the character constants will be interpreted in what the current narrow encoding - on the _server- - happens to be, regardless what it appears as on the client.

        Entry:
            Statement = 0x1a2def0
            Param Number = 2
            Param Type = 1
            C Type = -8 SQL_C_WCHAR
            SQL Type = -9 SQL_WVARCHAR
            Col Def = 10
            Scale = 0
            Rgb Value = 0x7f8a3bcc9fe0
            Value Max = 10
            StrLen Or Ind = 0x1a8e6b0
[ODBC][15542][1630510425.866893][SQLBindParameter.c][434]
        Exit:[SQL_SUCCESS]

pyODBC and driver is behaving as expected - parameter is being sent as Unicode.