Closed Pivert closed 1 year ago
( probably related: #376 )
What do you see when you run this query?
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'bulk' AND COLUMN_NAME IN ('FILENAME', 'COMPANY')
db BULK FILENAME varchar 50 db BULK COMPANY varchar 50
I confirm that encoding the strings in utf-8 as a workaround mentioned in #376 works like a charm.
Back to super performances. Thanks for the workaround !
Note that encoding the string parameters as UTF-8 can cause queries to fail for varchar() columns that contain accented characters and other non-ASCII characters. For example:
# example data
sql = "SELECT ID, COMPANY FROM [bulk]"
results = crsr.execute(sql).fetchall()
print(results) # [(1, 'fermé')]
company = results[0][1]
# string parameter encoded as UTF-8
sql = "SELECT ID FROM [bulk] WHERE COMPANY=?"
results = crsr.execute(sql, company.encode("utf-8")).fetchall()
print(results) # [] ← not found!
Better workarounds are to use CAST() in T-SQL …
sql = "SELECT ID FROM [bulk] WHERE COMPANY=CAST(? AS varchar(10))"
results = crsr.execute(sql, company).fetchall()
print(results) # [(1, )]
… or setinputsizes()
sql = "SELECT ID FROM [bulk] WHERE COMPANY=?"
crsr.setinputsizes([(pyodbc.SQL_VARCHAR, 10)])
results = crsr.execute(sql, company).fetchall()
print(results) # [(1, )]
Environment
Issue
observed behavior : Parameterized queries do not use the index and are dead slow expected behavior : Parameterized queries should be a bit faster than raw ones
The same RAW query properly uses the index and is 1000x faster.
Example code
Results
P.S.: The table contains 6M records, and as an index with the 2 query fields, and the ID field as additional data field.