Can you try adding ;LongAsMax=yes to your connection string and see if that helps?
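For context, a minimal sketch of where that keyword would go; the server, credentials, and database below are placeholders, not from the thread:

import pyodbc

# Hypothetical connection string; only the trailing LongAsMax=yes is the
# suggestion from this comment -- everything else is a placeholder.
cnxn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myserver;Database=test;"
    "UID=user;PWD=password;"
    "LongAsMax=yes;"
)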
That's a good idea. Unfortunately it didn't work. I also upgraded to 18.2.2 (shows as 2018.182.02.01 on Windows) and it has the same problem.
Your repro code gives the same error for me and I have tried a couple of tricks to avoid it (alas, unsuccessfully), but before I go too far down this rabbit hole I do have one question.
["X" * 1000000000]
or, to rephrase,
["X" * 1_000_000_000]
is 1 billion Unicode characters. SQL Server normally treats one Unicode character as 2 bytes, so that would be 2 billion bytes, or 1.9 GiB, not the "500 MB" you mention in your original post. Is your test case purposely 4 times larger than your usage case?
I just tried your code with ["X" * 250_000_000]
(approximately 500 MB) and it did not fail. I note that the largest signed 32-bit integer is 2,147,483,647, or 2.0 GiB, so I'm wondering if we're hitting some sort of internal limit that causes pyodbc to get confused.
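The arithmetic, as a quick sanity check (the conversion factors are standard, not from the thread):

chars = 1_000_000_000
bytes_utf16 = chars * 2           # NVARCHAR: 2 bytes per character
print(bytes_utf16 / 2**30)        # ~1.86 GiB
print(250_000_000 * 2 / 2**20)    # ~476.8 MiB, roughly the "500 MB" case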
2 GB is within the range of what's allowed for NVARCHAR(MAX), which is why the .NET version of the code works.
The string in the code that failed was 538,173,956 characters (UTF-8). We're uploading JSON as NVARCHAR(MAX) as an alternative to TVPs.
A 536,870,911-character string works but a 536,870,912-character string does not.
And good catch on me missing the conversion of chars to bytes... so it's about 1 GB of text that causes it to fail.
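A quick check of that boundary (plain Python arithmetic):

fail = 536_870_912
print(fail == 2**29)    # True: the failing length is a power of two
print(fail * 2)         # 1_073_741_824 bytes at 2 bytes per character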
That's exactly the point where it exceeds 1 GiB, assuming 2 bytes per character. However, UTF-16 encoding can use 4 bytes (surrogate pairs) for characters outside the Basic Multilingual Plane (e.g., emoji) so pyodbc may be "playing it safe" and allowing for 4 bytes per character, which would just hit the 2 GiB limit.
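The surrogate-pair point can be seen directly in Python (standard Unicode behavior, not specific to this thread):

print(len("X".encode("utf-16-le")))             # 2 bytes: Basic Multilingual Plane
print(len("\U0001F600".encode("utf-16-le")))    # 4 bytes: emoji needs a surrogate pair
print(536_870_912 * 4 == 2**31)                 # True: at 4 bytes/char, exactly 2 GiB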
pyodbc does not cause the error when using FreeTDS ODBC 1.3.6 and unixODBC 2.3.9 on Linux.
import pyodbc

cnxn = pyodbc.connect(
    "Driver=FreeTDS;"
    "Server=192.168.0.199;Port=1433;"
    "UID=scott;PWD=tiger^5HHH;"
    "Database=test;"
)
crsr = cnxn.cursor()
# Bind a 600-million-character string (1.2 billion bytes as 2-byte characters)
# to an NVARCHAR(MAX) parameter.
crsr.execute(
    "DECLARE @FOO NVARCHAR(MAX) = ?",
    ["X" * 600_000_000]
)
[ODBC][3312][1689342866.647757][SQLBindParameter.c][217]
Entry:
Statement = 0x56273645f8c0
Param Number = 1
Param Type = 1
C Type = -8 SQL_C_WCHAR
SQL Type = -10
Col Def = 1200000000
Scale = 0
Rgb Value = 0x562736462d20
Value Max = 8
StrLen Or Ind = 0x562736462d48
[ODBC][3312][1689342866.689437][SQLBindParameter.c][434]
Exit:[SQL_SUCCESS]
Thx. This inspired me to check the SQLBindParameter calls in the logs I attached, and the SQLULEN and SQLLEN values are different. SQLULEN (the ColumnSize argument): pyodbc sends 2,000,000,000 while Microsoft's C# implementation sends 1,000,000,000. SQLLEN (the BufferLength argument): pyodbc sends 8 while Microsoft's C# implementation sends 2,000,000,002.
Why the differences?
ConsoleApp1 a68-36d4 ENTER SQLBindParameter
HSTMT 0x0000015BFD991CC0
UWORD 1
SWORD 1
ConsoleApp1 a68-36d4 EXIT SQLBindParameter with return code 0 (SQL_SUCCESS)
test_pyodbc 8ba4-3970 ENTER SQLBindParameter
HSTMT 0x0000019BD6920F60
UWORD 1
SWORD 1
test_pyodbc 8ba4-3970 EXIT SQLBindParameter with return code -1 (SQL_ERROR)
Ignore the second-to-last parameter; that's only used for output parameters.
But a column size of 2,000,000,000 is certainly far beyond the limit for ntext, which is what SQL_WLONGVARCHAR is mapped to: https://learn.microsoft.com/en-us/sql/t-sql/data-types/ntext-text-and-image-transact-sql?view=sql-server-ver16
The fact that it is twice as large as it should be suggests another bytes-versus-characters confusion; you may try using setinputsizes() to override the column size.
Yep,
crsr.setinputsizes([(pyodbc.SQL_WVARCHAR, 0)])  # per-parameter override: (SQL type, column size)
crsr.execute(
    "DECLARE @FOO NVARCHAR(MAX)=?",
    ["X" * 600_000_000]
)
seems to work. Note that's SQL_WVARCHAR, not SQL_WLONGVARCHAR.
Thanks, that fixes it. Is there any reason why I shouldn't use pyodbc.SQL_WVARCHAR for all strings I send to Microsoft SQL Server via pyodbc?
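For what it's worth, if one did want to apply that override to every string parameter, a minimal sketch might look like this; execute_with_wvarchar is a hypothetical helper, not a pyodbc API, and nothing in this thread confirms the override is always safe:

import pyodbc

def execute_with_wvarchar(crsr, sql, params):
    # Hypothetical helper: force SQL_WVARCHAR for every str parameter,
    # leaving non-string parameters at pyodbc's defaults (None = no override).
    crsr.setinputsizes([
        (pyodbc.SQL_WVARCHAR, 0) if isinstance(p, str) else None
        for p in params
    ])
    return crsr.execute(sql, params)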
Environment
Attached ODBC trace logs: DOTNET.SQL.LOG and PYODBC.SQL.LOG.
Issue
When uploading large text (about 500 MB in my case), pyodbc fails with pyodbc.Error: ('HY104', '[HY104] [Microsoft][ODBC Driver 18 for SQL Server]Invalid precision value (0) (SQLBindParameter)')
Below is Python code that replicates the problem, as well as equivalent C# code that doesn't. I've attached the ODBC trace logs for both. The expected behavior is that it shouldn't fail.
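The original repro snippet is not preserved in this thread; a minimal reconstruction from the fragments quoted above, with placeholder server name and credentials:

import pyodbc

# Reconstructed repro, assuming ODBC Driver 18 for SQL Server; the server,
# credentials, and database below are placeholders, not from the report.
cnxn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myserver;Database=test;"
    "UID=user;PWD=password;"
)
crsr = cnxn.cursor()
# Fails with HY104 "Invalid precision value (0)" once the string reaches
# 536_870_912 characters (1 GiB at 2 bytes per character).
crsr.execute(
    "DECLARE @FOO NVARCHAR(MAX) = ?",
    ["X" * 1_000_000_000]
)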