mkleehammer / pyodbc

Python ODBC bridge
https://github.com/mkleehammer/pyodbc/wiki
MIT No Attribution
2.88k stars 562 forks source link

Byte array data from sybase database using pyodbc was consistently truncated at exactly 32768 bytes #1226

Closed mohamedamine92 closed 1 year ago

mohamedamine92 commented 1 year ago

I have PDF files stored as image datatype (byte array data as mentioned in the doc) in a sybase database table. I am trying to read one of those files from the db and write it to a file in a local folder using python pyodbc package like this example :

 driver = "FreeTDS"
 prt = 'port'
 db = 'db'
 passwd = 'passwd'
 usr = 'usr'
 serv = 'serv'
 conn = pyodbc.connect(driver=driver, server=serv, port=prt, uid=usr, pwd=passwd)
 sql_query = (
    "SELECT ARCH_DOC_DOC as file_content FROM table_name WHERE ARCH_DOC_ID = id"
 )
 cursor = conn.cursor()
 cursor.execute(sql_query)
 pdf_data = cursor.fetchone()[0]
 with open('my_test_file.pdf', 'wb') as f:
     f.write(pdf_data)

I am using TDS driver 1.2.3, pyodbc 4.0.35 and running this code on Debian GNU/Linux 11 machine. The problem is that byte array retrieved by pyodbc was consistently truncated at exactly 32768 bytes. So for example, the original file size that I am using to test is 90ko in the db and the file I am getting is only 33ko.

The issue was reproduced with unixODBC 2.3.6, FreeTDS 1.2.3, and pyodbc 4.0.39. Upgrading to the latest version of FreeTDS (1.3.18) did not resolve the issue. Neither did upgrading to the latest version of unixODBC (2.3.11). Building FreeTDS without --enable-sybase-compat also didn't fix it, nor did using SAP's ODBC driver for ASE on Linux (libsybdrvodb-sqllen8.so).

gordthompson commented 1 year ago

Repro code using SAP ASE ODBC driver for linux (libsybdrvodb-sqllen8.so), unixODBC 2.3.7, and pyodbc 4.0.39 on CentOS_7:

import pyodbc

cnxn = pyodbc.connect(
    "Driver=Sybase;"
    "Server=192.168.0.199;Port=5000;"
    "UID=sa;PWD=myPassword;"
    "Database=mydb"
)
crsr = cnxn.cursor()
print(crsr.execute(
    "SELECT datalength(ARCH_DOC_DOC) AS dl FROM so76408133 WHERE ARCH_DOC_ID = 1"
).fetchval())
# 60343

pdf_data = crsr.execute(
    "SELECT ARCH_DOC_DOC FROM so76408133 WHERE ARCH_DOC_ID = 1"
).fetchval()
print(len(pdf_data))
# 32768

odbctrace.log

gordthompson commented 1 year ago

A workaround using JayDeBeApi and jTDS is described in the Stack Overflow answer here.

v-chojas commented 1 year ago
[ODBC][3752][1686144472.304716][SQLGetData.c][237]
        Entry:
            Statement = 0x1b16f80
            Column Number = 1
            Target Type = -2 SQL_C_BINARY
            Buffer Length = 4096
            Target Value = 0x1b48780
            StrLen Or Ind = 0x7fff02018b28
[ODBC][3752][1686144472.304997][SQLGetData.c][501]
        Exit:[SQL_SUCCESS_WITH_INFO]                
            Buffer = [BINARYDATA...]                
            Strlen Or Ind = 0x7fff02018b28 -> 32768
        DIAG [01004] [SAP][ASE ODBC Driver]String data, right truncated

[ODBC][3752][1686144472.305150][SQLGetData.c][237]
        Entry:
            Statement = 0x1b16f80
            Column Number = 1
            Target Type = -2 SQL_C_BINARY
            Buffer Length = 28672
            Target Value = 0x1b4af70
            StrLen Or Ind = 0x7fff02018b28
[ODBC][3752][1686144472.305259][SQLGetData.c][501]
        Exit:[SQL_SUCCESS]                
            Buffer = [BINARYDATA...]                
            Strlen Or Ind = 0x7fff02018b28 -> 28672
[ODBC][3752][1686144472.310850]

ODBC driver bug.

" If it is called more than once in succession for the same column, each call returns a successive part of the data. Each call except the last call returns SQL_SUCCESS_WITH_INFO and SQLSTATE 01004 (String data, right truncated); the last call returns SQL_SUCCESS."

https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/getting-long-data?view=sql-server-ver16

v-chojas commented 1 year ago

Driver also claims the total length is 32k on the first SQLGetData call.

gordthompson commented 1 year ago

Seeing as this happens for both the SAP ASE ODBC driver and the FreeTDS ODBC driver, I suspect that it is more of a Sybase bug than a driver bug.

gordthompson commented 1 year ago

Ah, okay, so what we should be seeing is

[ODBC][3752][1686144472.304716][SQLGetData.c][237]
        Entry:
            Statement = 0x1b16f80
            Column Number = 1
            Target Type = -2 SQL_C_BINARY
            Buffer Length = 4096
            Target Value = 0x1b48780
            StrLen Or Ind = 0x7fff02018b28
[ODBC][3752][1686144472.304997][SQLGetData.c][501]
        Exit:[SQL_SUCCESS_WITH_INFO]                
            Buffer = [BINARYDATA...]                
            Strlen Or Ind = 0x7fff02018b28 -> 60343
        DIAG [01004] [SAP][ASE ODBC Driver]String data, right truncated

[ODBC][3752][1686144472.305150][SQLGetData.c][237]
        Entry:
            Statement = 0x1b16f80
            Column Number = 1
            Target Type = -2 SQL_C_BINARY
            Buffer Length = 56247
            Target Value = 0x1b4af70
            StrLen Or Ind = 0x7fff02018b28
[ODBC][3752][1686144472.305259][SQLGetData.c][501]
        Exit:[SQL_SUCCESS]                
            Buffer = [BINARYDATA...]                
            Strlen Or Ind = 0x7fff02018b28 -> 56247
[ODBC][3752][1686144472.310850]
mohamedamine92 commented 1 year ago

A workaround using JayDeBeApi and jTDS is described in the Stack Overflow answer here.

This solution fixed the issue for me. Thanks @gordthompson !