blue-yonder / turbodbc

Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
http://turbodbc.readthedocs.io/en/latest
MIT License
623 stars 85 forks source link

[MSSQL] Cursor/connection scope issues #90

Open dirkjonker opened 7 years ago

dirkjonker commented 7 years ago

Multiple sequential commands using the same cursor object do not seem to work as expected. The below examples work fine with pyodbc.

Example 1: local temporary tables

(a local temp table starts with # and is only visible to the session that creates it)

In [19]: cur.execute("CREATE TABLE #test_some (val NVARCHAR(100));")
Out[19]: <turbodbc.cursor.Cursor at 0x111a0efd0>

In [20]: cur.execute("INSERT INTO #test_some (val) VALUES (?)", ("bar",))
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
/Users/Dirkjonker/.pyenv/versions/3.5.2/envs/turbo/lib/python3.5/site-packages/turbodbc/exceptions.py in wrapper(*args, **kwds)
     34         try:
---> 35             return f(*args, **kwds)
     36         except InternError as e:

/Users/Dirkjonker/.pyenv/versions/3.5.2/envs/turbo/lib/python3.5/site-packages/turbodbc/cursor.py in execute(self, sql, parameters)
     71             buffer.add_set(parameters)
---> 72             buffer.flush()
     73         self.impl.execute()

Error: ODBC error
state: 42S02
native error code: 208
message: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name '#test_some'.

During handling of the above exception, another exception occurred:

DatabaseError                             Traceback (most recent call last)
<ipython-input-20-725d00a6bff6> in <module>()
----> 1 cur.execute("INSERT INTO #test_some (val) VALUES (?)", ("bar",))

/Users/Dirkjonker/.pyenv/versions/3.5.2/envs/turbo/lib/python3.5/site-packages/turbodbc/exceptions.py in wrapper(*args, **kwds)
     35             return f(*args, **kwds)
     36         except InternError as e:
---> 37             raise DatabaseError(str(e))
     38     return wrapper

DatabaseError: ODBC error
state: 42S02
native error code: 208
message: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name '#test_some'.

In [21]: 

Example 2: identity insert

To be able to insert a value in an identity column (like an auto-increment column) you need to explicitly set IDENTITY_INSERT on that specific table to ON

cur.execute("SET IDENTITY_INSERT ##test_some ON")
Out[38]: <turbodbc.cursor.Cursor at 0x111a0efd0>

In [39]: cur.execute("INSERT INTO ##test_some (val) VALUES (?)", (1,))
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
/Users/Dirkjonker/.pyenv/versions/3.5.2/envs/turbo/lib/python3.5/site-packages/turbodbc/exceptions.py in wrapper(*args, **kwds)
     34         try:
---> 35             return f(*args, **kwds)
     36         except InternError as e:

/Users/Dirkjonker/.pyenv/versions/3.5.2/envs/turbo/lib/python3.5/site-packages/turbodbc/cursor.py in execute(self, sql, parameters)
     71             buffer.add_set(parameters)
---> 72             buffer.flush()
     73         self.impl.execute()

Error: ODBC error
state: 23000
native error code: 544
message: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Cannot insert explicit value for identity column in table '##test_some' when IDENTITY_INSERT is set to OFF.

During handling of the above exception, another exception occurred:

DatabaseError                             Traceback (most recent call last)
<ipython-input-39-ca28f00caca6> in <module>()
----> 1 cur.execute("INSERT INTO ##test_some (val) VALUES (?)", (1,))

/Users/Dirkjonker/.pyenv/versions/3.5.2/envs/turbo/lib/python3.5/site-packages/turbodbc/exceptions.py in wrapper(*args, **kwds)
     35             return f(*args, **kwds)
     36         except InternError as e:
---> 37             raise DatabaseError(str(e))
     38     return wrapper

DatabaseError: ODBC error
state: 23000
native error code: 544
message: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Cannot insert explicit value for identity column in table '##test_some' when IDENTITY_INSERT is set to OFF.

I'll try investigating the ODBC trace, but did not find anything special at first glance.

MathMagique commented 7 years ago

Hm, perhaps this has something to do with missing autocommits? The operations seem to involve the Data Definition Language part of SQL, and some databases require commits after such operations to become effective. I'll try to check things tomorrow.

MathMagique commented 7 years ago
>>> cur.execute("CREATE TABLE test (a integer)")
<turbodbc.cursor.Cursor object at 0x102587950>
>>> cur.execute("INSERT INTO test VALUES (?)", [42])
<turbodbc.cursor.Cursor object at 0x102587950>
>>> cur.execute("CREATE TABLE #test (a integer)")
<turbodbc.cursor.Cursor object at 0x102587950>
>>> cur.execute("INSERT INTO #test VALUES (?)", [42])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "turbodbc/exceptions.py", line 51, in wrapper
    raise DatabaseError(str(e))
turbodbc.exceptions.DatabaseError: ODBC error
state: 42S02
native error code: 208
message: [FreeTDS][SQL Server]Invalid object name '#test'.

Things work for a regular table, but not for one with the fancy temporary name thing I have never seen before. Hm...

keitherskine commented 7 years ago

Just a quick suggestion (apologies for not trying this out myself first), if you commit() after creating the temporary table and before the insert, does the insert then work?

Also, what happens if you try global temporary tables (which are visible to everybody, not just the session owner)? That is use ##test instead of #test.

MathMagique commented 7 years ago

Okay, I looked through the trace outputs. I think the difference is that turbodbc closes the statement handle between calls to execute(), while pyodbc merely calls SQLFreeStmt(). That may render the temporary table void.

@keitherskine It works with ##test. Commit has no effect :-(

MathMagique commented 7 years ago

@dirkjonker Could you please split off example number 2 into a different issue, so I don't accidently ignore it?

dirkjonker commented 7 years ago

Yes, I see the same behavior in the odbc trace! And about the other issue, I just created #93

MathMagique commented 6 years ago

Hi! Sorry for the long delay. I did some further digging, using a pure ODBC approach, no turbodbc involved at all. Here is a C++ program:

#include <sql.h>
#include <sqlext.h>

#include <sstream>

std::string to_string(SQLRETURN code)
{
    switch (code) {
        case SQL_SUCCESS: return "SUCCESS";
        case SQL_SUCCESS_WITH_INFO: return "SUCCESS_WITH_INFO";
        case SQL_ERROR: return "ERROR";
        case SQL_NO_DATA: return "NO DATA";
        default:
            std::ostringstream output;
            output << code;
            return output.str();
    }
}

void print_error(SQLSMALLINT handle_type, SQLHANDLE & handle)
{
    SQLCHAR status_code[6];
    SQLINTEGER native_error = 0;
    SQLCHAR message[4097];
    SQLSMALLINT message_length = 0;

    SQLRETURN result = 0;
    result = SQLGetDiagRec(handle_type, handle, 1, status_code, &native_error, message, 4096, &message_length);
    std::cout << "SQLGetDiagRec(): " << to_string(result) << std::endl;
    if ((result == SQL_SUCCESS) || (result == SQL_SUCCESS_WITH_INFO)){
        std::cout << "   Error: " << status_code << " (" << native_error << "): " << message << std::endl;
    } else if (result == SQL_INVALID_HANDLE) {
        std::cout << "  INVALID HANDLE" << std::endl;
    } else if (result == SQL_ERROR) {
        std::cout << "  ERROR" << std::endl;
    } else {
        std::cout << "  NO DATA: " << status_code << std::endl;
    }
}

SQLCHAR * to_pointer(std::string const & s)
{
    return reinterpret_cast<SQLCHAR *>(const_cast<char *>(s.c_str()));
}

int main()
{
    std::cout << "\n\n\n*****************************" << std::endl;
    SQLRETURN result = 0;
    SQLHENV environment_handle;

    result = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &environment_handle);
    std::cout << "SQLAllocHandle(): " << to_string(result) << std::endl;

    result = SQLSetEnvAttr(environment_handle, SQL_ATTR_ODBC_VERSION, reinterpret_cast<SQLPOINTER>(SQL_OV_ODBC3), 0);
    std::cout << "SQLSetEnvAttr(environment): " << to_string(result) << std::endl;

    SQLHDBC connection_handle = 0;
    result = SQLAllocHandle(SQL_HANDLE_DBC, environment_handle, &connection_handle);
    std::cout << "SQLAllocHandle(connection): " << to_string(result) << std::endl;

    std::string connection_string("dsn=MSSQL;uid=user;pwd=password");
    SQLCHAR out_connection_string[1025];
    SQLSMALLINT out_connection_string_size = 0;
    result = SQLDriverConnect(connection_handle, SQL_NULL_HANDLE, reinterpret_cast<SQLCHAR*>(const_cast<char *>(connection_string.c_str())), connection_string.size(), out_connection_string, 1024, &out_connection_string_size, SQL_DRIVER_NOPROMPT);
    std::cout << "SQLDriverConnect(): " << to_string(result) << std::endl;

    SQLHSTMT statement_handle = 0;
    result = SQLAllocHandle(SQL_HANDLE_STMT, connection_handle, &statement_handle);
    std::cout << "SQLAllocHandle(statement): " << to_string(result) << std::endl;

    std::string const create_query("CREATE TABLE #temp_table (A integer)");

    // ****** BLOCK A *****
    result = SQLExecDirect(statement_handle, to_pointer(create_query), create_query.size());
    std::cout << "SQLExecDirect('" << create_query << "'): " << to_string(result) << std::endl;

    // ****** BLOCK B *****
    result = SQLPrepare(statement_handle, to_pointer(create_query), create_query.size());
    std::cout << "SQLPrepare('" << create_query << "'): " << to_string(result) << std::endl;
    result = SQLExecute(statement_handle);
    std::cout << "SQLExecute(statement): " << to_string(result) << std::endl;

    std::string const select_query("SELECT * FROM #temp_table");
    result = SQLExecDirect(statement_handle, to_pointer(select_query), select_query.size());
    std::cout << "SQLExecDirect('" << select_query << "'): " << to_string(result) << std::endl;
    print_error(SQL_HANDLE_STMT, statement_handle);

    result = SQLFreeHandle(SQL_HANDLE_STMT, statement_handle);
    std::cout << "SQLFreeHandle(statement): " << to_string(result) << std::endl;

    result = SQLDisconnect(connection_handle);
    std::cout << "SQLDisconnect(connection): " << to_string(result) << std::endl;

    result = SQLFreeHandle(SQL_HANDLE_DBC, connection_handle);
    std::cout << "SQLFreeHandle(connection): " << to_string(result) << std::endl;

    result = SQLFreeHandle(SQL_HANDLE_ENV, environment_handle);
    std::cout << "SQLFreeHandle(environment): " << to_string(result) << std::endl;
    return 0;
}

It is not pretty, I know. Error handling is sprinkled where I needed it. The important thing is the presence of the Block A and Block B instructions. If I run the program with Block A (Block B commented out), I get the following output:

*****************************
SQLAllocHandle(): SUCCESS
SQLSetEnvAttr(environment): SUCCESS
SQLAllocHandle(connection): SUCCESS
SQLDriverConnect(): SUCCESS
SQLAllocHandle(statement): SUCCESS
SQLExecDirect('CREATE TABLE #temp_table (A integer)'): NO DATA
SQLExecDirect('SELECT * FROM #temp_table'): SUCCESS
SQLGetDiagRec(): NO DATA
  NO DATA: 00000
SQLFreeHandle(statement): SUCCESS
SQLDisconnect(connection): SUCCESS
SQLFreeHandle(connection): SUCCESS
SQLFreeHandle(environment): SUCCESS

All is well. If I run the program with Block B (Block A commented out), I get the following output:

*****************************
SQLAllocHandle(): SUCCESS
SQLSetEnvAttr(environment): SUCCESS
SQLAllocHandle(connection): SUCCESS
SQLDriverConnect(): SUCCESS
SQLAllocHandle(statement): SUCCESS
SQLPrepare('CREATE TABLE #temp_table (A integer)'): SUCCESS
SQLExecute(statement): NO DATA
SQLExecDirect('SELECT * FROM #temp_table'): ERROR
SQLGetDiagRec(): SUCCESS
   Error: 42S02 (208): [FreeTDS][SQL Server]Invalid object name '#temp_table'.
SQLFreeHandle(statement): SUCCESS
SQLDisconnect(connection): SUCCESS
SQLFreeHandle(connection): SUCCESS
SQLFreeHandle(environment): SUCCESS

This one fails with the same error message as turbodbc. The issue lies in that Block A uses SQLExecDirect() to execute the create statement directly without preparing it, while Block B uses SQLPrepare() to prepare the query followed by SQLExecute() to execute the prepared query.

Honestly, I do not think that this is sane behavior on the driver side (FreeTDS in my example). I'd rather report this upstream than have turbodbc maintaining code paths for statements that require parameters and preparing, and those that do not.

@dirkjonker What is your take on this?

dirkjonker commented 6 years ago

Thanks @MathMagique for the investigation. Indeed it looks like odd behavior, I will take a closer look later today. The code example is very useful, I'm not a C++ developer but I can work with that easily.

dirkjonker commented 6 years ago

So far I can't find any documented reason why the behavior for SQLExecDirect would be different from a Prepare and then Execute...

MathMagique commented 6 years ago

Oddly enough, both versions work if you use ##temp_table.

keitherskine commented 6 years ago

I believe this is probably SQL Server behaviour, rather than a driver quirk. In SQL Server, tables prefixed with "#" are visible only within a session. When the session expires, the table is dropped. Tables prefixed with "##" are visible across sessions. When SQL Server executes a parameterised query, SQL Server wraps that query within a session. Hence when that query completes, the session expires, and any "#" tables expire with it. As follows:

import pyodbc
conn = pyodbc.connect('...', autocommit=True)
cursor = conn.cursor()

cursor.execute("SELECT CAST(1 as int) AS A INTO #t1")  # succeeds
cursor.execute("SELECT * FROM #t1").fetchall()  # this query succeeds

cursor.execute("SELECT CAST(? as int) AS A INTO #t2", (1,))  # apparently succeeds
cursor.execute("SELECT * FROM #t2").fetchall()  # fails with: pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][SQL Server Native Client 10.0][SQL Server]Invalid object name '#t2'. (208) (SQLExecDirectW)")

cursor.execute("SELECT CAST(? as int) AS A INTO ##t3", (1,))
cursor.execute("SELECT * FROM ##t3").fetchall()  # succeeds

Table #t2 is dropped as soon as the query completes, hence the "select" query fails. Querying ##t3 is fine though.

MathMagique commented 6 years ago

Hm, even though it sucks, I could use the presence of parameters to use either direct execute or prepared execution. It is odd behavior on MSSQL side, though :joy:

dirkjonker commented 6 years ago

Still, the "local" temporary table (prefixed with a single # is supposed to be visible throughout the duration of the connection:

Local temporary tables have a single number sign (#) as the first character of their names; they are visible only to the current connection for the user, and they are deleted when the user disconnects from the instance of SQL Server. Source: https://docs.microsoft.com/en-us/sql/relational-databases/tables/tables

I see the same behavior with the Microsoft ODBC driver, so I'm not sure if it is a driver problem, perhaps it is related to SQL Server itself.

Based on the code example, I tried (combinations of) the following, without success:

At this point I'm not quite sure what would be the best way to handle this. Maybe give the user the option to run only direct executions? As Turbodbc is designed for large buffers, there may not be many people that benefit from preparing and executing the same query many times with different parameters. Still, that may only be useful for this particular issue. Not sure if there would be any other advantages.

Let's see if we can find someone with some deeper knowledge of ODBC/SQL Server to clarify this issue.