mkleehammer / pyodbc

Python ODBC bridge
https://github.com/mkleehammer/pyodbc/wiki
MIT No Attribution
2.88k stars 562 forks source link

Slow Connection from Ubuntu to SQL Server (not from Windows) #1147

Closed nileriver23 closed 1 year ago

nileriver23 commented 1 year ago

I am encountering a significant delay when connecting to my SQL Server databases from Ubuntu. It is about 200 ms more than when connecting from Windows (10 ms vs 200 ms). My current setup for comparison is Windows 11 vs. WSL2 on the same machine. The codebase and queries are identical. I have tried both ODBC Driver 18 for SQL Server and FreeTDS.

Here is how I am determining the time it take to connect:

tic = time.perf_counter()
with pyodbc.connect(getConnectStr(driver)) as conn:
    try:

        cursor = conn.cursor()
        toc = time.perf_counter()
        print(f"Connection: {toc - tic:0.4f} seconds")

In my odbc.ini I have: Trace=no Encrypt=no TrustServerCertificate=yes

In odbcinst.ini I have: UsageCount=1 FileUsage=1 Pooling=yes

v-johoang commented 1 year ago

How long does it take when connecting to Windows? Since you are using WSL2 which uses virtualization, I would not expect it to perform as well as running it natively in Windows.

nileriver23 commented 1 year ago

How long does it take when connecting to Windows? Since you are using WSL2 which uses virtualization, I would not expect it to perform as well as running it natively in Windows.

Windows is about 10 ms and WSL2 is about 200 ms.

v-johoang commented 1 year ago

I timed the duration it takes to connect on Oracle VirtualBox VM Ubuntu and a Windows and they were about the same time using your connection string settings. This appears to be WSL2 just being slow, and not a pyODBC issue.

nileriver23 commented 1 year ago

I have also tested it from a brand new Ubuntu VM on Azure, as well as an Azure App Service. Both connected to a SQL Server on an Azure VM and a SQL Server Managed Instance. They both were significantly slower than when connecting from Windows machines.

gordthompson commented 1 year ago

I have an old notebook that I dual-boot between Windows 8.1 and Ubuntu 20.04. This test removes the question of comparing native Windows against virtualized Ubuntu. Connecting from Ubuntu does seem to be significantly slower, though not 20 times slower as you have observed (more like about 6 times):

on Windows 8.1

> python -m timeit -n 100 -s 'import pyodbc' 'cnxn = pyodbc.connect(\"DSN=mssql_199;UID=scott;PWD=tiger^5HHH\")'
100 loops, best of 5: 13 msec per loop

on Ubuntu 20.04

$ python -m timeit -n 100 -s 'import pyodbc' 'cnxn = pyodbc.connect("DSN=mssql_199;UID=scott;PWD=tiger^5HHH")'
100 loops, best of 5: 82.9 msec per loop

Edit

It occurred to me that Windows enables ODBC connection pooling for "ODBC Driver 17 for SQL Server" while unixODBC does not. After disabling connection pooling on Windows my test results are more like

on Windows 8.1

> python -m timeit -n 100 -s 'import pyodbc' 'cnxn = pyodbc.connect(\"DSN=mssql_199;UID=scott;PWD=tiger^5HHH\")'
100 loops, best of 5: 121 msec per loop

That is, connections on Ubuntu appear to be slightly faster than connections on Windows when a new connection is actually created each time.

v-chojas commented 1 year ago

Can you try isql instead of going through pyODBC? It is always hard to tell what is the cause when there are so many differences in the software, even when the hardware is the same.

gordthompson commented 1 year ago

@v-chojas - I just updated my test results, above.

gordthompson commented 1 year ago

For comparison, here is a test from a Ubuntu 20.04 VirtualBox guest on that same Windows 8.1 host:

on Ubuntu 20.04 (VirtualBox guest)

$ python3 -m timeit -n 100 -s 'import pyodbc' 'cnxn = pyodbc.connect("DSN=mssql_199;UID=scott;PWD=tiger^5HHH")'
100 loops, best of 5: 112 msec per loop
gordthompson commented 1 year ago

@nileriver23 - Does it seem reasonable to you that the difference you were seeing was due to connection pooling being enabled on Windows but not on Linux?

nileriver23 commented 1 year ago

@gordthompson Yes. It looks like after setting up connection pooling in the odbcinst.ini file and setting ansi=True in pyodbc.connect it is working! Thanks for all the help.

odbcinst.ini

[ODBC]
Pooling=Yes

[ODBC Driver 18 for SQL Server]
Description=Microsoft ODBC Driver 18 for SQL Server
Driver=/opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.1.so.2.1
UsageCount=1
CPTimeout=120

Python connection with pyodbc.connect(getConnectStr(driver), ansi=True) as conn: