sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License
1.85k stars 146 forks source link

Cannot connect to MSSQL: `Timed out in bb8` #560

Open jmakov opened 6 months ago

jmakov commented 6 months ago

What language are you using?

Python

What version are you using?

0.3.2

What database are you using?

mssql

What dataframe are you using?

Polars

Can you describe your bug?

Establishing a connection with pyodbc works, times out with connector-x

What are the steps to reproduce the behavior?

import urllib

import connectorx

driver = "ODBC Driver 18 for SQL Server"
username = 'myusn'
password = 'my_pass_with_special_chars_e.g._#'
password_url_encoded = urllib.parse.quote_plus(password)  # required because of special chars in passw
server = "192.168.254.254\\MYSQLINSTANCE1"
server_url_encoded = urllib.parse.quote_plus(server)
database = 'MYDB1'
query = "select top 10 * from test.test2"

# this works
connstr = "DRIVER=" + driver + ";" + f"SERVER={server};DATABASE={database};UID={username};PWD={password};ENCRYPT=no"
conn = pyodbc.connect(connstr)

# this times out (also when using `=no` instead of `=false` or without both flags)
connstr2 = f"mssql://{username}:{password_url_encoded}@{server_url_encoded}/{database}?encrypt=false&trusted_connection=false"
connectorx.read_sql(connstr2, query)
Database setup if the error only happens on specific data or data type

Not applicable

Example query / code
"select top 10 * from test.test2"

What is the error?

RuntimeError                              Traceback (most recent call last)
Cell In[102], line 2
      1 connstr2 = f"mssql://{username}:{password_url_encoded}@{server_url_encoded}/{database}?encrypt=false&trusted_connection=false"
----> 2 connectorx.read_sql(connstr2, query)

File ~/mambaforge-pypy3/envs/venv/lib/python3.10/site-packages/connectorx/__init__.py:264, in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
    261 except ModuleNotFoundError:
    262     raise ValueError("You need to install pandas first")
--> 264 result = _read_sql(
    265     conn,
    266     "pandas",
    267     queries=queries,
    268     protocol=protocol,
    269     partition_query=partition_query,
    270 )
    271 df = reconstruct_pandas(result)
    273 if index_col is not None:

RuntimeError: Timed out in bb8
nixent commented 6 months ago

Same issue here. I believe it is caused by incorrect handling of connection string when connecting to a specific instance of SQL server. In you case server = "192.168.254.254\\MYSQLINSTANCE1" and database = 'MYDB1', MYSQLINSTANCE1 is not parsed correctly.

lyngc commented 6 months ago

@wangxiaoying This is an ongoing issue that has been reported several times. Is it something you will look into?

nixent commented 6 months ago

@lyngc SQL server uses instance mechanism as kind of proxy for inbound port 1433 to relay it to a port that corresponds to actual instance. You can find "instance" port by running:

USE master
GO
xp_readerrorlog 0, 1, N'Server is listening on', N'any', NULL, NULL, N'asc' 
GO

and then connect to the server:port without specifying instance. It is sort of workaround until the issue with parsing of connection string is fixed.

floriandeutsch89 commented 3 months ago

@lyngc SQL server uses instance mechanism as kind of proxy for inbound port 1433 to relay it to a port that corresponds to actual instance. You can find "instance" port by running:

USE master
GO
xp_readerrorlog 0, 1, N'Server is listening on', N'any', NULL, NULL, N'asc' 
GO

and then connect to the server:port without specifying instance. It is sort of workaround until the issue with parsing of connection string is fixed.

But the port will change when the instance restarts. Hope this will be fixed soon. But it seems like MSSQL is not highest priority.