My MSSQL Server database uses ServerName\InstanceName as its host. When I pass that value as the host in mssql://host:port/db?trusted_connection=true, the Polars read_database() and read_database_uri() functions raise "RuntimeError: parse error: invalid domain character".
If the MSSQL Server is set up with ServerName only, it works fine.
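The wording of the message suggests that the failure happens while the connection URI is being parsed, before any connection to SQL Server is attempted. The sketch below is a guess at the cause, not a confirmed diagnosis: it assumes the parser follows the WHATWG URL Standard, which lists the backslash among the characters forbidden in a host. The set `FORBIDDEN_HOST_CHARS` and the helper `forbidden_host_chars` are hypothetical names used only for illustration, and `MYSERVER` / `MYINSTANCE` are placeholders.

```python
# Assumption: connectorx's URI parser rejects the "forbidden host code points"
# listed in the WHATWG URL Standard. This set is copied from that list and is
# not taken from connectorx's source code.
FORBIDDEN_HOST_CHARS = set("\x00\t\n\r #/:<>?@[\\]^|")


def forbidden_host_chars(host: str) -> list[str]:
    """Return the characters in `host` that a strict URL parser would likely reject."""
    return sorted({ch for ch in host if ch in FORBIDDEN_HOST_CHARS})


# A named instance carries a backslash, which is on the forbidden list.
print(forbidden_host_chars("MYSERVER\\MYINSTANCE"))  # ['\\']

# A plain server name contains nothing objectionable.
print(forbidden_host_chars("MYSERVER"))  # []
```

If this assumption holds, it would explain why the ServerName-only setup works while ServerName\InstanceName does not.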
What are the steps to reproduce the behavior?
If possible, please include a minimal simple example including:
Database setup if the error only happens on specific data or data type
Table schema and example data
Example query / code
import polars as pl
import time
import connectorx as cx  # not called directly; Polars uses it as the default engine
import pyarrow  # not called directly

rdb_type = 'mssql'
# Raw string so the backslash between server and instance is kept literally
server_name = r'<servername>\<instancename>'
port = 1433  # usually 1433
database_name = 'AdventureWorksDW2022'
uri = f"{rdb_type}://{server_name}:{port}/{database_name}?trusted_connection=true"

query = """
SELECT ProductKey, DateKey, MovementDate, UnitCost, UnitsIn, UnitsOut, UnitsBalance
FROM AdventureWorksDW2022.dbo.FactProductInventory;
"""

start_time = time.time()
df = pl.read_database_uri(query, uri)  # by default Polars uses connectorx as its connection engine
execution_time = time.time() - start_time
print(f'Reading data from the FactProductInventory table in the {database_name} database, in MSSQL Server, takes {execution_time} seconds')
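Until the URI form above accepts a named instance, one possible way around it is to skip connectorx and go through ODBC instead. The following is only a sketch and has not been tried against this setup: it assumes SQLAlchemy and pyodbc are installed, that the driver is called "ODBC Driver 17 for SQL Server" on the machine, and it uses a hypothetical helper name `named_instance_odbc_uri` and placeholder server and instance names. It builds a SQLAlchemy-style URI in which the whole ODBC connection string, backslash included, is percent-encoded into a query parameter.

```python
from urllib.parse import quote_plus


def named_instance_odbc_uri(server, instance, database,
                            driver="ODBC Driver 17 for SQL Server"):
    """Build a mssql+pyodbc URI that carries a raw ODBC connection string.

    The driver name is an assumption; use whichever ODBC driver is installed.
    """
    odbc = (
        f"DRIVER={{{driver}}};"
        f"SERVER={server}\\{instance};"   # named instance: ServerName\InstanceName
        f"DATABASE={database};"
        "Trusted_Connection=yes;"         # Windows authentication
    )
    # quote_plus encodes the backslash as %5C, so it never appears in the URI itself
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc)


uri = named_instance_odbc_uri("MYSERVER", "MYINSTANCE", "AdventureWorksDW2022")
print(uri)
```

With SQLAlchemy available, a URI like this could be passed to `sqlalchemy.create_engine(uri)` and the resulting engine handed to `pl.read_database(query, engine)`. This changes the connection engine, so it does not fix the parse error in the original call.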
What is the error?
Show the error result here.
RuntimeError Traceback (most recent call last)
Cell In[8], line 7
2 query = """
3 SELECT ProductKey, DateKey, MovementDate, UnitCost, UnitsIn, UnitsOut, UnitsBalance
4 FROM AdventureWorksDW2022.dbo.FactProductInventory;
5 """
6 start_time = time.time()
----> 7 df = pl.read_database_uri(query, uri)# by default Polars uses connectorx as its connection engine
8 execution_time = (time.time() - start_time)
10 print(f'Reading data from the FactProductInventory table in the {database_name} database, in MSSQL Server, takes {execution_time} seconds')
What language are you using?
Python.
What version are you using?
0.3.2
What database are you using?
MSSQL
What dataframe are you using?
Polars
File ~\AppData\Roaming\Python\Python310\site-packages\polars\io\database.py:450, in read_database_uri(query, uri, partition_on, partition_range, partition_num, protocol, engine, schema_overrides)
447 engine = "connectorx"
449 if engine == "connectorx":
--> 450 return _read_sql_connectorx(
451 query,
452 connection_uri=uri,
453 partition_on=partition_on,
454 partition_range=partition_range,
455 partition_num=partition_num,
456 protocol=protocol,
457 schema_overrides=schema_overrides,
458 )
459 elif engine == "adbc":
460 if not isinstance(query, str):
File ~\AppData\Roaming\Python\Python310\site-packages\polars\io\database.py:486, in _read_sql_connectorx(query, connection_uri, partition_on, partition_range, partition_num, protocol, schema_overrides)
480 except ModuleNotFoundError:
481 raise ModuleNotFoundError(
482 "connectorx is not installed"
483 "\n\nPlease run `pip install connectorx>=0.3.2`."
484 ) from None
--> 486 tbl = cx.read_sql(
487 conn=connection_uri,
488 query=query,
489 return_type="arrow2",
490 partition_on=partition_on,
491 partition_range=partition_range,
492 partition_num=partition_num,
493 protocol=protocol,
494 )
495 return from_arrow(tbl, schema_overrides=schema_overrides)
File ~\miniconda3\lib\site-packages\connectorx\__init__.py:297, in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
294 except ModuleNotFoundError:
295 raise ValueError("You need to install pyarrow first")
--> 297 result = _read_sql(
298 conn,
299 "arrow2" if return_type in {"arrow2", "polars", "polars2"} else "arrow",
300 queries=queries,
301 protocol=protocol,
302 partition_query=partition_query,
303 )
304 df = reconstruct_arrow(result)
305 if return_type in {"polars", "polars2"}:
RuntimeError: parse error: invalid domain character