sfu-db / connector-x

Fastest library to load data from DB to DataFrames in Rust and Python
https://sfu-db.github.io/connector-x
MIT License
2k stars 162 forks source link

Trusted_Connection or mssql+pyodbc not supported #233

Open Arcturus122 opened 2 years ago

Arcturus122 commented 2 years ago

Hi,

Working with pandas read_sql the uri below works fine

"mssql+pyodbc://:/?driver=ODBC+Driver+17+for+SQL+Server&Trusted_Connection=Yes"

but with connectorX i get this error :

here is the code :

cx.read_sql("mssql+pyodbc://<server>:<port>/<db_name>?driver=ODBC+Driver+17+for+SQL+Server&Trusted_Connection=Yes","select * from dbo.test_table")
panic exception not implemented
mssql+pyodbc://<server>:<port>/<db_name>?driver=ODBC+Driver+17+for+SQL+Server&Trusted_Connection=Yes not supported !

Would it be possible to add this feature soon ?

Many thanks in advance

wangxiaoying commented 2 years ago

Hi @Arcturus122 , we do not support odbc for now (WIP). If you want to connect to mssql using trusted_connection, please use mssql://host:port/db?trusted_connection=true directly (don't need to configure / install anything).

miroslaavi commented 1 year ago

Hi, I'm trying to get connectorx working with trusted connection with MsSQL. With SQL Alchemy the connection works fine with the below connection string:

conn= f"mssql+pyodbc://{server_name}/{database_name}?trusted_connection=yes&driver=ODBC+Driver+17+for+SQL+Server"

I've tried the below but all ends up to "RuntimeError: Timed out in bb8".

conn = f"mssql://{server_name}/{database_name}?encrypt=true&trusted_connection=true"

conn = f"mssql://{server_name}/{database_name}?encrypt=true&trusted_connection=true&driver=ODBC+Driver+17+for+SQL+Server"

conn= f"mssql+pyodbc://{server_name}/{database_name}?trusted_connection=yes&driver=ODBC+Driver+17+for+SQL+Server"

Am I doing something wrong or is the trusted connection currently not working with MsSQL?

mitches-got-glitches commented 1 year ago

I'm having this same issue. The URI that's working for pandas and SQL Alchemy isn't working for polars.

It's a shame because I really want to get it loading from SQL server on our project as we've got some pandas bottlenecks.

We typically connect with pyodbc, and I think we use Kerberos as our authenticator.

@wangxiaoying any news on when this feature/bugfix might be implemented?

miroslaavi commented 1 year ago

I'm having this same issue. The URI that's working for pandas and SQL Alchemy isn't working for polars.

It's a shame because I really want to get it loading from SQL server on our project as we've got some pandas bottlenecks.

We typically connect with pyodbc, and I think we use Kerberos as our authenticator.

@wangxiaoying any news on when this feature/bugfix might be implemented?

Hi, I got it working by creating the connection string as below: mssql://{server_name}/{database_name}?trusted_connection=true

mitches-got-glitches commented 1 year ago

Typically we add a few additional connection arguments when we connect through SQLAlchemy engine.

{
    "auth": "KERBEROS", 
    "trustServerCertificate": "yes", 
    "TrustedCerts": "path/to/kerberos/cert.pem",
    "multiSubnetFailover": "yes", 
    "loginTimeout": "yes", 
}

I've tried "mssql:{server}:{port}/{db}?trusted_connection=true" and "mssql:{server}:{port}/{db}?trusted_connection=true&encrypt=true" and I've tried to add all of those additional connection args to the URI as well but they all yield timed out in bb8. I'm not totally sure how important case is for the keys though, and whether "yes" should be converted to lower case true.

Any thoughts?

mmcdermott commented 1 year ago

I'm having a similar issue, so any updates here would be greatly appreciated!

mmcdermott commented 1 year ago

One possible note to help debug this. When I run

In [11]: query="SELECT TOP 1 * FROM Inovalon.dbo.provider"; conn="mssql://{server}:{port}/{database}?encrypt=true&trusted_connection=true"

In [12]: df = cx.read_sql(conn, query)
[2023-08-24T15:31:49Z DEBUG connectorx::pandas] Protocol: binary
[2023-08-24T15:31:49Z DEBUG connectorx::dispatcher] Run dispatcher
[2023-08-24T15:31:49Z DEBUG connectorx::dispatcher] Fetching metadata
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[12], line 1
----> 1 df = cx.read_sql(conn, query)

File ~/.conda/envs/Inovalon/lib/python3.10/site-packages/connectorx/__init__.py:224, in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
    221 except ModuleNotFoundError:
    222     raise ValueError("You need to install pandas first")
--> 224 result = _read_sql(
    225     conn,
    226     "pandas",
    227     queries=queries,
    228     protocol=protocol,
    229     partition_query=partition_query,
    230 )
    231 df = reconstruct_pandas(result)
    233 if index_col is not None:

I don't see a debug log entry for mssql auth through trusted connection which based on https://github.com/sfu-db/connector-x/blob/main/connectorx/src/sources/mssql/mod.rs#L70 I'd expect to see. So I'm not sure it is actually using trusted connection accurately?

mmcdermott commented 1 year ago

@wangxiaoying As I think there are a few issues now that are related to this, and I think it may be the case that the failure is that trusted connections are not being applied, any chance you can take another look at this? I'm not a Rust expert so am not sure if the lack of the debug line I highlighted above is actually telling, and I couldn't get the package to build in full source on my local machine to test it with more debug statements, but my hope is that that points to the underlying issue here.

farmerofpeace commented 6 months ago

@mmcdermott I was used to setting trusted_connection=yes but it only worked for me when I did trusted_connection=true. Maybe that's your issue?

However, when I try to connect to a specific instance of a microsoft sql server, I get the connection timeout bb8 again. I am trying like this. Any recommendations on how to make it work?

conn = f'mssql://{server}/{instance}:1433/{database}?trusted_connection=true'

example: mssql://myServer/SQL2:1433/myDB?trusted_connection=true

mmcdermott commented 6 months ago

@farmerofpeace unfortunately I haven't been able to get this to work. If you're seeing the timeout issue I suspect it is the same underlying problem here as others and I have experienced. @wangxiaoying I don't know if you'd had a chance to look at this, but my speculation remains that the trusted connection is not actually being applied in the right way in this setting (but I have no real expertise on the Rust or trusted connections side, so it could definitely be a different problem).